s

Tom-Neverwinter

AI & ML interests

Making improvements to help the world.

Recent Activity

replied to Kseniase's post 2 days ago
12 Foundational AI Model Types

Let's refresh some fundamentals today to stay fluent in what we all work with. Here are some of the most popular model types that shape the vast world of AI (with examples in brackets):

1. LLM - Large Language Model (GPT, LLaMA) -> https://huggingface.co/papers/2402.06196 + history of LLMs: https://www.turingpost.com/t/The%20History%20of%20LLMs
An LLM is trained on massive text datasets to understand and generate human language. LLMs are mostly built on the Transformer architecture and predict the next token. They scale by increasing the overall parameter count across all components (layers, attention heads, MLPs, etc.).

2. SLM - Small Language Model (TinyLLaMA, Phi models, SmolLM) -> https://huggingface.co/papers/2410.20011
A lightweight LM optimized for efficiency, low memory use, fast inference, and edge use. SLMs work on the same principles as LLMs.

3. VLM - Vision-Language Model (CLIP, Flamingo) -> https://huggingface.co/papers/2405.17247
Processes and understands both images and text. VLMs map images and text into a shared embedding space or generate captions/descriptions from both.

4. MLLM - Multimodal Large Language Model (Gemini) -> https://huggingface.co/papers/2306.13549
A large-scale model that can understand and process multiple types of data (modalities), usually text plus other formats such as images, video, audio, structured data, and 3D or spatial inputs. MLLMs can be LLMs extended with modality adapters or trained jointly across vision, text, audio, etc.

5. LAM - Large Action Model (InstructDiffusion, RT-2) -> https://huggingface.co/papers/2412.10047
Understands and generates action sequences by predicting action tokens (discrete/continuous instructions) that guide agents. Trained on behavior datasets, LAMs generalize across tasks, environments, and modalities (video, sensor data, etc.).

Read about LRM, MoE, SSM, RNN, CNN, SAM and LNN below 👇
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
Organizations

None yet

Tom-Neverwinter's activity

New activity in BeaverAI/Test-12B-v1b-GGUF 2 days ago
New activity in Qwen/Qwen2.5-Omni-7B about 1 month ago

GGUF model

3
#51 opened about 1 month ago by
Tom-Neverwinter
New activity in moonshotai/MoonViT-SO-400M about 2 months ago

GGUF format

#2 opened about 2 months ago by
Tom-Neverwinter
New activity in TheDrummer/Gemmasutra-Small-4B-v1-GGUF 3 months ago

Review so far

2
#1 opened 3 months ago by
GlobalMeltdown
New activity in perplexity-ai/r1-1776 4 months ago

Was this Model Needed?

4
#12 opened 4 months ago by
fahdmirzac
New activity in multimodalart/flux-lora-the-explorer 10 months ago

how to make a lora

3
#2 opened 10 months ago by
guardiancc
New activity in bartowski/Beyonder-4x7B-v3-exl2 about 1 year ago

3.0 bpw?

16
#1 opened about 1 year ago by
CulturedMan
New activity in ai21labs/Jamba-v0.1 about 1 year ago

multiple gpu?

3
#3 opened about 1 year ago by
bdambrosio

Missing config.json

8
#2 opened over 1 year ago by
Cayleb
New activity in ShinojiResearch/Senku-70B-Full over 1 year ago

Hardware

8
#5 opened over 1 year ago by
Tom-Neverwinter