Qwen3 Collection Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 65 items • Updated 7 days ago • 137
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 26 days ago • 66
Sky-T1-7B Collection A series of 7B models trained with different recipes and the corresponding training data. • 8 items • Updated Feb 14 • 7
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated Mar 13 • 11
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published Dec 6, 2024 • 51
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 665