HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs Paper • 2503.02003 • Published 24 days ago • 45
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control Paper • 2407.03168 • Published Jul 3, 2024 • 3
Qwen 2.5 Coder Llamafiles (<50B) Collection Llamafiles for the smaller Qwen 2.5 Coder models • 6 items • Updated about 1 month ago • 1
Qwen 2.5 Llamafiles (<50B) Collection Llamafiles for the smaller Qwen 2.5 text only models • 6 items • Updated about 1 month ago • 1
Deepseek Distilled Llamafiles (<50B) Collection Llamafiles for the smaller Deepseek Distilled Models • 5 items • Updated Feb 25 • 2
DeepHermes Collection Preview models of hybrid reasoner Hermes series • 6 items • Updated 15 days ago • 27
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 2 days ago • 212
Gemma 3 Collection All versions of Google's new multimodal models in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 29 items • Updated 2 days ago • 43
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published Feb 11 • 37
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 52
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 116
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 60