DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 12 days ago • 238
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 270
Article Welcome Llama 4 Maverick & Scout on Hugging Face! By burtenshaw and 6 others • Apr 5 • 145
Article DABStep: Data Agent Benchmark for Multi-step Reasoning By eggie5 and 5 others • Feb 4 • 90
Article Training and Finetuning Reranker Models with Sentence Transformers v4 By tomaarsen • Mar 26 • 135
Article Train 400x faster Static Embedding Models with Sentence Transformers By tomaarsen • Jan 15 • 187
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 134
Article Model2Vec: Distill a Small Fast Model from any Sentence Transformer By Pringled and 1 other • Oct 14, 2024 • 92
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Apr 30 • 305
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion Paper • 2311.01767 • Published Nov 3, 2023 • 21
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 73