Article Distributed Training with JAX and Flax NNX: A Practical Guide to Sharding By jiagaoxiang • Mar 26 • 7
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Apr 28 • 119
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 105
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published Dec 23, 2024 • 33
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 83
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 3 days ago • 155
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Apr 30 • 305
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 256
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 54
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model By merve and 2 others • May 14, 2024 • 253
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15, 2024 • 60
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 11 days ago • 147
Probably function calling datasets Collection Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 38
Article Serverless Inference with Hugging Face and NVIDIA NIMs By philschmid and 1 other • Jul 29, 2024 • 31
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models By andito and 2 others • Jun 24, 2024 • 194
GenQA: Generating Millions of Instructions from a Handful of Prompts Paper • 2406.10323 • Published Jun 14, 2024 • 5
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2, 2024 • 34