SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 2 days ago • 43
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models Paper • 2501.16937 • Published 2 days ago • 2
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 84
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published 2 days ago • 21
Are Vision Language Models Texture or Shape Biased and Can We Steer Them? Paper • 2403.09193 • Published Mar 14, 2024 • 8
Low-Rank Adapters Meet Neural Architecture Search for LLM Compression Paper • 2501.16372 • Published 8 days ago • 5
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 3 days ago • 283
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 4 days ago • 86
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated 1 day ago • 64
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published 8 days ago • 50
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 8 days ago • 267
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 8 days ago • 75
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated about 6 hours ago • 48
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published 14 days ago • 24