DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Paper • 2501.16764 • Published 9 days ago • 21
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 9 days ago • 100
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 11 days ago • 322
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 15 days ago • 301
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers Paper • 2408.06195 • Published Aug 12, 2024 • 70
Train 400x faster Static Embedding Models with Sentence Transformers Article • 23 days ago • 136
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 24 days ago • 89
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published 26 days ago • 29
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 98