view article Article Aligning to What? Rethinking Agent Generalization in MiniMax M2 Oct 30, 2025 • 41
view article Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 751
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7, 2025 • 54
view article Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models Jul 10, 2025 • 53
DolphinLabeled Datasets Collection Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated Jan 6, 2025 • 15
view article Article 🐺🐦⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark Jan 2, 2025 • 41
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published Dec 16, 2024 • 60
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 158
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published Dec 16, 2024 • 43
view article Article 🐺🐦⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs Dec 4, 2024 • 80