DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 127
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper • 2508.01780 • Published 20 days ago • 13
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper • 2508.01780 • Published 20 days ago • 13 • 5
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper • 2508.01780 • Published 20 days ago • 13
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides Paper • 2501.03936 • Published Jan 7 • 22
ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch Paper • 2506.03558 • Published Jun 4 • 5
ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch Paper • 2506.03558 • Published Jun 4 • 5
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1 • 25
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1 • 25
Running 40 40 Open LMM Reasoning Leaderboard 🥇 A Leaderboard that demonstrates LMM reasoning capabilities