SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents Paper • 2509.06283 • Published about 1 month ago • 17 • 2
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19 • 45
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 97
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation Paper • 2502.03860 • Published Feb 6 • 25
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published Jan 31 • 39
Demystifying Domain-adaptive Post-training for Financial LLMs Paper • 2501.04961 • Published Jan 9 • 12
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs Paper • 2410.04698 • Published Oct 7, 2024 • 13
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation Paper • 2407.02371 • Published Jul 2, 2024 • 54
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 89
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning Paper • 2402.15506 • Published Feb 23, 2024 • 18
Unlocking Anticipatory Text Generation: A Constrained Approach for Faithful Decoding with Large Language Models Paper • 2312.06149 • Published Dec 11, 2023 • 3