VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation Paper • 2506.03930 • Published 6 days ago • 22
Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback Paper • 2506.03106 • Published 7 days ago • 6
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning Paper • 2506.04207 • Published 6 days ago • 44
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published 11 days ago • 118
VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-instruct-grpo-69k-sys12-mtrl-d1fo-535-step Updated 11 days ago • 5
VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-instruct-grpo-69k-sys12-mtrl-d1fo-280-step Updated 11 days ago • 6
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs Paper • 2505.20139 • Published 15 days ago • 18 • 1
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published 27 days ago • 93
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design Paper • 2505.16175 • Published 19 days ago • 39 • 3
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published 21 days ago • 22