Masking Teacher and Reinforcing Student for Distilling Vision-Language Models Paper • 2512.22238 • Published 9 days ago • 16
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published about 1 month ago • 23
Sleeping • 17 • Korean TTS Arena • Compare and evaluate 17 Korean TTS models in a blind test!
RefineBench: Evaluating Refinement Capability of Language Models via Checklists Paper • 2511.22173 • Published Nov 27, 2025 • 14
Adaptive Multi-Agent Response Refinement in Conversational Systems Paper • 2511.08319 • Published Nov 11, 2025 • 41
Simulating Environments with Reasoning Models for Agent Training Paper • 2511.01824 • Published Nov 3, 2025 • 2
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published Oct 28, 2025 • 69 • 4
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs Paper • 2510.04767 • Published Oct 6, 2025 • 27
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10, 2025 • 49
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs Paper • 2510.07499 • Published Oct 8, 2025 • 48
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? Paper • 2510.02209 • Published Oct 2, 2025 • 53
ACON: Optimizing Context Compression for Long-horizon LLM Agents Paper • 2510.00615 • Published Oct 1, 2025 • 32 • 2
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28, 2025 • 174
Rethinking Reward Models for Multi-Domain Test-Time Scaling Paper • 2510.00492 • Published Oct 1, 2025 • 27
Small Language Models are the Future of Agentic AI Paper • 2506.02153 • Published Jun 2, 2025 • 23