System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts Paper • 2505.18962 • Published 13 days ago • 12
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation Paper • 2407.06423 • Published Jul 8, 2024
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction Paper • 2503.15661 • Published Mar 19 • 2
StarFlow: Generating Structured Workflow Outputs From Sketch Images Paper • 2503.21889 • Published Mar 27 • 1
Rendering-Aware Reinforcement Learning for Vector Graphics Generation Paper • 2505.20793 • Published 11 days ago • 11
STRICT: Stress Test of Rendering Images Containing Text Paper • 2505.18985 • Published 13 days ago
FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval Paper • 2410.21012 • Published Oct 28, 2024
R$^3$Mem: Bridging Memory Retention and Retrieval via Reversible Compression Paper • 2502.15957 • Published Feb 21
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks Paper • 2504.12764 • Published Apr 17 • 41
Distilling semantically aware orders for autoregressive image generation Paper • 2504.17069 • Published Apr 23 • 6
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 285
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3 • 39