Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.20453

about 23 hours ago

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 17
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 32
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

Provable Benefits of In-Tool Learning for Large Language Models

Paper • 2508.20755 • Published 5 days ago • 9
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Hammer: Robust Function-Calling for On-Device Language Models via Function Masking

Paper • 2410.04587 • Published Oct 6, 2024 • 2
TaskCraft: Automated Generation of Agentic Tasks

Paper • 2506.10055 • Published Jun 11 • 32
Direct Multi-Turn Preference Optimization for Language Agents

Paper • 2406.14868 • Published Jun 21, 2024
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?

Paper • 2508.03644 • Published 28 days ago • 25
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published 26 days ago • 121
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 46
facebook/natural_reasoning

Viewer • Updated Feb 21 • 1.15M • 1.58k • 517
nvidia/OpenMathReasoning

Viewer • Updated May 27 • 5.68M • 8.84k • 333
Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published Jun 5 • 17

about 23 hours ago

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 17
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 32
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Provable Benefits of In-Tool Learning for Large Language Models

Paper • 2508.20755 • Published 5 days ago • 9
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?

Paper • 2508.03644 • Published 28 days ago • 25
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published 26 days ago • 121
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Hammer: Robust Function-Calling for On-Device Language Models via Function Masking

Paper • 2410.04587 • Published Oct 6, 2024 • 2
TaskCraft: Automated Generation of Agentic Tasks

Paper • 2506.10055 • Published Jun 11 • 32
Direct Multi-Turn Preference Optimization for Language Agents

Paper • 2406.14868 • Published Jun 21, 2024
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 6 days ago • 52

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 46
facebook/natural_reasoning

Viewer • Updated Feb 21 • 1.15M • 1.58k • 517
nvidia/OpenMathReasoning

Viewer • Updated May 27 • 5.68M • 8.84k • 333
Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published Jun 5 • 17

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs