admarcosai's Collections
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 183
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 7
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Paper • 2311.11315 • Published • 6
An Embodied Generalist Agent in 3D World
Paper • 2311.12871 • Published • 8
Pearl: A Production-ready Reinforcement Learning Agent
Paper • 2312.03814 • Published • 14
CogAgent: A Visual Language Model for GUI Agents
Paper • 2312.08914 • Published • 29
AppAgent: Multimodal Agents as Smartphone Users
Paper • 2312.13771 • Published • 51
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Paper • 2401.04658 • Published • 25
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper • 2402.01622 • Published • 33
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
Paper • 2401.17167 • Published • 1
Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning
Paper • 2312.05230 • Published
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
Paper • 2402.10466 • Published • 16
An Interactive Agent Foundation Model
Paper • 2402.05929 • Published • 27