From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents Paper • 2410.23555 • Published Oct 31, 2024
Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems Paper • 2501.17348 • Published Jan 28
TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons Paper • 2504.19982 • Published Apr 28
Language Specific Knowledge: Do Models Know Better in X than in English? Paper • 2505.14990 • Published May 21
PIPA: A Unified Evaluation Protocol for Diagnosing Interactive Planning Agents Paper • 2505.01592 • Published May 2
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models Paper • 2311.07022 • Published Nov 13, 2023
Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare Paper • 2404.16621 • Published Apr 25, 2024
ReSpAct: Harmonizing Reasoning, Speaking, and Acting Towards Building Large Language Model-Based Conversational AI Agents Paper • 2411.00927 • Published Nov 1, 2024
Language Model is All You Need: Natural Language Understanding as Question Answering Paper • 2011.03023 • Published Nov 5, 2020
VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator Paper • 2105.11589 • Published May 25, 2021