CHASE Collection Generate challenging synthetic data to evaluate LLMs • 4 items • Updated 1 day ago
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 14
The BrowserGym Ecosystem for Web Agent Research Paper • 2412.05467 • Published Dec 6, 2024 • 20