WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks Paper • 2506.01952 • Published 9 days ago • 10 • 3
A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models Paper • 2501.18463 • Published Jan 30 • 1
MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding Paper • 2505.20298 • Published 16 days ago • 6