Reinforcement Learning for Reasoning in Large Language Models with One Training Example • Paper • 2504.20571 • Published 20 days ago • 90
ReasonIR: Training Retrievers for Reasoning Tasks • Paper • 2504.20595 • Published 20 days ago • 52
TACO Models • Collection • This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. • 3 items • Updated about 1 month ago • 8
CoTA Datasets • Collection • This collection contains all versions of the CoTA (Chain-of-Thought-and-Action) datasets. • 5 items • Updated about 1 month ago • 7
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents • Paper • 2408.07060 • Published Aug 13, 2024 • 43