Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model Paper • 2404.10306 • Published Apr 16, 2024 • 1
Optimizing Language Model's Reasoning Abilities with Weak Supervision Paper • 2405.04086 • Published May 7, 2024 • 2
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning Paper • 2403.20046 • Published Mar 29, 2024
Eliminating Reasoning via Inferring with Planning: A New Framework to Guide LLMs' Non-linear Thinking Paper • 2310.12342 • Published Oct 18, 2023
SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents Paper • 2411.03284 • Published Nov 5, 2024
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation Paper • 2505.18759 • Published 18 days ago • 12
PhyX: Does Your Model Have the "Wits" for Physical Reasoning? Paper • 2505.15929 • Published 20 days ago • 48
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Paper • 2505.14625 • Published 21 days ago • 13
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published Feb 3 • 41