QFFT: Question-Free Fine-Tuning for Adaptive Reasoning
Abstract
Question-Free Fine-Tuning (QFFT) improves efficiency and adaptability in reasoning models by leveraging both short and long chain-of-thought patterns, reducing response length while maintaining performance across various scenarios.
Recent advancements in Long Chain-of-Thought (CoT) reasoning models have improved performance on complex tasks, but these models suffer from overthinking, generating redundant reasoning steps especially for simple questions. This paper revisits the reasoning patterns of Long and Short CoT models, observing that Short CoT patterns offer concise, efficient reasoning, while Long CoT patterns excel in challenging scenarios where Short CoT patterns struggle. To enable models to leverage both patterns, we propose Question-Free Fine-Tuning (QFFT), a fine-tuning approach that removes the input question during training and learns exclusively from Long CoT responses. This approach enables the model to employ both reasoning patterns adaptively: it prioritizes Short CoT patterns and activates Long CoT patterns only when necessary. Experiments on various mathematical datasets demonstrate that QFFT reduces average response length by more than 50%, while achieving performance comparable to Supervised Fine-Tuning (SFT). Additionally, QFFT outperforms SFT in noisy, out-of-domain, and low-resource scenarios.
Community
Models fine-tuned with Long CoT data often overthink simple questions, generating long-winded reasoning. A new SFT approach, QFFT, tackles this by removing questions during training and using only Long CoT answers for fine-tuning.
This method preserves the model’s native Short CoT ability, avoiding the “question→Long CoT” mapping trap of traditional SFT. By learning from answer structures, models can activate Long CoT’s deep thinking when facing errors or complex problems, while defaulting to efficient Short CoT for easy tasks.
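The core idea above, training on the Long CoT response alone rather than on a question→response pair, can be sketched in code. This is a minimal, hypothetical illustration of how QFFT-style training examples might differ from standard SFT examples; the function names, prompt template, and sample data are all assumptions, not the authors' implementation:

```python
# Hypothetical sketch: QFFT vs. standard SFT data preparation.
# All names and the prompt template below are illustrative assumptions.

def build_sft_example(question: str, response: str) -> str:
    """Standard SFT: the model learns a question -> Long CoT mapping."""
    return f"Question: {question}\nAnswer: {response}"

def build_qfft_example(question: str, response: str) -> str:
    """QFFT: drop the question entirely and train only on the Long CoT
    response, so the model learns the reasoning structure itself rather
    than binding a long chain of thought to every input question."""
    return response

# Toy (question, Long CoT response) pair for illustration.
pairs = [
    ("What is 2 + 3?",
     "Let me check this carefully: 2 + 3 = 5. The answer is 5."),
]

sft_data = [build_sft_example(q, r) for q, r in pairs]
qfft_data = [build_qfft_example(q, r) for q, r in pairs]

print(sft_data[0])   # includes the question
print(qfft_data[0])  # Long CoT response only
```

At inference time the question is of course still provided; only the fine-tuning data omits it, which is what lets the base model's native Short CoT behavior survive fine-tuning.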
The results are impressive: response length on math tasks drops by more than 50%, performance remains comparable to SFT, and noise resistance and cross-domain generalization improve. On noisy data, QFFT maintains 78.6% performance versus SFT's 0.4%, and it also excels in low-resource scenarios.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Learning Composable Chains-of-Thought (2025)
- Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models (2025)
- Adaptive Deep Reasoning: Triggering Deep Thinking When Needed (2025)
- Reasoning-CV: Fine-tuning Powerful Reasoning LLMs for Knowledge-Assisted Claim Verification (2025)
- AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models (2025)
- Efficient Long CoT Reasoning in Small Language Models (2025)
- Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning (2025)
Models citing this paper: 4
Datasets citing this paper: 2
Spaces citing this paper: 0
Collections including this paper: 0