SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
Abstract
Large Language Models (LLMs) have achieved human-level proficiency across diverse tasks, but their ability to perform rigorous mathematical problem solving remains an open challenge. In this work, we investigate a fundamental yet computationally intractable problem: determining whether a given multivariate polynomial is nonnegative. This problem, closely related to Hilbert's Seventeenth Problem, plays a crucial role in global polynomial optimization and has applications in various fields. First, we introduce SoS-1K, a meticulously curated dataset of approximately 1,000 polynomials, along with expert-designed reasoning instructions based on five progressively challenging criteria. Evaluating multiple state-of-the-art LLMs, we find that without structured guidance, all models perform only slightly above the 50% random-guess baseline. However, high-quality reasoning instructions significantly improve accuracy, boosting performance to as high as 81%. Furthermore, our 7B model, SoS-7B, fine-tuned on SoS-1K for just 4 hours, outperforms the 671B DeepSeek-V3 and GPT-4o-mini in accuracy while requiring only 1.8% and 5% of their computation time, respectively. Our findings highlight the potential of LLMs to push the boundaries of mathematical reasoning and tackle NP-hard problems.
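To make the decision problem concrete: a polynomial is certifiably nonnegative if it admits a sum-of-squares (SoS) decomposition, and the standard way to search for one is a semidefinite feasibility program over a Gram matrix. The sketch below is not from the paper; it is a minimal illustration, assuming the cvxpy library and using the hypothetical example p(x) = x^4 + 4x^3 + 6x^2 + 4x + 1 = (x+1)^4.

```python
# Minimal sketch (not the paper's method): certify that
# p(x) = x^4 + 4x^3 + 6x^2 + 4x + 1 is a sum of squares by
# finding a PSD Gram matrix Q with p(x) = z(x)^T Q z(x),
# where z(x) = [1, x, x^2]. Requires cvxpy (pip install cvxpy).
import cvxpy as cp

Q = cp.Variable((3, 3), PSD=True)   # symmetric PSD Gram matrix
constraints = [
    Q[0, 0] == 1,                   # x^0 coefficient
    2 * Q[0, 1] == 4,               # x^1
    2 * Q[0, 2] + Q[1, 1] == 6,     # x^2
    2 * Q[1, 2] == 4,               # x^3
    Q[2, 2] == 1,                   # x^4
]
prob = cp.Problem(cp.Minimize(0), constraints)  # pure feasibility
prob.solve()
print(prob.status)  # "optimal" => an SoS certificate exists
```

If the solver reports infeasibility, no SoS certificate exists in this monomial basis. Note that SoS is sufficient but not necessary for nonnegativity: the Motzkin polynomial x^4y^2 + x^2y^4 - 3x^2y^2 + 1 is nonnegative yet admits no SoS decomposition, which is part of what makes the general decision problem hard.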
Community
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models (2025)
- Theorem Prover as a Judge for Synthetic Data Generation (2025)
- VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data (2025)
- rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking (2025)
- Bag of Tricks for Inference-time Computation of LLM Reasoning (2025)
- O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning (2025)
- NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions (2025)