Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models
Numina & Kimi Team

We're excited to announce the release of Kimina-Prover-72B, our state-of-the-art theorem proving model trained with the Kimi k1.5[1] RL pipeline based on Qwen2.5-72B [2]. Alongside it, we are also releasing two distilled variants: Kimina-Prover-Distill-8B and 1.7B (based on Qwen3-8B and Qwen3-1.7B[3] respectively).
Our key innovations include:
Test-Time Reinforcement Learning Search: A trainable agentic proving framework that enables the model to recursively discover, combine and apply multiple lemmas to construct complex proofs, building on a novel lemma-enabled pattern.
Error-Fixing Capability: Kimina-Prover can read and interpret Leanβs error messages and propose targeted fixes, demonstrating significantly higher sample efficiency compared to regenerating proofs from scratch.
These advancements enable Kimina-Prover to solve challenging mathematical problems and surpass prior methods. As shown in Figure 1, on the widely used miniF2F benchmark, Kimina-Prover achieves a state-of-the-art pass rate of 92.2%.
Introduction
We focus on automated theorem proving (ATP) in the Lean 4 language, aiming to automate the construction of formal mathematical proofs. Recent advances in neural theorem proving have significantly improved the ability of AI systems to assist with or automate this process. Notable progress includes AlphaProof[4] from Google DeepMind, which demonstrated strong performance on problems at the level of the International Mathematical Olympiad. Open-source systems such as DeepSeek-Prover-V2[5], which incorporates reinforcement learning, have also achieved state-of-the-art results. In addition, neuro-symbolic agentic approaches like DSP+[6] have shown that competitive performance is possible without large-scale training by leveraging off-the-shelf models in a modular framework.
Our earlier work of Kimina-Prover Preview[7] introduced a large language model for formal theorem proving in Lean, establishing a new performance baseline on the miniF2F benchmark. Trained with a large-scale reinforcement learning pipeline, the model adopted a reasoning-driven exploration paradigm and demonstrated that larger models can serve as stronger formal reasoners. Its structured reasoning pattern enabled efficient proof search and emulated human-like problem-solving strategies.
Following this initial success, we continued to improve the model through further iterations of reinforcement learning. However, single-step reasoning remains insufficient for solving complex problems that require long, multi-stage proofs. To address this limitation, we introduce the Test-Time Reinforcement Learning (TTRL) search framework, which enables the model to discover, combine, and reuse multiple intermediate lemmas autonomously. This framework supports deeper, long-horizon reasoning by decomposing hard problems into reusable subcomponents.
A key ingredient of TTRL search is the lemma-enabled pattern, which allows the model to identify and apply intermediate lemmas as part of its proof construction process. This structured reuse of intermediate results significantly expands the modelβs problem-solving capability beyond single-step generation.
To further enhance robustness, we also integrate an error-fixing mechanism that interprets Leanβs error messages and proposes targeted corrections. This allows the model to refine its outputs through iterative feedback, improving proof reliability and overall sample efficiency.
A New State-of-the-Art
Model | pass@1 | pass@32 | pass@1024 |
---|---|---|---|
Kimina-Prover-1.7B | 46.7 | 73.4 | β |
DSP+ | 52.5 | 71.3 | 80.7 |
DeepSeek-Prover-V2-7B | 58.6 | 75.6 | 79.9 |
Kimina-Prover-8B | 61.1 | 78.3 | β |
DeepSeek-Prover-V2-671B | 61.9 | 82.4 | 86.6 |
Kimina-Prover-72B | 63.9 | 84.0 | 87.7 |
As the combination of the proposed techniques results in significant improvements in formal theorem proving performance. On the miniF2F benchmark, Kimina-Prover achieves a pass rate of 84.0% with pass@32 and 86.4% with the addition of a single round of error correction. With pass@1024, the pass rate reaches 87.7%. Applying the full Test-Time Reinforcement Learning (TTRL) search framework yields a final pass rate of 92.2%, with an estimated pass upper bound of approximately 42,000. However, this pass budget can be substantially optimized in future versions, as a large portion of the current sampling is spent on proving unhelpful or redundant lemmas.
Notably, these results indicate a shift in the scaling behavior of the prover. While earlier versions exhibited approximately linear improvements in log scale with increasing sampling budget, the current system shows diminishing returns beyond pass@1024. This suggests that further gains are less dependent on increased sampling and instead require more sophisticated search strategies, such as those introduced by TTRL.
Methodology
Lemma enabled pattern
The lemma-enabled pattern is designed to equip the model with the ability to identify and utilize useful lemmas provided in the input. To support this capability, during reinforcement learning (RL) training, a random subset of one to three formal lemmas is prepended to the problem context, exposing the prover to potentially useful intermediate results that can aid in constructing the final proof. These lemmas are prepared through a two-step pipeline: (1) a general-purpose LLM generates candidate lemmas in natural language; (2) these are then translated into formal statements using our Kimina-Autoformalizer-7b
.
Initial observations showed that the model had a low tendency to incorporate the provided lemmas. To address this, we introduced a preference-based reward shaping strategy within the RL framework. When a theorem could be proven through multiple trajectories, solutions that successfully leveraged a provided lemma were assigned a higher reward, while those that did not were penalized. This approach proved effective, increasing the lemma utilization rate to a stable 30β40% after training. Importantly, this method not only encouraged lemma usage but also promoted selectivity: the model learned to apply lemmas strategically when they were useful, while ignoring irrelevant ones, demonstrating more efficient and human-like reasoning behavior.
TTRL search
The lemma-enabled pattern allowed the model to incorporate pre-generated lemmas as intermediate steps in constructing proofs. However, we observed that randomly sampling and inserting lemmas was insufficient for solving complex problems requiring highly structured and deeply nested reasoning. To address this limitation, we developed the Test-Time Reinforcement Learning Search frameworkβa trainable, agentic approach that systematically organizes, filters, and composes a large set of candidate lemmas to build complete proofs. This framework transitions the process from random exploration to a more strategic, goal-directed search.
As illustrated in Figure 2, we define the search scope of each problem as the problem itself along with its associated candidate lemmas. TTRL Search tracks a lemma utilization score within each search scope to measure how frequently and effectively each lemma contributes to the final proof. At the beginning of each RL training iteration, for each problem (i.e., search scope), we construct K = 10 input variants by prepending different combinations of lemmas. To balance exploration and exploitation, 60% of the inputs are built using the top-ranking lemmasβthose with the highest utilization scoresβfocusing the model on the most promising proof paths. The remaining 40% are formed by including these top lemmas alongside one to four randomly selected additional lemmas, encouraging exploration of novel and potentially useful lemma combinations.
To ensure quality, a filtering mechanism prunes lemmas that consistently fail to contribute meaningfully: any lemma that fails to achieve a utilization score of at least Ο=0.10 after 50 insertion attempts is removed from the search pool.
A key feature of TTRL is its recursive search mechanism. The search scope is not limited to the original theorem but is also maintained for each lemma, allowing the framework to recursively decompose the problem into smaller subproblems. A parallel sublemma generation process runs throughout, and whenever a theorem or lemma fails to find a proof after N = 128 attempts, new candidate sublemmas are generated. This recursive strategy enables test-time scaling of reasoning depth, significantly extending the modelβs effective problem-solving capability.
To maintain logical soundness, we address a failure mode wherein incorrectly formalized lemmas could lead to trivial or invalid proofs. In such cases, the model may exploit inconsistencies to construct seemingly valid but unsound solutions. To prevent this, we introduce a negation proving process: for each newly generated lemma, we attempt to prove its logical negation. If the negated statement is provable, it indicates that the original lemma is logically inconsistent and is immediately discarded. This step ensures the reliability and soundness of the overall proof construction process.
Enabling Error-Fixing in Kimina-Prover
A critical limitation of recent advanced theorem proving models is their lack of the ability to correct proofs based on feedback from the proof assistantβcapabilities that human users routinely leverage. To address this gap, we developed a dedicated framework to integrate error-fixing capabilities into Kimina-Prover.
SFT Data Generation for Error Correction. General-purpose large language models exhibit a low success rate when interpreting Leanβs error messages and proposing valid corrections. To overcome this, we constructed a specialized Supervised Fine-Tuning (SFT) dataset tailored for error correction. This dataset consists of triplets in the form of (incorrect proof, Lean feedback, correct proof). To enrich the supervision signal, we prompted Claude 3.7 sonnet[8] to synthesize step-by-step reasoning chains explaining how the incorrect proof can be transformed into the correct one using the provided feedback. The result is a high-quality dataset that includes not only the initial and corrected proofs but also the intermediate reasoning, facilitating more effective learning.
Batched Failure Replay Strategy. Integrating error correction directly into the reinforcement learning (RL) loop initially proved ineffective due to the low success rate (~1%) of the SFT model in fixing errors, resulting in sparse rewards and unstable training. To address this, we designed the Batched Failure Replay strategy. Rather than attempting to correct errors immediately during an RL iteration, all failed proof attempts from iteration N are collected. In the subsequent iteration N+1, the training batch is composed of a fixed-size union of these prior failures (e.g., 500 samples) and standard problems from the prompt set (e.g., another 500 samples). This ensures consistent and high-volume exposure to error correction tasks in each training step, enabling the model to gradually learn effective error-handling behaviors in a stable and data-efficient manner.
16+16 attempt-and-fix | 32Γ1 brute-force | 32+32 attempt-and-fix | |
---|---|---|---|
kimina-prover | 35.6 | 28.8 | 44.1 |
This training approach led to measurable improvements in the modelβs ability to recover from failure. The model advanced from correcting basic syntax errors to resolving complex logical mistakes and, eventually, discovering alternative proof strategies when initial attempts failed. Crucially, this capability improved sample efficiency. As shown in Table 2, we compared different strategies under fixed computational budgets. The 16+16 attempt-and-fix strategyβconsisting of 16 initial proof attempts, each followed by one error-fixing attemptβachieved a success rate of 35.6%, outperforming the 32Γ1 brute-force baseline, which reached 28.8% with 32 independent attempts. Further increasing the sample budget to 32+32 with error correction yielded a success rate of 44.1%. These results demonstrate that enabling the model to correct its own errors is a more efficient use of computational resources than repeated trial-and-error.
Other improvement
In addition to our core TTRL search and Error fixing, we developed several other novel techniques to enhance the model's learning process and problem-solving capabilities.
Prompt Set Curation and Iteration. Effective data curation is critical for reinforcement learning (RL) efficiency. Our initial prompt set of over 300K problems was refined to a competition-focused subset of approximately 90K, with an emphasis on problems from sources such as the olympiads-ref subset of NuminaMath 1.5[9]. During RL, we applied dynamic filtering: problems with consistently high success rates were removed to maintain challenge, while persistently difficult problems were decomposed into simpler subproblems or variations using a general-purpose LLM. The final prompt set combines auto-formalized statements for broad coverage, human-annotated problems for quality, and augmented variations for targeted skill development. The curated prompt set will be open-sourced.
Continuous Pre-training on Lean Data. To address the limited Lean proficiency in most base models, we applied Continuous Pre-training (CPT). A 6-billion-token CPT dataset was constructed from multiple sources: 260M tokens from online platforms such as GitHub, 5.5B tokens of compiler-validated rollout data from our RL pipeline, and additional structured data in stateβtacticβstate
and stateβtacticβerror
formats to enhance diversity. These structured data supplements the base model with additional domain knowledge.
Random Proof Cut Data Augmentation. To better utilize high-quality, human-annotated proofs that lack intermediate reasoning steps, we developed the Random Proof Cut technique. This method augments such data using two variations:
- Proof truncation: The final part of a proof is removed, and the model must complete it.
- Proof infilling: A randomly selected internal block is replaced with the placeholder token
sorry
, and the model must reconstruct the missing steps.
This strategy allows the model to learn how to extend existing lines of reasoning and to fill in logical gaps, effectively integrating human-generated content into the training process.
Non-proof Problem Solving. Many problems are presented in the form of requiring a final answer, which cannot be naturally framed as a traditional proof task. Inspired by the CombiBench[10] evaluation methodology, we introduced a capability for non-proof problem solving to unify answer generation with proof construction. In this setting, the model is given a problem with the final answer field left blank. It performs a two-stage process within a single inference: it first deduces the correct answer and then generates a complete formal proof to justify that answer. This approach ensures the modelβs reasoning is explicitly conditioned on reaching the correct solution.
Sample proof cases
Here we provide some proof cases found by our newest Kimina-prover model. The complete list of solved miniF2F problems is available on GitHub
TTRL search proof example
The system recursively decomposed the problem into four layers of lemmas using a TTRL search approach, ultimately generating a 520-line formal proof. During this recursive proving process, initially, only one lemma, sum_cexp_div_pow_ne_zero
, remained to be solved. While this conclusion seems straightforward to prove informally, it would still require a lot of steps to prove it rigorously in Lean 4. Our system recursively generated two extra layers of lemmas and obtained a 260-line formal proof for sum_cexp_div_pow_ne_zero
. This proof is valid under Mathlib v4.15
Sample proof for imo 1969 p2
Proof structure

Formal code
import Mathlib
open Finset in
theorem norm_le_sum_norm_of_sum_eq_zero {N : β} (hN : N > 0) (z : β β β)
(hsum : β i β range N, z i = 0) :
βz 0β β€ β i β Icc 1 (N - 1), βz iβ := by
by_cases h : N > 1
Β·
have h6 : β i in range N, z i = z 0 + β i in Icc 1 (N - 1), z i := by
have h10 : range N = {0} βͺ Icc 1 (N - 1) := by
ext x
simp
omega
rw [h10]
rw [sum_union]
Β·
simp
Β·
apply disjoint_left.mpr
intro x hx1 hx2
simp at hx1
simp [hx1] at hx2
have h3 : z 0 = - (β i β Icc 1 (N - 1), z i) := by
have h4 : β i in range N, z i = 0 := hsum
rw [h6] at h4
calc
z 0 = (z 0 + β i in Icc 1 (N - 1), z i) - β i in Icc 1 (N - 1), z i := by ring
_ = (0 : β) - β i in Icc 1 (N - 1), z i := by rw [h4]
_ = - (β i in Icc 1 (N - 1), z i) := by ring
have h11 : βz 0β = β(β i β Icc 1 (N - 1), z i : β)β := by
rw [h3]
simp [norm_neg]
have h12 : β(β i β Icc 1 (N - 1), z i : β)β β€ β i β Icc 1 (N - 1), βz iβ := by
apply norm_sum_le
linarith [h11, h12]
Β·
have h5 : N = 1 := by
omega
rw [h5]
have h6 : β i in range 1, z i = 0 := by
have h7 : N = 1 := h5
have h8 : β i in range N, z i = 0 := hsum
rw [show N = 1 by omega] at h8
simpa using h8
have h7 : z 0 = 0 := by
simp at h6
all_goals
try tauto
rw [show z 0 = (0 : β) by exact h7]
simp
open Finset in
theorem sum_pow_one_half_from_two_to_N {N : β} (hN : 0 < N) :
β j β Icc 1 (N - 1), (1 / 2 : β) ^ (j + 1) = 1 / 2 - 1 / 2 ^ N := by
have h2 : β (k : β), β j in Icc 1 (k), (1 / 2 : β) ^ (j + 1) = 1 / 2 - 1 / 2 ^ (k + 1 : β) := by
intro k
induction k with
| zero =>
simp
| succ k ihk =>
rw [sum_Icc_succ_top (by linarith)]
rw [ihk]
field_simp at *
ring_nf
have h4 : β j in Icc 1 (N - 1), (1 / 2 : β) ^ (j + 1) = 1 / 2 - 1 / 2 ^ (N : β) := by
have h5 : β j in Icc 1 (N - 1), (1 / 2 : β) ^ (j + 1) = 1 / 2 - 1 / 2 ^ ((N - 1 : β) + 1 : β) := by
apply h2 (N - 1)
rw [h5]
have h6 : (N - 1 : β) + 1 = N := by
omega
rw [h6]
exact h4
open Complex Finset in
theorem sum_eq_zero_of_sq_add_sq_eq_geom_seq_false {N : β} (hN : 0 < N) (x y : Fin N β β)
(hxsum : β i : Fin N, x i = 0) (hysum : β i : Fin N, y i = 0)
(hxy : β j : Fin N, (x j)^2 + (y j)^2 = (1/4)^(j.1+1)) :
False := by
let z : β β β := fun i => if h : i < N then (x β¨i, hβ© : β) + (y β¨i, hβ© : β) * I else 0
have hsumz : β i in range N, z i = 0 := by
have h4 : β i in range N, z i = (β i : Fin N, ((x i : β) + (y i : β) * I)) := by
simp [z, Finset.sum_fin_eq_sum_range, Finset.sum_congr]
rw [h4]
simp [Complex.ext_iff, Finset.sum_congr, hxsum, hysum]
have hz0 : βz 0β β€ β i β Icc 1 (N - 1), βz iβ := norm_le_sum_norm_of_sum_eq_zero hN z hsumz
have h2 : βz 0β = (1 / 2 : β) := by
have h6 : z 0 = (x β¨0, by omegaβ© : β) + (y β¨0, by omegaβ© : β) * I := by
have h7 : (0 : β) < N := by omega
simp [z, h7]
rw [h6]
have h7 : ((x β¨0, by omegaβ© : β) ^ 2 + (y β¨0, by omegaβ© : β) ^ 2) = (1 / 4 : β) := by
have h8 := hxy (β¨0, by omegaβ©)
norm_num at h8 β’
nlinarith
have h8 : β((x β¨0, by omegaβ© : β) + (y β¨0, by omegaβ© : β) * I)β = Real.sqrt ((x β¨0, by omegaβ© : β) ^ 2 + (y β¨0, by omegaβ© : β) ^ 2) := by
simp [Complex.norm_eq_abs, Complex.abs, Complex.normSq]
all_goals
ring_nf
rw [h8, h7]
have h9 : Real.sqrt ((1 / 4 : β)) = (1 / 2 : β) := by
rw [Real.sqrt_eq_iff_mul_self_eq] <;> norm_num
rw [h9]
have h3 : β i β Icc 1 (N - 1), βz iβ = β i β Icc 1 (N - 1), (1 / 2 : β) ^ (i + 1 : β) := by
apply Finset.sum_congr
.
rfl
.
intro i hi
have h10 : i β₯ 1 := by
have h11 : i β Icc (1 : β) (N - 1) := by
simpa using hi
simp at h11 β’
omega
have h11 : i β€ N - 1 := by
have h12 : i β Icc (1 : β) (N - 1) := by
simpa using hi
simp at h12 β’
omega
have h12 : i < N := by
omega
have h13 : z i = (x β¨i, by omegaβ© : β) + (y β¨i, by omegaβ© : β) * I := by
have h14 : i < N := by omega
simp [z, h14]
have h14 : βz iβ = (1 / 2 : β) ^ (i + 1 : β) := by
rw [h13]
have h14 : ((x β¨i, by omegaβ© : β) ^ 2 + (y β¨i, by omegaβ© : β) ^ 2) = (1 / 4 : β) ^ (i + 1 : β) := by
have h15 := hxy (β¨i, by omegaβ©)
norm_num at h15 β’
nlinarith
have h15 : β((x β¨i, by omegaβ© : β) + (y β¨i, by omegaβ© : β) * I)β = Real.sqrt ((x β¨i, by omegaβ© : β) ^ 2 + (y β¨i, by omegaβ© : β) ^ 2) := by
simp [Complex.norm_eq_abs, Complex.abs, Complex.normSq]
all_goals
ring_nf
rw [h15]
have h16 : Real.sqrt ((x β¨i, by omegaβ© : β) ^ 2 + (y β¨i, by omegaβ© : β) ^ 2) = (1 / 2 : β) ^ (i + 1 : β) := by
have h17 : ((x β¨i, by omegaβ© : β) ^ 2 + (y β¨i, by omegaβ© : β) ^ 2) = (1 / 4 : β) ^ (i + 1 : β) := h14
rw [h17]
have h18 : Real.sqrt ((1 / 4 : β) ^ (i + 1 : β)) = (1 / 2 : β) ^ (i + 1 : β) := by
have h19 : ((1 / 4 : β) : β) = (1 / 2 : β) ^ (2 : β) := by norm_num
rw [h19]
have h20 : Real.sqrt (((1 / 2 : β) ^ (2 : β)) ^ (i + 1 : β)) = (1 / 2 : β) ^ (i + 1 : β) := by
have h21 : ((1 / 2 : β) ^ (2 : β)) ^ (i + 1 : β) = ((1 / 2 : β) ^ (i + 1 : β)) ^ 2 := by
have h22 : ((1 / 2 : β) ^ (2 : β)) ^ (i + 1 : β) = (1 / 2 : β) ^ (2 * (i + 1 : β) : β) := by
rw [β pow_mul]
all_goals ring_nf
have h23 : ((1 / 2 : β) ^ (i + 1 : β)) ^ 2 = (1 / 2 : β) ^ (2 * (i + 1 : β) : β) := by
ring_nf
rw [h22, h23]
rw [h21]
have h22 : Real.sqrt (((1 / 2 : β) ^ (i + 1 : β)) ^ 2) = (1 / 2 : β) ^ (i + 1 : β) := by
rw [Real.sqrt_sq (show (1 / 2 : β) ^ (i + 1 : β) β₯ 0 by positivity)]
rw [h22]
exact h20
exact h18
exact h16
linarith
rw [h3] at hz0
rw [h2] at hz0
have h6 : β i β Icc 1 (N - 1), (1 / 2 : β) ^ (i + 1 : β) = (1 / 2 - 1 / 2 ^ N : β) := by
have h7 : β i β Icc 1 (N - 1), (1 / 2 : β) ^ (i + 1 : β) = (1 / 2 - 1 / 2 ^ N : β) := sum_pow_one_half_from_two_to_N hN
linarith
rw [h6] at hz0
have h7 : (1 / 2 ^ N : β) > 0 := by
have h10 : (1 / 2 ^ N : β) > 0 := by
have h11 : (1 / 2 ^ N : β) = (1 / 2 : β) ^ N := by
ring_nf
rw [h11]
positivity
linarith
have h11 : (1 / 2 - 1 / 2 ^ N : β) < (1 / 2 : β) := by
have h12 : (1 / 2 ^ N : β) > 0 := h7
linarith
linarith
open BigOperators Real Nat Topology Rat in
theorem sum_cexp_div_pow_ne_zero {N : β} (hN : N > 0) (a : Fin N β β) :
((β j : Fin N, cos (a j) / 2 ^ (j : β))^2 +
(β j : Fin N, sin (a j) / 2 ^ (j : β))^2) > 0 := by
by_contra h
push_neg at h
have h1 : (β j : Fin N, cos (a j) / 2 ^ (j : β)) ^ 2 β₯ 0 := by
apply sq_nonneg
have h2 : (β j : Fin N, sin (a j) / 2 ^ (j : β)) ^ 2 β₯ 0 := by
apply sq_nonneg
have h3 : (β j : Fin N, cos (a j) / 2 ^ (j : β)) ^ 2 = 0 := by
linarith
have h4 : (β j : Fin N, sin (a j) / 2 ^ (j : β)) ^ 2 = 0 := by
linarith
have h5 : (β j : Fin N, cos (a j) / 2 ^ (j : β)) = 0 := by
nlinarith
have h6 : (β j : Fin N, sin (a j) / 2 ^ (j : β)) = 0 := by
nlinarith
let x : Fin N β β := fun j => cos (a j) / 2 ^ (j.1 + 1 : β)
let y : Fin N β β := fun j => sin (a j) / 2 ^ (j.1 + 1 : β)
have hxsum : β i : Fin N, x i = 0 := by
have eq2 : β i : Fin N, x i = (β i : Fin N, (cos (a i) / 2 ^ (i.1 : β) : β)) / 2 := by
have eq3 : β i : Fin N, x i = β i : Fin N, (cos (a i) / 2 ^ (i.1 + 1 : β) : β) := by
congr
rw [eq3]
have eq4 : β i : Fin N, (cos (a i) / 2 ^ (i.1 + 1 : β) : β) = (β i : Fin N, (cos (a i) / 2 ^ (i.1 : β) : β)) / 2 := by
calc
β i : Fin N, (cos (a i) / 2 ^ (i.1 + 1 : β) : β)
= β i : Fin N, ((cos (a i) / 2 ^ (i.1 : β) : β) / 2) := by
apply Finset.sum_congr
.
rfl
intro i hi
have h12 : (cos (a i) / 2 ^ (i.1 + 1 : β) : β) = (cos (a i) / 2 ^ (i.1 : β) : β) / 2 := by
have h11 : (2 : β) ^ (i.1 + 1 : β) = (2 : β) ^ (i.1 : β) * 2 := by
ring_nf
rw [h11]
field_simp
all_goals ring
exact h12
_ = (β i : Fin N, (cos (a i) / 2 ^ (i.1 : β) : β)) / 2 := by
simp [Finset.sum_div]
exact eq4
rw [eq2]
rw [show β i : Fin N, (cos (a i) / 2 ^ (i.1 : β) : β) = (0 : β) by simpa using h5]
all_goals norm_num
have hysum : β i : Fin N, y i = 0 := by
have eq2 : β i : Fin N, y i = (β i : Fin N, (sin (a i) / 2 ^ (i.1 : β) : β)) / 2 := by
have eq3 : β i : Fin N, y i = β i : Fin N, (sin (a i) / 2 ^ (i.1 + 1 : β) : β) := by
congr
rw [eq3]
have eq4 : β i : Fin N, (sin (a i) / 2 ^ (i.1 + 1 : β) : β) = (β i : Fin N, (sin (a i) / 2 ^ (i.1 : β) : β)) / 2 := by
calc
β i : Fin N, (sin (a i) / 2 ^ (i.1 + 1 : β) : β)
= β i : Fin N, ((sin (a i) / 2 ^ (i.1 : β) : β) / 2) := by
apply Finset.sum_congr
.
rfl
intro i hi
have h12 : (sin (a i) / 2 ^ (i.1 + 1 : β) : β) = (sin (a i) / 2 ^ (i.1 : β) : β) / 2 := by
have h11 : (2 : β) ^ (i.1 + 1 : β) = (2 : β) ^ (i.1 : β) * 2 := by
ring_nf
rw [h11]
field_simp
all_goals ring
exact h12
_ = (β i : Fin N, (sin (a i) / 2 ^ (i.1 : β) : β)) / 2 := by
simp [Finset.sum_div]
exact eq4
rw [eq2]
rw [show β i : Fin N, (sin (a i) / 2 ^ (i.1 : β) : β) = (0 : β) by simpa using h6]
all_goals norm_num
have hxy : β j : Fin N, (x j) ^ 2 + (y j) ^ 2 = (1 / 4 : β) ^ (j.1 + 1 : β) := by
intro j
have h11 : cos (a j) ^ 2 + sin (a j) ^ 2 = 1 := by
exact Real.cos_sq_add_sin_sq (a j)
simp [x, y]
have h12 : (4 : β) ^ (j.1 + 1 : β) = (2 : β) ^ (2 * (j.1 + 1 : β)) := by
have h13 : (4 : β) = (2 : β) ^ (2 : β) := by norm_num
rw [h13]
rw [β pow_mul]
have h13 : (2 : β) ^ (2 * (j.1 + 1 : β)) = ((2 : β) ^ (j.1 + 1 : β)) ^ 2 := by
ring_nf
field_simp [h11, h12, h13]
all_goals
ring_nf
have h12 : False := sum_eq_zero_of_sq_add_sq_eq_geom_seq_false (show 0 < N by omega) x y hxsum hysum hxy
tauto
open BigOperators Real Nat Topology Rat in
theorem mul_cos_sub_mul_sin_eq_mul_cos_add (C S : β) (h : C ^ 2 + S ^ 2 β 0) :
β R Ξ± : β, R > 0 β§ β x : β, C * cos x - S * sin x = R * cos (x + Ξ±) := by
have h1 : C ^ 2 + S ^ 2 > 0 := by
by_contra h2
push_neg at h2
have h3 : C ^ 2 β₯ 0 := sq_nonneg C
have h4 : S ^ 2 β₯ 0 := sq_nonneg S
have h5 : C ^ 2 + S ^ 2 = 0 := by nlinarith
tauto
have h2 : β R : β, R > 0 β§ R ^ 2 = C ^ 2 + S ^ 2 := by
use Real.sqrt (C ^ 2 + S ^ 2)
constructor
Β· -- Show that sqrt (C^2 + S^2) > 0
apply Real.sqrt_pos.mpr
linarith
Β· -- Show that (sqrt (C^2 + S^2)) ^ 2 = C^2 + S^2
rw [Real.sq_sqrt]
positivity
rcases h2 with β¨R, hR_pos, hR_sqβ©
have h12 : (C / R) ^ 2 + (S / R) ^ 2 = 1 := by
have h14 : R β 0 := by linarith
have h15 : R ^ 2 = C ^ 2 + S ^ 2 := hR_sq
field_simp at *
nlinarith
have h13 : β Ξ± : β, Real.cos Ξ± = C / R β§ Real.sin Ξ± = S / R := by
have h9 : (C / R) ^ 2 + (S / R) ^ 2 = 1 := h12
by_cases hC : C β₯ 0
Β· -- C β₯ 0
use Real.arcsin (S / R)
constructor
Β· -- Show cos (arcsin (S / R)) = C / R
have h14 : Real.cos (Real.arcsin (S / R)) = Real.sqrt (1 - (S / R) ^ 2) := by
rw [Real.cos_arcsin]
have h15 : Real.sqrt (1 - (S / R) ^ 2) = C / R := by
have h151 : (C / R) ^ 2 = 1 - (S / R) ^ 2 := by
linarith [h9]
have h161 : Real.sqrt (1 - (S / R) ^ 2) = Real.sqrt ((C / R) ^ 2) := by
rw [show 1 - (S / R) ^ 2 = (C / R) ^ 2 by linarith [h9]]
rw [h161]
have h171 : Real.sqrt ((C / R) ^ 2) = C / R := by
apply Real.sqrt_sq (show 0 β€ C / R by
have h211 : R > 0 := hR_pos
apply div_nonneg
linarith
linarith
)
rw [h171]
rw [h14, h15]
Β· -- Show sin (arcsin (S / R)) = S / R
have h20 : -1 β€ S / R := by
have h6 : (S / R) ^ 2 β€ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
have h21 : S / R β€ 1 := by
have h6 : (S / R) ^ 2 β€ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
have h18 : Real.sin (Real.arcsin (S / R)) = S / R := by
apply Real.sin_arcsin
all_goals linarith
rw [h18]
Β· -- C < 0
have hC2 : C < (0 : β) := by linarith
have hC3 : C / R < 0 := by
have hR_pos1 : R > 0 := hR_pos
apply div_neg_of_neg_of_pos hC2 (by linarith)
use Real.pi - Real.arcsin (S / R)
constructor
Β· -- Show cos (Ο - arcsin (S / R)) = C / R
have h28 : Real.cos (Real.pi - Real.arcsin (S / R)) = - Real.sqrt (1 - (S / R) ^ 2) := by
have h1 : Real.cos (Real.pi - Real.arcsin (S / R)) = - Real.cos (Real.arcsin (S / R)) := by
rw [Real.cos_pi_sub]
have h2 : Real.cos (Real.arcsin (S / R)) = Real.sqrt (1 - (S / R) ^ 2) := by
rw [Real.cos_arcsin]
rw [h1, h2]
have h29 : Real.sqrt (1 - (S / R) ^ 2) = - (C / R) := by
have h301 : (C / R) ^ 2 = 1 - (S / R) ^ 2 := by
linarith [h9]
have h311 : Real.sqrt (1 - (S / R) ^ 2) = Real.sqrt ((C / R) ^ 2) := by
rw [show 1 - (S / R) ^ 2 = (C / R) ^ 2 by linarith [h9]]
rw [h311]
have h321 : Real.sqrt ((C / R) ^ 2) = - (C / R) := by
have h331 : C / R < 0 := hC3
have : Real.sqrt ((C / R) ^ 2) = - (C / R) := by
rw [Real.sqrt_sq_eq_abs]
rw [abs_of_neg h331]
linarith
linarith
rw [h28, h29]
all_goals nlinarith
Β· -- Show sin (Ο - arcsin (S / R)) = S / R
have h30 : Real.sin (Real.pi - Real.arcsin (S / R)) = Real.sin (Real.arcsin (S / R)) := by
rw [Real.sin_pi_sub]
have h31 : Real.sin (Real.arcsin (S / R)) = S / R := by
have h20 : -1 β€ S / R := by
have h6 : (S / R) ^ 2 β€ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
have h21 : S / R β€ 1 := by
have h6 : (S / R) ^ 2 β€ 1 := by nlinarith [h9]
nlinarith [sq_nonneg (S / R - 1), sq_nonneg (S / R + 1)]
apply Real.sin_arcsin
all_goals linarith
rw [h30, h31]
rcases h13 with β¨Ξ±, h14, h15β©
use R, Ξ±
constructor
Β· -- Show R > 0
linarith
Β· -- Show β x : β, C * cos x - S * sin x = R * cos (x + Ξ±)
intro x
have h21 : Real.cos (x + Ξ±) = Real.cos x * Real.cos Ξ± - Real.sin x * Real.sin Ξ± := by
rw [Real.cos_add]
have h22 : Real.cos Ξ± = C / R := by
linarith [h14]
have h23 : Real.sin Ξ± = S / R := by
linarith [h15]
calc
C * Real.cos x - S * Real.sin x
= R * (Real.cos x * (C / R) - Real.sin x * (S / R)) := by
field_simp [show R β 0 by linarith]
all_goals ring
_ = R * (Real.cos x * Real.cos Ξ± - Real.sin x * Real.sin Ξ±) := by
rw [show Real.cos Ξ± = C / R by linarith [h14], show Real.sin Ξ± = S / R by linarith [h15]]
_ = R * Real.cos (x + Ξ±) := by
rw [h21]
theorem sum_cos_div_two_pow_eq_mul_cos (N : β) (a : β β β) (hN : N > 0) :
β R0 Ξ± : β, R0 > 0 β§ β x : β, β j : Fin N, Real.cos (a j + x) / 2 ^ j.1 =
(R0 : β) * Real.cos (x + Ξ±) := by
have h2 : ((β j : Fin N, Real.cos (a j) / 2 ^ (j : β)) ^ 2 + (β j : Fin N, Real.sin (a j) / 2 ^ (j : β)) ^ 2) > 0 := by
apply sum_cexp_div_pow_ne_zero (by omega) (fun j : Fin N => a j)
have h4 : (β j : Fin N, Real.cos (a j) / 2 ^ (j : β)) ^ 2 + (β j : Fin N, Real.sin (a j) / 2 ^ (j : β)) ^ 2 β 0 := by
linarith
have h5 : β R Ξ± : β, R > 0 β§ β x : β, (β j : Fin N, Real.cos (a j) / 2 ^ (j : β)) * Real.cos x - (β j : Fin N, Real.sin (a j) / 2 ^ (j : β)) * Real.sin x = R * Real.cos (x + Ξ±) := by
apply mul_cos_sub_mul_sin_eq_mul_cos_add (β j : Fin N, Real.cos (a j) / 2 ^ (j : β)) (β j : Fin N, Real.sin (a j) / 2 ^ (j : β)) (by
exact h4
)
rcases h5 with β¨R0, Ξ±, hR0_pos, h_eqβ©
use R0, Ξ±
constructor
.
exact hR0_pos
.
intro x
have h_eq2 : β j : Fin N, Real.cos (a j + x) / 2 ^ j.1 = (β j : Fin N, Real.cos (a j) / 2 ^ (j : β)) * Real.cos x - (β j : Fin N, Real.sin (a j) / 2 ^ (j : β)) * Real.sin x := by
have h8 : β j : Fin N, Real.cos (a j + x) / (2 : β) ^ (j : β) = (Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : β) ^ (j : β) := by
intro j
have h9 : Real.cos (a j + x) = Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x := by
rw [Real.cos_add]
rw [h9]
have h10 : β j : Fin N, Real.cos (a j + x) / 2 ^ j.1 = β j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : β) ^ (j : β)) := by
apply Finset.sum_congr
.
rfl
.
intro j hj
have h9 : Real.cos (a j + x) / (2 : β) ^ (j : β) = (Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : β) ^ (j : β) := h8 j
simpa using h9
rw [h10]
have h11 : β j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : β) ^ (j : β)) =
(β j : Fin N, Real.cos (a j) / (2 : β) ^ (j : β)) * Real.cos x - (β j : Fin N, Real.sin (a j) / (2 : β) ^ (j : β)) * Real.sin x := by
have h12 : β j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : β) ^ (j : β)) =
(Real.cos (a j) / (2 : β) ^ (j : β)) * Real.cos x - (Real.sin (a j) / (2 : β) ^ (j : β)) * Real.sin x := by
intro j
ring_nf
calc
β j : Fin N, ((Real.cos (a j) * Real.cos x - Real.sin (a j) * Real.sin x) / (2 : β) ^ (j : β))
= β j : Fin N, ((Real.cos (a j) / (2 : β) ^ (j : β)) * Real.cos x - (Real.sin (a j) / (2 : β) ^ (j : β)) * Real.sin x) := by
apply Finset.sum_congr
.
rfl
.
intro j hj
specialize h12 j
linarith
_ = (β j : Fin N, (Real.cos (a j) / (2 : β) ^ (j : β)) * Real.cos x) - (β j : Fin N, (Real.sin (a j) / (2 : β) ^ (j : β)) * Real.sin x) := by
rw [Finset.sum_sub_distrib]
_ = (β j : Fin N, Real.cos (a j) / (2 : β) ^ (j : β)) * Real.cos x - (β j : Fin N, Real.sin (a j) / (2 : β) ^ (j : β)) * Real.sin x := by
rw [Finset.sum_mul, Finset.sum_mul]
exact h11
rw [h_eq2]
specialize h_eq x
linarith
open Real Set in
theorem sub_eq_int_mul_pi_of_cos_eq_zero :
β (ΞΈβ ΞΈβ : β), cos ΞΈβ = 0 β cos ΞΈβ = 0 β β m : β€, ΞΈβ - ΞΈβ = m * Ο := by
intro ΞΈβ ΞΈβ hβ hβ
have h1 : β k : β€, ΞΈβ = Real.pi / 2 + k * Real.pi := by
rw [Real.cos_eq_zero_iff] at hβ
rcases hβ with β¨k, hkβ©
use k
linarith
rcases h1 with β¨k, hk1β©
have h2 : β l : β€, ΞΈβ = Real.pi / 2 + l * Real.pi := by
rw [Real.cos_eq_zero_iff] at hβ
rcases hβ with β¨l, hlβ©
use l
linarith
rcases h2 with β¨l, hl2β©
use l - k
rw [hl2, hk1]
field_simp at *
<;> ring_nf <;> norm_num
open BigOperators Real Nat Topology Rat in
theorem imo_1969_p2 (m n : β) (k : β) (a : β β β) (y : β β β) (hβ : 0 < k)
(hβ : β x, y x = β i in Finset.range k, Real.cos (a i + x) / 2 ^ i) (hβ : y m = 0)
(hβ : y n = 0) : β t : β€, m - n = t * Real.pi := by
have h4 : y m = β i in Finset.range k, Real.cos (a i + m) / 2 ^ i := by
specialize hβ m
simpa using hβ
have h5 : y n = β i in Finset.range k, Real.cos (a i + n) / 2 ^ i := by
specialize hβ n
simpa using hβ
rw [h4] at hβ
rw [h5] at hβ
have h9 : β R0 Ξ± : β, R0 > 0 β§ β x : β, β j : Fin k, Real.cos (a j + x) / 2 ^ j.1 = R0 * Real.cos (x + Ξ±) := by
apply sum_cos_div_two_pow_eq_mul_cos k (fun (j : β) => a j) (by omega)
rcases h9 with β¨R0, Ξ±, hR0, h_eq3β©
have h10 : β i in Finset.range k, Real.cos (a i + m) / 2 ^ i = R0 * Real.cos (m + Ξ±) := by
have h11 : β i in Finset.range k, Real.cos (a i + m) / 2 ^ i = β j : Fin k, Real.cos (a j + m) / 2 ^ j.1 := by
simp [Finset.sum_range]
rw [h11]
specialize h_eq3 m
simpa using h_eq3
have h11 : β i in Finset.range k, Real.cos (a i + n) / 2 ^ i = R0 * Real.cos (n + Ξ±) := by
have h12 : β i in Finset.range k, Real.cos (a i + n) / 2 ^ i = β j : Fin k, Real.cos (a j + n) / 2 ^ j.1 := by
simp [Finset.sum_range]
rw [h12]
specialize h_eq3 n
simpa using h_eq3
have h10' : R0 * Real.cos (m + Ξ±) = 0 := by
have h_eq10 : β i in Finset.range k, Real.cos (a i + m) / 2 ^ i = 0 := by
linarith [hβ, h10]
linarith [h10, h_eq10]
have h11' : R0 * Real.cos (n + Ξ±) = 0 := by
have h_eq11 : β i in Finset.range k, Real.cos (a i + n) / 2 ^ i = 0 := by
linarith [hβ, h11]
linarith [h11, h_eq11]
have h12 : Real.cos (m + Ξ±) = 0 := by
apply (mul_left_inj' (ne_of_gt hR0)).mp
linarith
have h13 : Real.cos (n + Ξ±) = 0 := by
apply (mul_left_inj' (ne_of_gt hR0)).mp
linarith
have h14 : β m1 : β€, (n + Ξ±) - (m + Ξ±) = m1 * Real.pi := by
apply sub_eq_int_mul_pi_of_cos_eq_zero (m + Ξ±) (n + Ξ±) h12 h13
rcases h14 with β¨m1, hm1β©
use -m1
have hm4 : (m : β) - (n : β) = - (m1 * Real.pi) := by
have hm3 : (n + Ξ± : β) - (m + Ξ± : β) = (m1 : β) * Real.pi := by
linarith [hm1]
nlinarith
have hm5 : (-m1 : β€) * Real.pi = - (m1 * Real.pi) := by
simp
rw [hm5]
linarith [hm4]
Error fixing example
In the following example, the initial proof timed out because directly applying simp
to ((Real.sqrt 3 / 2 : β) + (1 / 2) * Complex.I) ^ 12
resulted in excessive search complexity. Breaking down the proof by step-by-step calculating intermediate powers like $z^3$, then $z^6$, and finally $z^{12}$ significantly reduced the computational burden on the compiler.
Sample proof of Error fixing
Proof before error fixing
import Mathlib
theorem algebra_299086 {z : β} (hz : z = (Real.sqrt 3 / 2) + (1 / 2) * .I) : z ^ 2016 = 1 := by
have h1 : z = (Real.sqrt 3 / 2) + (1 / 2) * Complex.I := hz
-- Prove z^12 = 1
have h2 : z ^ 12 = 1 := by
rw [h1]
have h3 : ((Real.sqrt 3 / 2 : β) + (1 / 2) * Complex.I) ^ 12 = 1 := by
simp [pow_succ, pow_zero, Complex.ext_iff, Complex.I_mul_I, mul_add, add_mul, mul_comm, mul_assoc, pow_two, pow_three, pow_succ, Real.sqrt_sq]
field_simp [Complex.ext_iff, mul_add, add_mul, pow_two, pow_three]
<;> ring_nf <;> norm_num [Real.sqrt_sq]
exact h3
calc
z ^ 2016 = (z ^ 12) ^ 168 := by ring
_ = 1 ^ 168 := by rw [h2]
_ = 1 := by norm_num
Error message
Error message:
Lean process timed out
Proof after error fixing
import Mathlib
theorem algebra_299086 {z : β} (hz : z = (Real.sqrt 3 / 2) + (1 / 2) * .I) : z ^ 2016 = 1 := by
have h1 : z = (Real.sqrt 3 / 2) + (1 / 2) * Complex.I := by
exact_mod_cast hz
have h2 : z ^ 3 = Complex.I := by
rw [h1]
simp [pow_three, Complex.ext_iff, Complex.add_re, Complex.add_im, Complex.mul_re, Complex.mul_im, mul_add, add_mul, pow_two, mul_comm, mul_assoc]
field_simp
<;> ring_nf <;> norm_num
have h3 : z ^ 6 = -1 := by
calc
z ^ 6 = (z ^ 3) ^ 2 := by ring
_ = Complex.I ^ 2 := by rw [h2]
_ = -1 := by simp [Complex.I_sq]
have h4 : z ^ 12 = 1 := by
calc
z ^ 12 = (z ^ 6) ^ 2 := by ring
_ = (-1) ^ 2 := by rw [h3]
_ = 1 := by norm_num
calc
z ^ 2016 = (z ^ 12) ^ 168 := by
ring_nf
_ = (1) ^ 168 := by
rw [h4]
_ = 1 := by
norm_num
Citation
If you'd like to cite Kimina-Prover, please feel free to reference our preview paper for now. A full paper is currently in preparation and will be available soon.
@article{kimina_prover_2025,
title = {Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning},
author = {Wang, Haiming and Unsal, Mert and Lin, Xiaohan and Baksys, Mantas and Liu, Junqi and Santos, Marco Dos and Sung, Flood and Vinyes, Marina and Ying, Zhenzhe and Zhu, Zekai and Lu, Jianqiao and SaxcΓ©, Hugues de and Bailey, Bolton and Song, Chendong and Xiao, Chenjun and Zhang, Dehao and Zhang, Ebony and Pu, Frederick and Zhu, Han and Liu, Jiawei and Bayer, Jonas and Michel, Julien and Yu, Longhui and Dreyfus-Schmidt, LΓ©o and Tunstall, Lewis and Pagani, Luigi and Machado, Moreira and Bourigault, Pauline and Wang, Ran and Polu, Stanislas and Barroyer, Thibaut and Li, Wen-Ding and Niu, Yazhe and Fleureau, Yann and Hu, Yangyang and Yu, Zhouliang and Wang, Zihan and Yang, Zhilin and Liu, Zhengying and Li, Jia},
year = {2025},
url = {http://arxiv.org/abs/2504.11354},
}
References
[1]Team, Kimi, et al. "Kimi k1. 5: Scaling reinforcement learning with llms." arXiv preprint arXiv:2501.12599 (2025).
[2]Qwen, et al. "Qwen2.5 Technical Report" arXiv preprint arXiv:2412.15115 (2024).
[3]Yang, An, et al. "Qwen3 technical report." arXiv preprint arXiv:2505.09388 (2025).
[4]https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
[5]Ren, Z. Z., et al. "Deepseek-prover-v2: Advancing formal mathematical reasoning via reinforcement learning for subgoal decomposition." arXiv preprint arXiv:2504.21801 (2025).
[6]Cao, Chenrui, et al. "Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models." arXiv preprint arXiv:2506.11487 (2025).
[7]Wang, Haiming, et al. "Kimina-prover preview: Towards large formal reasoning models with reinforcement learning." arXiv preprint arXiv:2504.11354 (2025).
[8]https://assets.anthropic.com/m/785e231869ea8b3b/original/claude-3-7-sonnet-system-card.pdf
[9]Li, Jia, et al. "Numinamath: The largest public dataset in ai4maths with 860k pairs of competition math problems and solutions." Hugging Face repository 13 (2024): 9.
[10]Liu, Junqi, et al. "CombiBench: Benchmarking LLM capability for combinatorial mathematics." arXiv preprint arXiv:2505.03171 (2025).