---
license: apache-2.0
base_model:
  - Qwen/Qwen2.5-Math-7B
pipeline_tag: text-generation
tags:
  - lean4
  - step-prover
---

# 🚀 BFS-Prover-V2: Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers


**State-of-the-art tactic generation model in Lean 4**

This repository contains the latest tactic generator checkpoint from BFS-Prover-V2, a state-of-the-art step-level theorem-proving system in Lean 4. While the full BFS-Prover-V2 system integrates multiple components for scalable theorem proving, we are releasing the core tactic generation model here. Given a proof state in Lean 4, the model generates a tactic that transforms the current proof state into a new state, progressively working toward completing the proof.

📄 **Paper**: [Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers](https://arxiv.org/abs/2509.06493)

## ✨ Model Details

- **Base Model:** Qwen/Qwen2.5-Math-7B
- **Training Approach:** Multi-stage expert iteration with best-first tree search (see the search sketch after this list)
- **Training Data Sources:**
  - Mathlib (via LeanDojo)
  - Lean-Github repositories
  - Autoformalized NuminaMath datasets
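
The search scaffolding is not part of this release, so the following is only a minimal sketch of how a tactic generator can drive best-first proof search. The `env` object and `generate_tactics` callable are hypothetical stand-ins for a Lean interaction layer (such as LeanDojo) and for sampled model outputs; ranking states by cumulative log-probability is a common convention for this kind of search, not a detail confirmed by this card.

```python
import heapq

def best_first_search(env, generate_tactics, init_state, max_expansions=600):
    """Minimal best-first proof search driven by a tactic generator.

    Hypothetical interfaces (not part of this release):
      env.apply_tactic(state, tactic) -> successor proof state, or None on failure
      env.is_solved(state)            -> True when no goals remain
      generate_tactics(state)         -> list of (tactic, log_prob) candidates
    """
    # Min-heap keyed by negated cumulative log-probability, so the most
    # probable partial proof is expanded first.
    frontier = [(0.0, init_state, [])]
    seen = {init_state}

    for _ in range(max_expansions):
        if not frontier:
            return None  # search space exhausted without a proof
        neg_logprob, state, path = heapq.heappop(frontier)
        for tactic, logprob in generate_tactics(state):
            next_state = env.apply_tactic(state, tactic)
            if next_state is None or next_state in seen:
                continue  # tactic failed in Lean, or state already enqueued
            if env.is_solved(next_state):
                return path + [tactic]  # complete tactic proof
            seen.add(next_state)
            heapq.heappush(frontier, (neg_logprob - logprob, next_state, path + [tactic]))
    return None  # expansion budget exhausted
```

In the full BFS-Prover-V2 system, this single-agent loop is further extended with a planner model that decomposes hard goals into subgoals searched by multiple agents in parallel.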

## 📈 Performance

When integrated with the planner-based multi-agent tree search system, BFS-Prover-V2-32B achieves 95.08% on the miniF2F test set, significantly outperforming all previous step-provers. The model also generalizes well to undergraduate-level mathematics, independently reaching 41.4% on the ProofNet test set without a planner.

βš™οΈ Usage

  • The model expects Lean4 tactic states in the format "{state}:::"
  • ::: serves as a special indicator to signal the model to generate a tactic for the given state.
  • The model will echo back the input state followed by the generated tactic.
```python
# Example: load the tactic generator and produce a tactic for a proof state
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ByteDance-Seed/BFS-Prover-V2-7B")
tokenizer = AutoTokenizer.from_pretrained("ByteDance-Seed/BFS-Prover-V2-7B")

# Proof state of imo_1964_p2 from miniF2F
state = """a b c : ℝ
h₀ : 0 < a ∧ 0 < b ∧ 0 < c
h₁ : c < a + b
h₂ : b < a + c
h₃ : a < b + c
⊢ a ^ 2 * (b + c - a) + b ^ 2 * (c + a - b) + c ^ 2 * (a + b - c) ≤ 3 * a * b * c"""

# Append the ::: separator to signal the model to generate a tactic
sep = ":::"
prompt = state + sep
inputs = tokenizer(prompt, return_tensors="pt")

# Cap generation length; the default (20 new tokens) can truncate longer tactics
outputs = model.generate(**inputs, max_new_tokens=128)

# The model echoes the input state, so keep only the text after the separator
tactic = tokenizer.decode(outputs[0], skip_special_tokens=True).split(sep)[-1].strip()
print(tactic)
# Expected output: nlinarith [sq_nonneg (a - b), sq_nonneg (c - a), sq_nonneg (b - c)]
```
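
For reference, the generated tactic closes this goal. Below is a sketch of the full `imo_1964_p2` theorem in Lean 4, reconstructed from the proof state above and assuming a Mathlib import; the exact statement formatting in miniF2F may differ.

```lean
import Mathlib

theorem imo_1964_p2 (a b c : ℝ) (h₀ : 0 < a ∧ 0 < b ∧ 0 < c)
    (h₁ : c < a + b) (h₂ : b < a + c) (h₃ : a < b + c) :
    a ^ 2 * (b + c - a) + b ^ 2 * (c + a - b) + c ^ 2 * (a + b - c) ≤ 3 * a * b * c := by
  -- tactic produced by the model for the state above
  nlinarith [sq_nonneg (a - b), sq_nonneg (c - a), sq_nonneg (b - c)]
```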

## 📚 Citation

If you use this model in your research, please cite our paper:

```bibtex
@article{xin2025bfsproverv2,
  title={Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers},
  author={Xin, Ran and Zheng, Zeyu and Nie, Yanchen and Yuan, Kun and Xiao, Xia},
  journal={arXiv preprint arXiv:2509.06493},
  year={2025}
}
```

## 📄 License

This model is released under the [Apache 2.0 license](https://choosealicense.com/licenses/apache-2.0/).

## 📧 Contact

For questions and feedback about the tactic generator model, please contact: