Hebrew Math Tutor

Hebrew Math Tutor is a specialized mathematical reasoning model that provides step-by-step solutions to math problems in Hebrew. Built on Qwen3-4B-Thinking-2507, this model bridges the gap between advanced AI mathematical capabilities and Hebrew-language education.

  • 🎯 Model ID: Intel/hebrew-math-tutor-v1
  • πŸ—οΈ Base Model: Qwen3-4B-Thinking-2507
  • πŸ›οΈ Architecture: Decoder-only causal language model (~4B parameters)
  • πŸ—£οΈ Primary Language: Hebrew (retains multilingual capabilities)
  • πŸ“„ License: Apache-2.0

Model Description

Hebrew Math Tutor is a supervised fine-tune of Qwen3-4B-Thinking-2507, specifically optimized to:

  • Provide detailed mathematical reasoning in Hebrew with clear step-by-step explanations
  • Maintain mathematical accuracy while adapting to Hebrew language patterns
  • Preserve multilingual capabilities for cross-language mathematical workflows
  • Support educational applications with natural Hebrew mathematical discourse

The model excels at translating complex mathematical concepts into clear, pedagogically sound Hebrew explanations while maintaining the computational precision of its base model.

Intended Use Cases

✅ Primary Applications

  • Educational Technology: Hebrew-language math tutoring systems and learning platforms.
  • Research Tools: Mathematical reasoning research in Hebrew educational contexts.
  • Prototype Development: Building Hebrew-first educational AI applications.
  • Accessibility: Providing advanced math AI assistance to Hebrew-speaking communities.

✅ Secondary Applications

  • Multilingual educational workflows requiring Hebrew mathematical explanations.
  • Cross-cultural mathematics education research.
  • Hebrew mathematical content generation for educational materials.

❌ Not Intended For

  • High-stakes assessments: Medical, legal, or financial decision-making.
  • Unsupervised grading: Certification or evaluation without human verification.
  • Production systems: Critical applications without proper validation and oversight.

Model Details

Specification       Details
Architecture        Decoder-only transformer (causal language model)
Parameters          ~4 billion
Context Length      Inherited from Qwen3-4B-Thinking-2507
Tokenizer           Qwen3-compatible tokenizer with Hebrew support
Training Type       Supervised Fine-Tuning (Hebrew SFT)
Base Model          Qwen3-4B-Thinking-2507
Fine-tuning Focus   Mathematical reasoning in Hebrew

Training Details

Dataset

  • Source: ~10,000 selected problems from OpenMathReasoning.
  • Translation Approach: Automated high-quality translation using internal LLMs.
  • Language Adaptation: Questions and final answers translated to Hebrew; reasoning chains preserved.
  • Mathematical Notation: Equations and formal math notation kept intact.
  • Internal Reasoning: The model's <think>...</think> blocks intentionally remain in English, representing internal reasoning processes (an illustrative record follows below).
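
The exact record schema has not been published; the following hypothetical example illustrates the format described above, with the question and final answer in Hebrew and the <think> reasoning chain left in English:

# Hypothetical training record (the released schema is not published).
# Question and final answer in Hebrew; <think> reasoning kept in English.
example = {
    "messages": [
        {
            "role": "user",
            # "What is the sum of the series 1 + 1/2 + 1/4 + ...?"
            "content": "מהו סכום הסדרה 1 + 1/2 + 1/4 + ...?",
        },
        {
            "role": "assistant",
            "content": (
                "<think>Geometric series with a = 1, r = 1/2, "
                "so the sum is a / (1 - r) = 2.</think>\n"
                # "This is a geometric series with ratio 1/2, so the sum is 2."
                "זוהי סדרה הנדסית עם מנה 1/2, ולכן הסכום הוא \\boxed{2}."
            ),
        },
    ]
}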

Training Configuration

  • Method: Supervised Fine-Tuning (Hebrew SFT)
  • Epochs: 3
  • Learning Rate: 5e-6
  • Warmup Ratio: 0.1 (10% of training steps)
  • Scheduler: Cosine learning rate decay
  • Objective: Maintain mathematical accuracy while adapting output to Hebrew (see the reproduction sketch below)
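
These hyperparameters map directly onto a standard SFT setup. A minimal reproduction sketch, assuming TRL's SFTTrainer (the actual training stack is not published; file and directory names are placeholders):

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder path: a JSONL file of chat-formatted records like the example above.
dataset = load_dataset("json", data_files="hebrew_math_sft.jsonl", split="train")

config = SFTConfig(
    output_dir="hebrew-math-tutor-sft",
    num_train_epochs=3,          # as documented above
    learning_rate=5e-6,          # as documented above
    warmup_ratio=0.1,            # 10% warmup
    lr_scheduler_type="cosine",  # cosine decay
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Thinking-2507",  # base model
    args=config,
    train_dataset=dataset,
)
trainer.train()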

Performance Evaluation

We evaluated Hebrew Math Tutor on three challenging mathematical benchmarks: MATH500, AIME24, and AIME25.

Evaluation Metrics

  • pass@16: Percentage of problems where at least one of 16 generated samples is correct.
  • maj@16: Majority-vote accuracy across 16 samples (both accuracy metrics are sketched after this list).
  • Hebrew Answers: Percentage of responses generated in Hebrew.
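
Concretely, both accuracy metrics are computed from the 16 samples drawn per problem; a minimal sketch, with is_correct standing in for the Math-verify equivalence check described under Evaluation Methodology:

from collections import Counter
from typing import Callable

def pass_at_k(samples: list[str], gold: str,
              is_correct: Callable[[str, str], bool]) -> bool:
    """pass@k: at least one of the k sampled answers is correct."""
    return any(is_correct(s, gold) for s in samples)

def maj_at_k(samples: list[str], gold: str,
             is_correct: Callable[[str, str], bool]) -> bool:
    """maj@k: the most frequent sampled answer is correct."""
    majority, _ = Counter(samples).most_common(1)[0]
    return is_correct(majority, gold)

# Example: 16 sampled final answers for one problem
samples = ["2"] * 10 + ["4"] * 6
exact = lambda a, g: a == g
print(pass_at_k(samples, "2", exact))  # True: at least one sample is "2"
print(maj_at_k(samples, "2", exact))   # True: "2" is the majority answer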

Hebrew Evaluation Results

Dataset   Metric           Base Model   Hebrew Math Tutor   Improvement
MATH500   pass@16          93%          95%                 +2%
MATH500   maj@16           88%          90%                 +2%
MATH500   Hebrew Answers   75%          100%                +25%
AIME24    pass@16          76.7%        80%                 +3.3%
AIME24    maj@16           76.7%        76.7%               No change
AIME24    Hebrew Answers   35.2%        96.7%               +61.5%
AIME25    pass@16          80%          83.3%               +3.3%
AIME25    maj@16           70%          60%                 -10%
AIME25    Hebrew Answers   36%          95.2%               +59.2%

English/Original Language Results

Dataset   Metric    Base Model   Hebrew Math Tutor   Change
MATH500   pass@16   99%          98%                 -1%
MATH500   maj@16    98%          98%                 No change
AIME24    pass@16   93.3%        90%                 -3.3%
AIME24    maj@16    86.7%        86.7%               No change
AIME25    pass@16   83.3%        90%                 +6.7%
AIME25    maj@16    73%          80%                 +7%

Key Findings

🎯 Dramatic Language Improvement: Hebrew answer generation increased by 25 to 61.5 percentage points across all benchmarks, reaching 95-100% Hebrew output.

📈 Maintained Technical Performance: Consistent improvements in pass@16 on Hebrew evaluations while preserving competitive English performance.

🔍 Mixed Majority Vote Results: Strong performance on MATH500, stable results on AIME24, and one notable decrease (-10 points) on AIME25 that warrants further investigation.

✅ Preserved Core Capabilities: The fine-tuning successfully adapted language output without sacrificing fundamental mathematical reasoning abilities.

Usage

Quick Start

from transformers import pipeline

model_id = "Intel/hebrew-math-tutor-v1"
pipe = pipeline("text-generation", model=model_id)

messages = [
    {
        "role": "system",
        "content": """You are a helpful AI assistant specialized in mathematics and problem-solving who can answer math questions with the correct answer.
Answer shortly, not more than 500 tokens, but outline the process step by step.
Answer ONLY in Hebrew!""",
    },
    # "What is the sum of the following series: 1 + 1/2 + 1/4 + 1/8 + ..."
    {"role": "user", "content": "מהו סכום הסדרה הבאה: 1 + 1/2 + 1/4 + 1/8 + ..."},
]

out = pipe(
    messages,
    return_full_text=False,
    max_new_tokens=1024,
    do_sample=True,  # required for temperature/top-p/top-k to take effect
    temperature=0.6,
    top_p=0.95,
    top_k=20,
)
print(out[0]["generated_text"])

Recommended Parameters

  • Temperature: 0.6 (balanced creativity and accuracy)
  • Top-p: 0.95 (diverse but focused sampling)
  • Top-k: 20 (controlled vocabulary selection)
  • Max tokens: 500-1024 (sufficient for detailed explanations)

Best Practices

  • Request explicit structure: Ask for step-by-step reasoning and clearly marked final answers.
  • Use Hebrew formatting cues: Include phrases like "תשובה סופית:" ("final answer:") or request \boxed{} formatting (see the extraction sketch after this list).
  • Specify language: Explicitly request Hebrew-only responses for consistent output.
  • Verify solutions: Always validate mathematical results, especially in educational contexts.
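
When responses follow these cues, the final answer can be extracted programmatically. A minimal sketch, assuming the output conventions above (and simple, non-nested \boxed{} contents):

import re

def extract_final_answer(response: str) -> str | None:
    """Extract the final answer from a model response."""
    # Drop the internal reasoning block (kept in English by design).
    visible = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)

    # Prefer \boxed{...}; this simple pattern assumes non-nested braces.
    boxed = re.search(r"\\boxed\{([^{}]*)\}", visible)
    if boxed:
        return boxed.group(1).strip()

    # Fall back to a trailing "final answer:" line in Hebrew.
    final = re.search(r"תשובה סופית:\s*(.+)", visible)
    return final.group(1).strip() if final else None

print(extract_final_answer("<think>a=1, r=1/2</think> הסכום הוא \\boxed{2}"))  # "2"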

Demo Interface


Example Streamlit interface showing Hebrew Math Tutor providing step-by-step reasoning. The detailed reasoning can be collapsed for cleaner presentation.
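
A collapsible layout like the one shown takes only a few lines. A hypothetical Streamlit sketch, assuming the <think> output format described in this card:

import streamlit as st
from transformers import pipeline

@st.cache_resource  # load the model once per server process
def load_pipe():
    return pipeline("text-generation", model="Intel/hebrew-math-tutor-v1")

question = st.text_input("שאלה במתמטיקה:")  # "A math question:"
if question:
    out = load_pipe()(
        [{"role": "user", "content": question}],
        return_full_text=False,
        max_new_tokens=1024,
    )
    text = out[0]["generated_text"]
    # Split the English reasoning block from the Hebrew answer.
    if "</think>" in text:
        reasoning, _, answer = text.partition("</think>")
        with st.expander("הצג נימוק מפורט"):  # "Show detailed reasoning"
            st.text(reasoning.replace("<think>", "").strip())
    else:
        answer = text
    st.markdown(answer.strip())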

Limitations & Considerations

Technical Limitations

  • Potential errors: May produce incorrect solutions or mathematical hallucinations.
  • Language mixing: Occasional mixing of Hebrew and English or inconsistent number formatting.
  • Training biases: May reflect biases present in the original training datasets.
  • Internal reasoning: <think>...</think> blocks remain in English due to training scope.

Usage Recommendations

  • Human verification required: Always validate outputs before use in educational settings.
  • Not a replacement for educators: Designed as an assistive tool, not a substitute for qualified instruction.
  • Appropriate context: Best suited for educational prototyping and research applications.

Ethical Guidelines

Responsible Deployment

  • Include clear disclaimers about AI-generated content in user-facing applications.
  • Implement human oversight for any educational or assessment applications.
  • Ensure compliance with relevant privacy laws when collecting user data.
  • Provide transparency about model capabilities and limitations.

Educational Impact

  • Designed to enhance, not replace, human mathematical instruction.
  • Intended to increase accessibility of advanced math AI for Hebrew speakers.
  • Should be used as part of comprehensive educational approaches with human guidance.

Technical Details

Evaluation Methodology

  • Correctness verification: Solutions validated using the Math-verify framework (see the sketch after this list).
  • Sampling robustness: Results based on 16 samples per problem for stable estimates.
  • Language detection: Automated classification of response language for the Hebrew Answers metric.
  • Benchmark diversity: Evaluation across competition mathematics (AIME) and curriculum-style problems (MATH500).
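
The correctness and language checks can be approximated as follows; a sketch assuming the open-source math-verify package and a simple Unicode-range heuristic (the exact evaluation harness is not published):

# Sketch only: the exact evaluation harness is not published.
from math_verify import parse, verify  # pip install math-verify

def is_correct(model_answer: str, gold_answer: str) -> bool:
    """Check mathematical equivalence rather than string equality."""
    return verify(parse(gold_answer), parse(model_answer))

def is_hebrew(text: str, threshold: float = 0.5) -> bool:
    """Count a response as Hebrew if most alphabetic characters
    fall in the Hebrew Unicode block (U+0590-U+05FF)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    hebrew = sum(1 for c in letters if "\u0590" <= c <= "\u05FF")
    return hebrew / len(letters) >= threshold

print(is_correct("$2$", "$\\frac{4}{2}$"))  # True: equivalent values
print(is_hebrew("הסכום הוא 2"))             # True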

Reproducibility

  • All evaluation protocols follow standard mathematical reasoning assessment practices.
  • Sampling parameters and evaluation metrics clearly documented.
  • Training configuration and hyperparameters provided for reproduction.

Attribution & Licensing

Citation

If you use Hebrew Math Tutor in your research or applications, please cite:

@misc{hebrew-math-tutor-v1,
  title={Hebrew Math Tutor: A Hebrew-focused Mathematical Reasoning Model},
  author={Intel Labs},
  year={2025},
  url={https://huggingface.co/Intel/hebrew-math-tutor-v1},
  note={Fine-tuned from Qwen3-4B-Thinking-2507}
}

Changelog

  • v1.0 β€” Initial public release with Hebrew mathematical reasoning capabilities.

Hebrew Math Tutor represents a step forward in making advanced mathematical AI accessible across languages. We encourage responsible use and welcome community feedback to improve multilingual mathematical reasoning capabilities.
