---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Intel/hebrew-math-tutor-v1/blob/main/LICENSE
pipeline_tag: text-generation
language:
- he
- en
tags:
- mathematics
- education
- hebrew
- reasoning
- math
- tutoring
base_model:
- Qwen/Qwen3-4B-Thinking-2507
---
# Hebrew Math Tutor
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/62d93cd728f9c86a4031562e/YxvxPWRpINziJaAftl4XE.png" width="600"/>
</p>
**Hebrew Math Tutor** is a specialized mathematical reasoning model that provides step-by-step solutions to math problems in Hebrew. Built on Qwen3-4B-Thinking-2507, this model bridges the gap between advanced AI mathematical capabilities and Hebrew-language education.
- 🎯 **Model ID**: `Intel/hebrew-math-tutor-v1`
- 🏗️ **Base Model**: [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
- 🏛️ **Architecture**: Decoder-only causal language model (~4B parameters)
- 🗣️ **Primary Language**: Hebrew (retains multilingual capabilities)
- 📄 **License**: Apache-2.0
## Model Description
Hebrew Math Tutor is a supervised fine-tune of Qwen3-4B-Thinking-2507, specifically optimized to:
- **Provide detailed mathematical reasoning in Hebrew** with clear step-by-step explanations
- **Maintain mathematical accuracy** while adapting to Hebrew language patterns
- **Preserve multilingual capabilities** for cross-language mathematical workflows
- **Support educational applications** with natural Hebrew mathematical discourse
The model excels at translating complex mathematical concepts into clear, pedagogically sound Hebrew explanations while maintaining the computational precision of its base model.
## Intended Use Cases
### ✅ **Primary Applications**
- **Educational Technology**: Hebrew-language math tutoring systems and learning platforms.
- **Research Tools**: Mathematical reasoning research in Hebrew educational contexts.
- **Prototype Development**: Building Hebrew-first educational AI applications.
- **Accessibility**: Providing advanced math AI assistance to Hebrew-speaking communities.
### ✅ **Secondary Applications**
- Multilingual educational workflows requiring Hebrew mathematical explanations.
- Cross-cultural mathematics education research.
- Hebrew mathematical content generation for educational materials.
### ❌ **Not Intended For**
- **High-stakes assessments**: Medical, legal, or financial decision-making.
- **Unsupervised grading**: Certification or evaluation without human verification.
- **Production systems**: Critical applications without proper validation and oversight.
## Model Details
| **Specification** | **Details** |
|-----------------------|--------------------------------------------------|
| **Architecture** | Decoder-only transformer (causal language model) |
| **Parameters** | ~4 billion |
| **Context Length** | Inherited from Qwen3-4B-Thinking-2507 |
| **Tokenizer** | Qwen3-compatible tokenizer with Hebrew support |
| **Training Type** | Supervised Fine-Tuning (Hebrew SFT) |
| **Base Model** | Qwen3-4B-Thinking-2507 |
| **Fine-tuning Focus** | Mathematical reasoning in Hebrew |
## Training Details
### **Dataset**
- **Source**: ~10,000 selected problems from [OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning).
- **Translation Approach**: Automated high-quality translation using internal LLMs.
- **Language Adaptation**: Questions and final answers translated to Hebrew; reasoning chains preserved.
- **Mathematical Notation**: Equations and formal math notation kept intact.
- **Internal Reasoning**: Model's `<think>...</think>` blocks intentionally remain in English (representing internal reasoning processes).
### **Training Configuration**
- **Method**: Supervised Fine-Tuning (Hebrew SFT)
- **Epochs**: 3
- **Learning Rate**: 5e-6
- **Warmup**: 0.1
- **Scheduler**: Cosine learning rate decay
- **Objective**: Maintain mathematical accuracy while adapting output to Hebrew
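The hyperparameters above map directly onto a standard SFT trainer configuration. As an illustrative sketch (the actual training stack for this model is not published, so the key names below follow common trainer-argument conventions rather than the original code):

```python
# Hyperparameters from the card, expressed as a trainer-style config dict.
# Illustrative only: key names follow common SFT trainer conventions.
sft_config = {
    "num_train_epochs": 3,          # Epochs: 3
    "learning_rate": 5e-6,          # Learning Rate: 5e-6
    "warmup_ratio": 0.1,            # Warmup: 0.1
    "lr_scheduler_type": "cosine",  # Cosine learning rate decay
}
print(sft_config)
```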
## Performance Evaluation
We evaluated Hebrew Math Tutor on three challenging mathematical benchmarks: **MATH500**, **AIME24**, and **AIME25**.
### **Evaluation Metrics**
- **pass@16**: Percentage of problems where at least one of 16 generated samples is correct.
- **maj@16**: Majority-vote accuracy across 16 samples.
- **Hebrew Answers**: Percentage of responses generated in Hebrew.
### **Hebrew Evaluation Results**
| Dataset | Metric | Base Model | Hebrew Math Tutor | Improvement |
|-------------|----------------|------------|-------------------|-------------|
| **MATH500** | pass@16 | 93% | **95%** | +2% |
| | maj@16 | 88% | **90%** | +2% |
| | Hebrew Answers | 75% | **100%** | +25% |
| **AIME24** | pass@16 | 76.7% | **80%** | +3.3% |
| | maj@16 | 76.7% | **76.7%** | No change |
| | Hebrew Answers | 35.2% | **96.7%** | +61.5% |
| **AIME25** | pass@16 | 80% | **83.3%** | +3.3% |
| | maj@16 | 70% | **60%** | -10% |
| | Hebrew Answers | 36% | **95.2%** | +59.2% |
### **English/Original Language Results**
| Dataset | Metric | Base Model | Hebrew Math Tutor | Change |
|-------------|---------|------------|-------------------|-----------|
| **MATH500** | pass@16 | 99% | **98%** | -1% |
| | maj@16 | 98% | **98%** | No change |
| **AIME24** | pass@16 | 93.3% | **90%** | -3.3% |
| | maj@16 | 86.7% | **86.7%** | No change |
| **AIME25** | pass@16 | 83.3% | **90%** | +6.7% |
| | maj@16 | 73% | **80%** | +7% |
### **Key Findings**
🎯 **Dramatic Language Improvement**: Hebrew answer generation increased by 25 to 61.5 percentage points across all benchmarks, reaching 95-100% Hebrew output.
📈 **Maintained Technical Performance**: Consistent improvements in pass@16 on Hebrew evaluations while preserving competitive English performance.
🔍 **Mixed Majority-Vote Results**: Strong performance on MATH500, stable on AIME24, with one notable decrease on AIME25 requiring further investigation.
✅ **Preserved Core Capabilities**: The fine-tuning successfully adapted language output without sacrificing fundamental mathematical reasoning abilities.
## Usage
### **Quick Start**
```python
from transformers import pipeline

model = "Intel/hebrew-math-tutor-v1"
pipe = pipeline("text-generation", model=model)

messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful AI assistant specialized in mathematics and "
            "problem-solving who can answer math questions with the correct answer.\n"
            "Answer shortly, not more than 500 tokens, but outline the process "
            "step by step.\n"
            "Answer ONLY in Hebrew!"
        ),
    },
    {"role": "user", "content": "מהו סכום הסדרה הבאה: 1 + 1/2 + 1/4 + 1/8 + ..."},
]

out = pipe(
    messages,
    return_full_text=False,
    max_new_tokens=1024,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
)
print(out[0]["generated_text"])
```
### **Recommended Parameters**
- **Temperature**: 0.6 (balanced creativity and accuracy)
- **Top-p**: 0.95 (diverse but focused sampling)
- **Top-k**: 20 (controlled vocabulary selection)
- **Max tokens**: 500-1024 (sufficient for detailed explanations)
### **Best Practices**
- **Request explicit structure**: Ask for step-by-step reasoning and clearly marked final answers.
- **Use Hebrew formatting cues**: Include phrases like "תשובה סופית:" or request `\boxed{}` formatting.
- **Specify language**: Explicitly request Hebrew-only responses for consistent output.
- **Verify solutions**: Always validate mathematical results, especially in educational contexts.
## Demo Interface
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/62d93cd728f9c86a4031562e/tbOIu47QLmja_z-Ce20a2.png" width="600"/>
<br>
<em>Example Streamlit interface showing Hebrew Math Tutor providing step-by-step reasoning. The detailed reasoning can be collapsed for cleaner presentation.</em>
</p>
## Limitations & Considerations
### **Technical Limitations**
- **Potential errors**: May produce incorrect solutions or mathematical hallucinations.
- **Language mixing**: Occasional mixing of Hebrew and English or inconsistent number formatting.
- **Training biases**: May reflect biases present in the original training datasets.
- **Internal reasoning**: `<think>...</think>` blocks remain in English due to training scope.
### **Usage Recommendations**
- **Human verification required**: Always validate outputs before use in educational settings.
- **Not a replacement for educators**: Designed as an assistive tool, not a substitute for qualified instruction.
- **Appropriate context**: Best suited for educational prototyping and research applications.
## Ethical Guidelines
### **Responsible Deployment**
- Include clear disclaimers about AI-generated content in user-facing applications.
- Implement human oversight for any educational or assessment applications.
- Ensure compliance with relevant privacy laws when collecting user data.
- Provide transparency about model capabilities and limitations.
### **Educational Impact**
- Designed to enhance, not replace, human mathematical instruction.
- Intended to increase accessibility of advanced math AI for Hebrew speakers.
- Should be used as part of comprehensive educational approaches with human guidance.
## Technical Details
### **Evaluation Methodology**
- **Correctness verification**: Solutions validated using Math-verify framework.
- **Sampling robustness**: Results based on 16 samples per problem to reduce variance in reported metrics.
- **Language detection**: Automated classification of response language for Hebrew Answers metric.
- **Benchmark diversity**: Evaluation across competition mathematics (AIME) and curriculum problems (MATH500).
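The exact language classifier behind the Hebrew Answers metric is not published; a simple character-range heuristic like the following (an assumption on our part, not the actual metric code) captures the idea:

```python
def is_hebrew(text: str, threshold: float = 0.5) -> bool:
    """Classify a response as Hebrew if most alphabetic characters
    fall in the Hebrew Unicode block (U+0590-U+05FF)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    hebrew = sum(1 for c in letters if "\u0590" <= c <= "\u05ff")
    return hebrew / len(letters) >= threshold

print(is_hebrew("הסכום של הטור הוא 2"))         # True
print(is_hebrew("The sum of the series is 2"))  # False
```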
### **Reproducibility**
- All evaluation protocols follow standard mathematical reasoning assessment practices.
- Sampling parameters and evaluation metrics clearly documented.
- Training configuration and hyperparameters provided for reproduction.
## Attribution & Licensing
- **Model License**: [Apache-2.0](https://huggingface.co/Intel/hebrew-math-tutor-v1/blob/main/LICENSE)
- **Base Model**: [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) (Alibaba)
- **Training Dataset**: [OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning) (NVIDIA)
- **Development**: Intel Labs
## Citation
If you use Hebrew Math Tutor in your research or applications, please cite:
```bibtex
@misc{hebrew-math-tutor-v1,
title={Hebrew Math Tutor: A Hebrew-focused Mathematical Reasoning Model},
author={Intel AI},
year={2025},
url={https://huggingface.co/Intel/hebrew-math-tutor-v1},
note={Fine-tuned from Qwen3-4B-Thinking-2507}
}
```
## Community & Support
- **Blog Post**: More details in the [Hebrew Math Tutor blog post](https://huggingface.co/blog/danf/hebrew-math-tutor).
- **Model Repository**: [https://huggingface.co/Intel/hebrew-math-tutor-v1](https://huggingface.co/Intel/hebrew-math-tutor-v1)
- **Issues & Feedback**: Use the Hugging Face repository issues for bug reports and feature requests.
- **Community Discussions**: Join conversations in the repository discussions tab.
## Changelog
- **v1.0** — Initial public release with Hebrew mathematical reasoning capabilities.
---
*Hebrew Math Tutor represents a step forward in making advanced mathematical AI accessible across languages. We encourage responsible use and welcome community feedback to improve multilingual mathematical reasoning capabilities.*