E-Model-Reasoner-Math-V1
This is a fine-tuned version of Qwen3-0.6B specialized for mathematical reasoning tasks, trained on the NVIDIA OpenMathReasoning dataset. The model uses a "thinking" mechanism that surfaces its step-by-step reasoning before producing the final answer.
Model Details
Model Description
E-Model-Reasoner-Math-V1 is a mathematical reasoning model built on the Qwen/Qwen3-0.6B architecture. It has been fine-tuned specifically for mathematical problem solving, with an integrated thinking process that lets users see the model's reasoning steps before it arrives at the final answer. This transparency makes it particularly valuable for educational applications and mathematical tutoring.
- Developed by: ErenalpCet
- Model type: Causal Language Model (Fine-tuned for Mathematical Reasoning)
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: Qwen/Qwen3-0.6B
- Model ID: ErenalpCet/E-Model-Reasoner-Math-V1
Model Sources
- Repository: https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1
- Base Model: https://huggingface.co/Qwen/Qwen3-0.6B
- Training Dataset: https://huggingface.co/datasets/nvidia/OpenMathReasoning
Uses
Direct Use
E-Model-Reasoner-Math-V1 is designed for direct mathematical problem-solving applications. It excels at:
- Solving algebraic equations and inequalities
- Arithmetic calculations with detailed explanations
- Mathematical word problems
- Step-by-step problem breakdown and reasoning
- Educational math assistance with transparent thinking process
The model's thinking mechanism allows users to understand not just the answer, but the complete reasoning process, making it ideal for learning environments.
Downstream Use
This model can be integrated into various applications, including the following (a minimal demo sketch follows the list):
- Educational platforms and tutoring systems
- Math homework assistance tools
- Interactive learning applications
- Mathematical reasoning benchmarks
- Research tools for mathematical problem-solving analysis
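For example, an interactive demo can be built around the transformers pipeline API. The sketch below is illustrative only: gradio is an assumed extra dependency, not something this model card requires, and the decoding settings mirror the quickstart further down.
# Hypothetical minimal demo: wrap the model in a small web UI with gradio
import gradio as gr
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="ErenalpCet/E-Model-Reasoner-Math-V1",
    torch_dtype="auto",
    device_map="auto",  # requires the accelerate package
)

def solve(problem: str) -> str:
    messages = [{"role": "user", "content": problem}]
    out = pipe(messages, max_new_tokens=2048, do_sample=True, temperature=0.6, top_p=0.95)
    # The pipeline returns the whole conversation; the last message is the model's reply
    return out[0]["generated_text"][-1]["content"]

gr.Interface(fn=solve, inputs="text", outputs="text", title="E-Model-Reasoner-Math-V1").launch()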
Out-of-Scope Use
This model is specifically optimized for mathematical reasoning and may not perform optimally for:
- Non-mathematical domain questions
- Creative writing or storytelling
- Code generation outside of mathematical contexts
Bias, Risks, and Limitations
The model inherits potential biases from both its base model (Qwen3-0.6B) and the OpenMathReasoning training dataset. Key considerations include:
- Mathematical problem types may be skewed toward certain domains represented in the training data
- Performance may vary across different mathematical complexity levels
- Cultural or linguistic biases may affect word problem interpretation
- The model should not be used as the sole source for critical mathematical calculations
Recommendations
Users should:
- Verify important mathematical results independently (for example with a computer algebra system; sketched below)
- Use the model as an educational aid rather than a definitive mathematical authority
- Be aware that the model's reasoning process, while helpful, may not always reflect optimal problem-solving approaches
- Test the model's performance on their specific use cases before deployment
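One lightweight way to follow the first recommendation is to re-check the model's final answer with a computer algebra system. A minimal sketch using SymPy (an illustrative dependency; the claimed answer here is hypothetical):
# Sketch: verify a claimed solution set for x^2 - 5x + 6 = 0 with SymPy
from sympy import Eq, solve, symbols

x = symbols("x")
claimed = {2, 3}  # hypothetical answer extracted from the model's output
verified = set(solve(Eq(x**2 - 5*x + 6, 0), x))
print("Matches CAS result:", claimed == verified)  # True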
How to Get Started with the Model
Use the code below to get started with E-Model-Reasoner-Math-V1:
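# Optional: pin all Hugging Face / Torch caches to one drive. The Windows-style
# "E:\" defaults below are from the author's setup; adjust or remove for yours.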
import os
os.environ["HF_HOME"] = os.environ.get("HF_HOME", "E:\\hf_home")
os.environ["TRANSFORMERS_CACHE"] = os.environ.get("TRANSFORMERS_CACHE", "E:\\cache\\transformers")
os.environ["HF_HUB_CACHE"] = os.environ.get("HF_HUB_CACHE", "E:\\cache\\hub")
os.environ["HF_DATASETS_CACHE"] = os.environ.get("HF_DATASETS_CACHE", "E:\\cache\\datasets")
os.environ["HF_METRICS_CACHE"] = os.environ.get("HF_METRICS_CACHE", "E:\\cache\\metrics")
os.environ["HF_MODULES_CACHE"] = os.environ.get("HF_MODULES_CACHE", "E:\\cache\\modules")
os.environ["TOKENIZERS_CACHE"] = os.environ.get("TOKENIZERS_CACHE", "E:\\cache\\tokenizers")
os.environ["TORCH_EXTENSIONS_DIR"] = os.environ.get("TORCH_EXTENSIONS_DIR", "E:\\cache\\torch_extensions")
cache_dirs = [
os.environ["HF_HOME"], os.environ["TRANSFORMERS_CACHE"], os.environ["HF_HUB_CACHE"],
os.environ["HF_DATASETS_CACHE"], os.environ["HF_METRICS_CACHE"], os.environ["HF_MODULES_CACHE"],
os.environ["TOKENIZERS_CACHE"], os.environ["TORCH_EXTENSIONS_DIR"],
]
for d in cache_dirs:
os.makedirs(d, exist_ok=True)
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, TextStreamer
import gc
MODEL_DIR = "ErenalpCet/E-Model-Reasoner-Math-V1"  # Hub ID; point at a local checkpoint directory instead if you have one
USE_SPECIFIC_GPU = True
GPU_ID = "0"
def setup_device():
if USE_SPECIFIC_GPU and torch.cuda.is_available():
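        # Note: CUDA_VISIBLE_DEVICES only takes effect if set before the process
        # first touches CUDA; set it at the very top of the script to reliably pin a GPU.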
os.environ["CUDA_VISIBLE_DEVICES"] = GPU_ID
device = torch.device(f"cuda:{GPU_ID}")
print(f"Using specific GPU: {torch.cuda.get_device_name(device)}")
print(f"Available VRAM: {torch.cuda.get_device_properties(device).total_memory / 1e9:.2f} GB")
elif torch.cuda.is_available():
device = torch.device("cuda")
print(f"Using default CUDA device: {torch.cuda.get_device_name(0)}")
print(f"Available VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
device = torch.device("cpu")
print("CUDA not available. Using CPU.")
return device
def load_model_and_tokenizer(model_dir, device):
best_checkpoint = model_dir
    print(f"\nLoading model and tokenizer from: {best_checkpoint}...")
try:
tokenizer = AutoTokenizer.from_pretrained(best_checkpoint, trust_remote_code=True)
if tokenizer.pad_token is None:
print("Warning: Tokenizer does not have a pad token. Setting pad_token = eos_token.")
tokenizer.pad_token = tokenizer.eos_token
print(f"Tokenizer loaded. Pad token: '{tokenizer.pad_token}' (ID: {tokenizer.pad_token_id})")
        # Load the config separately so the full model isn't loaded twice just to flip a flag
        model_config = AutoConfig.from_pretrained(best_checkpoint, trust_remote_code=True)
        model_config.use_cache = True
model = AutoModelForCausalLM.from_pretrained(
best_checkpoint,
torch_dtype=torch.bfloat16,
trust_remote_code=True,
config=model_config,
device_map="auto"
)
if model.config.pad_token_id is None:
model.config.pad_token_id = tokenizer.pad_token_id
print("Model loaded successfully.")
print(f"Model is on device: {model.device}")
return model, tokenizer
except Exception as e:
print(f"Error loading model or tokenizer: {e}")
raise
def generate_response(model, tokenizer, user_prompt, device):
try:
messages = [
{"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True,
enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
        # Initialize streamer; special tokens are kept so the <think>...</think> markers show in the stream
        streamer = TextStreamer(tokenizer, skip_special_tokens=False)
# Generate with streamer
print("\nStreaming response:")
print("-" * 50)
        generated_ids = model.generate(
            **model_inputs,
            do_sample=True,      # required for temperature/top_p/top_k to take effect
            temperature=0.6,     # recommended sampling settings for Qwen3 thinking mode
            top_p=0.95,
            top_k=20,
            min_p=0,
            max_new_tokens=32768,
            streamer=streamer
        )
# Process the full output
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
        # Split the output into thinking content and final answer at the </think> token
        # (151668 is the ID of </think> in the Qwen3 tokenizer)
        think_token_id = 151668
        try:
            # Search from the end so only the last </think> is used as the split point
            index = len(output_ids) - output_ids[::-1].index(think_token_id)
        except ValueError:
            index = 0  # no </think> found; treat the whole output as the final answer
        print("\n\nParsing response...")
        thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
        content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
return thinking_content, content
except Exception as e:
print(f"Error during response generation: {e}")
return "Error in generation.", "Sorry, I encountered an error while generating the response."
def chat_loop(model, tokenizer, device):
    print("\n--- Chatbot CLI Started ---")
    print("Type your mathematical problem or 'quit'/'exit'/'q' to end.")
    while True:
        user_input = input("\nYou: ")
if user_input.lower() in ["quit", "exit", "q"]:
print("Exiting chatbot. Goodbye!")
break
if not user_input.strip():
print("Please enter a problem.")
continue
print("Bot is thinking...")
thinking_response, final_response = generate_response(model, tokenizer, user_input, device)
if thinking_response:
            print("\nThinking content:")
            print("-" * 50)
            print(thinking_response)
            print("-" * 50)
        print("\nBot:")
print("-" * 50)
print(final_response)
print("-" * 50)
if device.type == "cuda":
gc.collect()
torch.cuda.empty_cache()
if __name__ == "__main__":
selected_device = None
model_instance = None
tokenizer_instance = None
try:
selected_device = setup_device()
        # MODEL_DIR may be a Hub ID or a local path; from_pretrained handles both,
        # and loading errors are reported by the except blocks below
        model_instance, tokenizer_instance = load_model_and_tokenizer(MODEL_DIR, selected_device)
        chat_loop(model_instance, tokenizer_instance, selected_device)
except RuntimeError as e:
print(f"A runtime error occurred: {e}")
if "CUDA out of memory" in str(e):
            print("Try closing other GPU processes or running on CPU instead.")
            print("Ensure your GPU has enough VRAM for the model weights plus KV cache (a few GB for Qwen3-0.6B).")
except Exception as e:
print(f"An unexpected error occurred: {e}")
finally:
        print("\nCleaning up...")
del model_instance
del tokenizer_instance
gc.collect()
if torch.cuda.is_available():
torch.cuda.empty_cache()
print("Cleanup complete. Exiting application.")
Training Details
Training Data
The model was fine-tuned on the NVIDIA OpenMathReasoning dataset, which contains a comprehensive collection of mathematical problems paired with detailed step-by-step solutions. This dataset covers various mathematical domains including algebra, arithmetic, geometry, and word problems.
Training Procedure
The fine-tuning process enhanced the base Qwen3-0.6B model's mathematical reasoning capabilities while preserving its general language understanding abilities.
Preprocessing
The training data was preprocessed to incorporate the thinking mechanism, allowing the model to generate internal reasoning steps before providing final answers.
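The exact preprocessing script is not published here. The sketch below shows one plausible way to render a dataset sample into the chat format with the reasoning wrapped in <think> tags; the field names and helper are hypothetical.
# Hypothetical formatting of one training sample (illustrative only)
def format_sample(problem: str, reasoning: str, answer: str):
    return [
        {"role": "user", "content": problem},
        {"role": "assistant", "content": f"<think>\n{reasoning}\n</think>\n\n{answer}"},
    ]

messages = format_sample(
    "What is 17 * 24?",
    "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
    "408",
)
# `messages` can then be rendered with tokenizer.apply_chat_template(...) for training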
Training Hyperparameters
- Training regime: bfloat16 mixed precision
- Base architecture: Qwen3-0.6B transformer
- Optimization: Fine-tuning with mathematical reasoning focus
- Special features: Thinking token integration (token ID: 151668; verified below)
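The token ID can be confirmed directly against the published tokenizer:
# Sanity-check that </think> maps to token ID 151668
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ErenalpCet/E-Model-Reasoner-Math-V1", trust_remote_code=True)
print(tok.convert_tokens_to_ids("</think>"))  # expected: 151668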
Speeds, Sizes, Times
- Model parameters: ~600M (0.6B)
- Inference memory: 2–4 GB VRAM recommended for optimal performance
- Processing: Supports streaming generation for real-time responses (example below)
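The quickstart above streams to stdout with TextStreamer; for applications that need to consume tokens programmatically, TextIteratorStreamer is the usual alternative. A self-contained sketch:
# Consume tokens as they are produced instead of printing them directly
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "ErenalpCet/E-Model-Reasoner-Math-V1"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True)

prompt = tok.apply_chat_template([{"role": "user", "content": "What is 12 * 9?"}],
                                 tokenize=False, add_generation_prompt=True)
inputs = tok([prompt], return_tensors="pt").to(model.device)
streamer = TextIteratorStreamer(tok, skip_prompt=True, skip_special_tokens=True)

thread = Thread(target=model.generate, kwargs=dict(**inputs, max_new_tokens=512, streamer=streamer))
thread.start()
for chunk in streamer:  # yields decoded text pieces as generation proceeds
    print(chunk, end="", flush=True)
thread.join()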
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model should be evaluated on standard mathematical reasoning benchmarks including GSM8K, MATH dataset, and other mathematical problem-solving evaluation sets.
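No benchmark scores are published for this model yet. As a starting point, GSM8K test problems can be loaded with the datasets library and compared against the model's final answers (GSM8K stores the reference answer after a "####" marker):
# Load a few GSM8K test items and extract their reference answers
from datasets import load_dataset

gsm8k = load_dataset("gsm8k", "main", split="test")
for example in gsm8k.select(range(3)):
    reference = example["answer"].split("####")[-1].strip()
    print(example["question"][:80], "->", reference)
    # compare `reference` with the final answer the model returns for this question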
Factors
Evaluation considers:
- Problem complexity levels (elementary to advanced)
- Mathematical domain coverage (algebra, arithmetic, geometry, etc.)
- Reasoning clarity and correctness
- Step-by-step solution quality
Model Examination
The model's thinking mechanism provides interpretability by exposing the reasoning process; a string-level parsing helper is sketched after this list. This feature allows users to:
- Understand the model's problem-solving approach
- Identify potential errors in reasoning
- Learn from the step-by-step methodology
- Verify the logical flow of solutions
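When working with decoded text rather than token IDs, the thinking/answer separation can be done at the string level. A minimal sketch:
# Split a decoded response into reasoning and final answer at the </think> marker
def split_thinking(raw_text: str):
    if "</think>" in raw_text:
        thinking, answer = raw_text.split("</think>", 1)
        return thinking.replace("<think>", "").strip(), answer.strip()
    return "", raw_text.strip()

thinking, answer = split_thinking("<think>2 + 2 = 4.</think>\nThe answer is 4.")
print(thinking)  # 2 + 2 = 4.
print(answer)    # The answer is 4.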
Technical Specifications
Model Architecture and Objective
- Architecture: Transformer-based causal language model
- Parameters: ~600 million
- Precision: bfloat16 for optimal performance/memory balance
- Context Length: Extended context support up to 32,768 tokens
- Special Tokens: Custom thinking token mechanism for reasoning transparency
Compute Infrastructure
Hardware
- Minimum Requirements: 4GB VRAM for basic inference (a rough footprint estimate follows below)
- Recommended: 8GB+ VRAM for optimal performance
- CPU Fallback: Supported but significantly slower
- Multi-GPU: Automatic device mapping supported
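A rough way to check the weight-only memory footprint (KV cache and activations add overhead on top):
# Estimate weight memory: parameter count x 2 bytes per bfloat16 parameter
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ErenalpCet/E-Model-Reasoner-Math-V1", torch_dtype=torch.bfloat16)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters, ~{n_params * 2 / 1e9:.2f} GB of weights in bfloat16")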
Software
- Framework: PyTorch with Transformers library
- Python Version: 3.9+
- Key Dependencies:
  - torch >= 2.0
  - transformers >= 4.51.0 (first release with Qwen3 support)
  - CUDA toolkit (for GPU acceleration)
Citation
BibTeX
@misc{e-model-reasoner-math-v1,
  title={E-Model-Reasoner-Math-V1: A Fine-tuned Mathematical Reasoning Model},
  author={ErenalpCet},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1}},
  note={Fine-tuned from Qwen3-0.6B on the OpenMathReasoning dataset}
}
APA
ErenalpCet. (2025). E-Model-Reasoner-Math-V1: A Fine-tuned Mathematical Reasoning Model. Hugging Face. https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1
Glossary
- Thinking Token: Special token (ID: 151668) that separates the model's internal reasoning from its final answer
- Mathematical Reasoning: The process of logical thinking applied to solve mathematical problems
- Fine-tuning: Process of adapting a pre-trained model to a specific task or domain
- bfloat16: Brain floating-point format that provides memory efficiency while maintaining training stability
More Information
For additional technical details, usage examples, and community discussions, visit the model repository at https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1.
For questions about mathematical reasoning capabilities or specific use cases, please refer to the model's discussion section or create an issue in the repository.
Model Card Authors
ErenalpCet
Model Card Contact
For questions, feedback, or collaboration opportunities regarding E-Model-Reasoner-Math-V1, please contact through the Hugging Face platform or the model repository's discussion section.