
E-Model-Reasoner-Math-V1

This is a fine-tuned version of Qwen3-0.6B specialized for mathematical reasoning tasks, trained on the NVIDIA OpenMathReasoning dataset. The model incorporates advanced reasoning capabilities with a "thinking" mechanism to provide step-by-step mathematical problem solving.

Model Details

Model Description

E-Model-Reasoner-Math-V1 is a mathematical reasoning model built on the Qwen/Qwen3-0.6B architecture. It has been fine-tuned specifically for mathematical problem-solving and features an integrated thinking process that lets users see the model's reasoning steps before it arrives at the final answer. This transparency makes it particularly valuable for educational applications and mathematical tutoring.

  • Developed by: ErenalpCet
  • Model type: Causal Language Model (Fine-tuned for Mathematical Reasoning)
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: Qwen/Qwen3-0.6B
  • Model ID: ErenalpCet/E-Model-Reasoner-Math-V1

Model Sources

  • Repository: https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1

Uses

Direct Use

E-Model-Reasoner-Math-V1 is designed for direct mathematical problem-solving applications. It excels at:

  • Solving algebraic equations and inequalities
  • Arithmetic calculations with detailed explanations
  • Mathematical word problems
  • Step-by-step problem breakdown and reasoning
  • Educational math assistance with transparent thinking process

The model's thinking mechanism allows users to understand not just the answer, but the complete reasoning process, making it ideal for learning environments.
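
For a quick look at this behavior, here is a minimal, hedged sketch (the repo id is taken from this card; the question and generation settings are purely illustrative, and the full script appears under "How to Get Started"). Asked a question, the model first emits its working inside a <think>...</think> block and then states the final answer:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch: load the model from the Hub and ask one question.
model_id = "ErenalpCet/E-Model-Reasoner-Math-V1"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Solve 3x + 7 = 22 for x."}],
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # expose the <think>...</think> reasoning block
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))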

Downstream Use

This model can be integrated into various applications including:

  • Educational platforms and tutoring systems
  • Math homework assistance tools
  • Interactive learning applications
  • Mathematical reasoning benchmarks
  • Research tools for mathematical problem-solving analysis

Out-of-Scope Use

This model is specifically optimized for mathematical reasoning and may not perform optimally for:

  • Non-mathematical domain questions
  • Creative writing or storytelling
  • Code generation outside of mathematical contexts

Bias, Risks, and Limitations

The model inherits potential biases from both its base model (Qwen3-0.6B) and the OpenMathReasoning training dataset. Key considerations include:

  • Mathematical problem types may be skewed toward certain domains represented in the training data
  • Performance may vary across different mathematical complexity levels
  • Cultural or linguistic biases may affect word problem interpretation
  • The model should not be used as the sole source for critical mathematical calculations

Recommendations

Users should:

  • Verify important mathematical results independently
  • Use the model as an educational aid rather than a definitive mathematical authority
  • Be aware that the model's reasoning process, while helpful, may not always reflect optimal problem-solving approaches
  • Test the model's performance on their specific use cases before deployment

How to Get Started with the Model

Use the code below to get started with E-Model-Reasoner-Math-V1:

import os

# Optional: redirect Hugging Face and Torch caches; the "E:\..." defaults below are the author's local paths.
os.environ["HF_HOME"]               = os.environ.get("HF_HOME", "E:\\hf_home")
os.environ["TRANSFORMERS_CACHE"]    = os.environ.get("TRANSFORMERS_CACHE", "E:\\cache\\transformers")
os.environ["HF_HUB_CACHE"]          = os.environ.get("HF_HUB_CACHE", "E:\\cache\\hub")
os.environ["HF_DATASETS_CACHE"]     = os.environ.get("HF_DATASETS_CACHE", "E:\\cache\\datasets")
os.environ["HF_METRICS_CACHE"]      = os.environ.get("HF_METRICS_CACHE", "E:\\cache\\metrics")
os.environ["HF_MODULES_CACHE"]      = os.environ.get("HF_MODULES_CACHE", "E:\\cache\\modules")
os.environ["TOKENIZERS_CACHE"]      = os.environ.get("TOKENIZERS_CACHE", "E:\\cache\\tokenizers")
os.environ["TORCH_EXTENSIONS_DIR"]  = os.environ.get("TORCH_EXTENSIONS_DIR", "E:\\cache\\torch_extensions")
cache_dirs = [
    os.environ["HF_HOME"], os.environ["TRANSFORMERS_CACHE"], os.environ["HF_HUB_CACHE"],
    os.environ["HF_DATASETS_CACHE"], os.environ["HF_METRICS_CACHE"], os.environ["HF_MODULES_CACHE"],
    os.environ["TOKENIZERS_CACHE"], os.environ["TORCH_EXTENSIONS_DIR"],
]
for d in cache_dirs:
    os.makedirs(d, exist_ok=True)

import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, TextStreamer
import gc

# Local checkpoint directory; a Hub repo id such as "ErenalpCet/E-Model-Reasoner-Math-V1" also works here.
MODEL_DIR = "E:\\qwen3-math-reasoning-final"
USE_SPECIFIC_GPU = True
GPU_ID = "0"

def setup_device():
    if USE_SPECIFIC_GPU and torch.cuda.is_available():
        # Note: CUDA_VISIBLE_DEVICES only takes effect if set before CUDA is first
        # initialized; the explicit torch.device selection below is what matters here.
        os.environ["CUDA_VISIBLE_DEVICES"] = GPU_ID
        device = torch.device(f"cuda:{GPU_ID}")
        print(f"Using specific GPU: {torch.cuda.get_device_name(device)}")
        print(f"Available VRAM: {torch.cuda.get_device_properties(device).total_memory / 1e9:.2f} GB")
    elif torch.cuda.is_available():
        device = torch.device("cuda")
        print(f"Using default CUDA device: {torch.cuda.get_device_name(0)}")
        print(f"Available VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    else:
        device = torch.device("cpu")
        print("CUDA not available. Using CPU.")
    return device

def load_model_and_tokenizer(model_dir, device):
    best_checkpoint = model_dir
    print(f"\nLoading model and tokenizer from: {best_checkpoint}...")
    try:
        tokenizer = AutoTokenizer.from_pretrained(best_checkpoint, trust_remote_code=True)
        if tokenizer.pad_token is None:
            print("Warning: Tokenizer does not have a pad token. Setting pad_token = eos_token.")
            tokenizer.pad_token = tokenizer.eos_token
        print(f"Tokenizer loaded. Pad token: '{tokenizer.pad_token}' (ID: {tokenizer.pad_token_id})")
        
        # Load only the config (avoids instantiating the model twice) and enable the KV cache.
        model_config = AutoConfig.from_pretrained(best_checkpoint, trust_remote_code=True)
        model_config.use_cache = True

        model = AutoModelForCausalLM.from_pretrained(
            best_checkpoint,
            torch_dtype=torch.bfloat16, 
            trust_remote_code=True,
            config=model_config,
            device_map="auto"
        )
        if model.config.pad_token_id is None:
            model.config.pad_token_id = tokenizer.pad_token_id
        print("Model loaded successfully.")
        print(f"Model is on device: {model.device}")
        return model, tokenizer
    except Exception as e:
        print(f"Error loading model or tokenizer: {e}")
        raise

def generate_response(model, tokenizer, user_prompt, device):
    try:
        messages = [
            {"role": "user", "content": user_prompt}
        ]
        text = tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True,
            enable_thinking=True
        )
        model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
        
        # Initialize streamer
        streamer = TextStreamer(tokenizer, skip_special_tokens=False)
        
        # Generate with streamer
        print("\nStreaming response:")
        print("-" * 50)
        
        generated_ids = model.generate(
            **model_inputs,
            temperature=0.6,
            top_p=0.95,
            top_k=20,
            min_p=0,
            max_new_tokens=32768,
            streamer=streamer
        )
        
        # Process the newly generated tokens (everything after the prompt)
        output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
        
        # Find thinking content and regular content
        thinking_content = ""
        content = ""
        
        # Token id of "</think>" in the Qwen3 tokenizer
        think_token_id = 151668
        try:
            # Find index of </think> token
            index = output_ids.index(think_token_id) + 1
        except ValueError:
            index = 0
            
        print("\n\nParsing response...")
        
        if index > 0:
            thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
            
        content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
        
        return thinking_content, content
        
    except Exception as e:
        print(f"Error during response generation: {e}")
        return "Error in generation.", "Sorry, I encountered an error while generating the response."

def chat_loop(model, tokenizer, device):
    print("\n--- Chatbot CLI Started ---")
    print("Type your mathematical problem or 'quit'/'exit'/'q' to end.")
    while True:
        user_input = input("\nYou: ")
        if user_input.lower() in ["quit", "exit", "q"]:
            print("Exiting chatbot. Goodbye!")
            break
        if not user_input.strip():
            print("Please enter a problem.")
            continue
        
        print("Bot is thinking...")
        thinking_response, final_response = generate_response(model, tokenizer, user_input, device)
        
        if thinking_response:
            print("\nThinking content:")
            print("-" * 50)
            print(thinking_response)
            print("-" * 50)

        print("\nBot:")
        print("-" * 50)
        print(final_response)
        print("-" * 50)
        
        if device.type == "cuda":
            gc.collect()
            torch.cuda.empty_cache()

if __name__ == "__main__":
    selected_device = None
    model_instance = None
    tokenizer_instance = None
    try:
        selected_device = setup_device()
        if not os.path.exists(MODEL_DIR):
            print(f"Error: Model directory not found: {MODEL_DIR}")
            print("Please ensure the MODEL_DIR variable is set correctly.")
        else:
            model_instance, tokenizer_instance = load_model_and_tokenizer(MODEL_DIR, selected_device)
            chat_loop(model_instance, tokenizer_instance, selected_device)
    except RuntimeError as e:
        print(f"A runtime error occurred: {e}")
        if "CUDA out of memory" in str(e):
            print("If GPU memory is insufficient, consider running on CPU or using a smaller model.")
            print("For inference, ensure your GPU has enough VRAM for the model (Qwen3-0.6B needs a few GB).")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
    finally:
        print("\nCleaning up...")
        del model_instance
        del tokenizer_instance
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        print("Cleanup complete. Exiting application.")

Training Details

Training Data

The model was fine-tuned on the NVIDIA OpenMathReasoning dataset, which contains a comprehensive collection of mathematical problems paired with detailed step-by-step solutions. This dataset covers various mathematical domains including algebra, arithmetic, geometry, and word problems.
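
To get a feel for the corpus, a record can be inspected directly with the datasets library. The sketch below is hedged: the Hub repo id nvidia/OpenMathReasoning, the split name, and the field layout are assumptions to verify against the dataset card.

from datasets import load_dataset

# Hedged sketch: stream one record from the training corpus without a full download.
# Repo id and split name are assumptions; check the dataset card for the exact layout.
ds = load_dataset("nvidia/OpenMathReasoning", split="cot", streaming=True)
record = next(iter(ds))
print(list(record.keys()))  # e.g. problem / solution-style fields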

Training Procedure

The fine-tuning process enhanced the base Qwen3-0.6B model's mathematical reasoning capabilities while preserving its general language understanding abilities.

Preprocessing

The training data was preprocessed to incorporate the thinking mechanism, allowing the model to generate internal reasoning steps before providing final answers.
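
The exact preprocessing recipe is not published in this card, but given Qwen3's chat format and the <think>...</think> convention used at inference time, a training example plausibly looks like the following sketch (the helper function, field names, and template details are illustrative assumptions):

# Illustrative only: one plausible rendering of a training example that pairs a
# problem with its reasoning and final answer using Qwen's chat markers.
def format_example(problem: str, reasoning: str, answer: str) -> str:
    return (
        f"<|im_start|>user\n{problem}<|im_end|>\n"
        f"<|im_start|>assistant\n<think>\n{reasoning}\n</think>\n\n{answer}<|im_end|>"
    )

print(format_example("What is 12 * 7?", "12 * 7 = 10 * 7 + 2 * 7 = 70 + 14 = 84.", "84"))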

Training Hyperparameters

  • Training regime: bfloat16 mixed precision
  • Base architecture: Qwen3-0.6B transformer
  • Optimization: Fine-tuning with mathematical reasoning focus
  • Special features: Thinking token integration (token ID: 151668)
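
The </think> id (151668) hardcoded in the inference script can be confirmed from the tokenizer rather than trusted blindly; a small check:

from transformers import AutoTokenizer

# Sketch: look up the </think> token id instead of hardcoding it.
tok = AutoTokenizer.from_pretrained("ErenalpCet/E-Model-Reasoner-Math-V1", trust_remote_code=True)
print(tok.convert_tokens_to_ids("</think>"))  # expected: 151668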

Speeds, Sizes, Times

  • Model parameters: ~600M (0.6B)
  • Inference memory: 2–4 GB VRAM recommended for optimal performance
  • Processing: Supports streaming generation for real-time responses

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model should be evaluated on standard mathematical reasoning benchmarks such as GSM8K, the MATH dataset, and other mathematical problem-solving evaluation sets.
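
As a starting point, a hedged GSM8K spot check might look like the sketch below; the Hub repo id openai/gsm8k and the "#### answer" convention reflect GSM8K's public release rather than anything specific to this model.

from datasets import load_dataset

# Hedged sketch: pull one GSM8K test item and its gold final answer.
gsm8k = load_dataset("openai/gsm8k", "main", split="test")
item = gsm8k[0]
print(item["question"])
print("Gold answer:", item["answer"].split("####")[-1].strip())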

Factors

Evaluation considers:

  • Problem complexity levels (elementary to advanced)
  • Mathematical domain coverage (algebra, arithmetic, geometry, etc.)
  • Reasoning clarity and correctness
  • Step-by-step solution quality

Model Examination

The model's thinking mechanism provides interpretability by exposing the reasoning process. This feature allows users to:

  • Understand the model's problem-solving approach
  • Identify potential errors in reasoning
  • Learn from the step-by-step methodology
  • Verify the logical flow of solutions

Technical Specifications

Model Architecture and Objective

  • Architecture: Transformer-based causal language model
  • Parameters: ~600 million
  • Precision: bfloat16 for optimal performance/memory balance
  • Context Length: Extended context support up to 32,768 tokens
  • Special Tokens: Custom thinking token mechanism for reasoning transparency
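
These figures can be cross-checked against the published configuration; the attribute names below follow the standard Qwen3 config and are otherwise assumptions.

from transformers import AutoConfig

# Sketch: read architecture details straight from the model's config on the Hub.
cfg = AutoConfig.from_pretrained("ErenalpCet/E-Model-Reasoner-Math-V1", trust_remote_code=True)
print(cfg.model_type, cfg.max_position_embeddings)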

Compute Infrastructure

Hardware

  • Minimum Requirements: 4 GB VRAM for basic inference
  • Recommended: 8 GB+ VRAM for optimal performance
  • CPU Fallback: Supported but significantly slower
  • Multi-GPU: Automatic device mapping supported

Software

  • Framework: PyTorch with Transformers library
  • Python Version: 3.9+
  • Key Dependencies:
    • torch >= 2.0
    • transformers >= 4.51.0 (first release with Qwen3 support)
    • CUDA toolkit (for GPU acceleration)

Citation

BibTeX

@misc{e-model-reasoner-math-v1,
  title={E-Model-Reasoner-Math-V1: A Fine-tuned Mathematical Reasoning Model},
  author={ErenalpCet},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1}},
  note={Fine-tuned from Qwen3-0.6B on OpenMathReasoning dataset}
}

APA

ErenalpCet. (2025). E-Model-Reasoner-Math-V1: A Fine-tuned Mathematical Reasoning Model. Hugging Face. https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1

Glossary

  • Thinking Token: Special token (ID: 151668) that separates the model's internal reasoning from its final answer
  • Mathematical Reasoning: The process of logical thinking applied to solve mathematical problems
  • Fine-tuning: Process of adapting a pre-trained model to a specific task or domain
  • bfloat16: Brain floating-point format that provides memory efficiency while maintaining training stability

More Information

For additional technical details, usage examples, and community discussions, visit the model repository at https://huggingface.co/ErenalpCet/E-Model-Reasoner-Math-V1.

For questions about mathematical reasoning capabilities or specific use cases, please refer to the model's discussion section or create an issue in the repository.

Model Card Authors

ErenalpCet

Model Card Contact

For questions, feedback, or collaboration opportunities regarding E-Model-Reasoner-Math-V1, please contact through the Hugging Face platform or the model repository's discussion section.
