Qwen3-0.6B-Math-Expert

This project performs full fine-tuning of the Qwen3-0.6B language model to enhance its mathematical problem-solving and reasoning capabilities. Training was conducted exclusively on the OpenMathReasoning-mini dataset, with all computation in the bfloat16 (bf16) data type.

Training Procedure

  1. Dataset Preparation

    • The unsloth/OpenMathReasoning-mini dataset was used.
    • Each example was formatted in Chain-of-Thought (CoT) style, pairing math problems with step-by-step intermediate reasoning.
  2. Model Loading and Configuration

    • Qwen3 base model weights were loaded via the unsloth library in bf16 precision.
    • All layers were updated (full_finetuning=True) to adapt the model for mathematical reasoning.
  3. Supervised Fine-Tuning

    • Fine-tuning was performed with the Hugging Face TRL library's supervised fine-tuning (SFT) trainer.
    • The model was trained to generate both the correct final answer and the corresponding reasoning chain; a combined sketch of steps 1–3 follows this list.
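The pipeline roughly corresponds to the minimal sketch below. It assumes current unsloth/TRL APIs; the hyperparameters, the "cot" split name, and the dataset field names (problem, generated_solution) are assumptions for illustration, not the author's verbatim training script.

```python
import torch
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Step 2: load the Qwen3-0.6B base weights in bf16 with every layer
# trainable (full fine-tuning rather than LoRA-style adapters).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B",
    max_seq_length=4096,          # illustrative
    dtype=torch.bfloat16,
    full_finetuning=True,         # update all layers
)

# Step 1: load the dataset and render each example as a CoT-style chat.
dataset = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")

def to_text(example):
    # Field names are assumptions about the dataset schema.
    messages = [
        {"role": "user", "content": example["problem"]},
        {"role": "assistant", "content": example["generated_solution"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)

# Step 3: supervised fine-tuning with TRL. Hyperparameters are illustrative.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
        output_dir="qwen3-0.6b-math-expert",
    ),
)
trainer.train()
```

The full_finetuning=True flag is what distinguishes this run from the more common adapter-based approach: every one of the model's parameters receives gradient updates, which is feasible here because of the small 0.6B model size.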

Purpose and Outcome

  • The model’s reasoning capacity for math problems was improved through single-dataset full fine-tuning in bf16 precision.
  • Outputs include both intermediate reasoning steps and final solutions, providing transparent and interpretable results.
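For reference, a minimal inference sketch with the transformers library; the prompt and generation settings are illustrative, not prescribed by the project.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/Qwen3-0.6B-Math-Expert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Illustrative math prompt; the chat template lets the model emit its
# step-by-step reasoning before the final answer.
messages = [{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```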

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Support

Buy Me A Coffee
