This is a fine-tuned version of the base model Qwen/Qwen3-0.6B-Base, trained on 100 examples from MathQA.
learning_rate = 5e-5
per_device_train_batch_size = 1
num_train_epochs = 1
optimiser = adamw_torch
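
The settings above could be passed to the Hugging Face `transformers` trainer roughly as follows. This is a minimal sketch, not the actual training script: it assumes the `Trainer` API was used, maps the card's `optimiser` entry to the `optim` field of `TrainingArguments`, and uses a hypothetical output directory. Anything not listed above is left at the library defaults.

```python
# Hyperparameters as listed in this model card.
HYPERPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,
    "num_train_epochs": 1,
    # The card's "optimiser" value; `optim` is the TrainingArguments field name.
    "optim": "adamw_torch",
}


def build_training_args(output_dir="./qwen3-0.6b-mathqa"):
    """Build TrainingArguments from the card's settings.

    `output_dir` is a hypothetical path chosen for illustration.
    The import is deferred so the config dict can be inspected
    without `transformers` installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Calling `build_training_args()` returns an object that can be handed to `transformers.Trainer` together with the base model and the training subset.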