Qwen/Qwen2.5-0.5B-Instruct fine-tuned on a mix of synthetic and real data, with the settings listed below (a training sketch follows the list).

  • 1 epoch of SFT (5000 samples)
  • Optimizer: PagedAdamW8bit
  • Learning rate: 2e-5
  • Batch size: 16
  • Sample length: 1024 tokens
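
A minimal sketch of what this stage-1 setup could look like with TRL's SFTTrainer. The card does not name a training framework, so the tooling is an assumption; the dataset path is a placeholder, and only the hyperparameters above come from the card. Note that the sequence-length kwarg is called max_seq_length in older TRL releases and max_length in newer ones.

```python
# Hypothetical sketch of the stage-1 SFT run (tooling assumed, not stated in the card).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder for the mixed synthetic + real dataset (~5000 samples).
dataset = load_dataset("json", data_files="mixed_sft_data.jsonl", split="train")

config = SFTConfig(
    output_dir="Qwen2.5-0.5B-cinstruct-stage1",
    num_train_epochs=1,              # 1 epoch of SFT
    per_device_train_batch_size=16,  # batch size 16
    learning_rate=2e-5,              # learning rate 2e-5
    optim="paged_adamw_8bit",        # PagedAdamW8bit (requires bitsandbytes)
    max_seq_length=1024,             # sample length 1024 tokens (kwarg name varies by TRL version)
    bf16=True,                       # matches the BF16 checkpoint
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # starting checkpoint being fine-tuned
    args=config,
    train_dataset=dataset,
)
trainer.train()
```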
Safetensors checkpoint

  • Model size: 630M params
  • Tensor type: BF16
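
A minimal usage sketch, not part of the original card: loading the checkpoint in bfloat16 to match the BF16 tensors above and generating from an arbitrary sample prompt.

```python
# Load the fine-tuned checkpoint in bfloat16 and run a short generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mhl1/Qwen2.5-0.5B-cinstruct-stage1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Example prompt; the content is arbitrary.
messages = [{"role": "user", "content": "Give a one-sentence summary of supervised fine-tuning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```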

Model tree for mhl1/Qwen2.5-0.5B-cinstruct-stage1

  • Base model: Qwen/Qwen2.5-0.5B
  • Fine-tuned from: Qwen/Qwen2.5-0.5B-Instruct → this model