Qwen2.5-7B-SFT-S1.1-EP2
Qwen2.5-7B (Base) fine-tuned on the simplescaling/s1K-1.1 dataset
(DeepSeek demonstrations).
Epoch = 2 (this release, per the EP2 suffix; the config below sets num_train_epochs: 5)
max_length: 32768
weight_decay: 0.0001
optim: adamw_torch
lr_scheduler_type: cosine
warmup_ratio: 0.1
learning_rate: 1.0e-05
gradient_accumulation_steps: 1
per_device_eval_batch_size: 2
per_device_train_batch_size: 2
# SFT trainer config
max_steps: -1
num_train_epochs: 5
bf16: true
do_eval: false
use_liger_kernel: true
eval_strategy: 'no'
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
log_level: info
logging_steps: 5
logging_strategy: steps
packing: false
output_dir: models/qwen2.5-7b-base/s1.1-1k
overwrite_output_dir: true
push_to_hub: false
report_to:
- wandb
save_strategy: "epoch"
save_total_limit: 5
save_only_model: true
seed: 42
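The scheduler settings above (`lr_scheduler_type: cosine`, `warmup_ratio: 0.1`, `learning_rate: 1.0e-05`) can be illustrated with a minimal pure-Python sketch. It assumes linear warmup followed by a half-cosine decay to zero, which is how the cosine schedule is commonly implemented. The total step count is not given in this card, so `total_steps=1000` is a hypothetical value, and the helper names `warmup_steps` and `cosine_lr` are illustrative rather than part of any library.

```python
import math


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps, rounded up (assumed rounding rule)."""
    return math.ceil(total_steps * warmup_ratio)


def cosine_lr(step: int, total_steps: int,
              base_lr: float = 1.0e-05, warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to 0."""
    warmup = warmup_steps(total_steps, warmup_ratio)
    if step < warmup:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup) / max(1, total_steps - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    total = 1000  # hypothetical; the real value depends on dataset size and device count
    for s in (0, 50, 100, 550, 1000):
        print(s, cosine_lr(s, total))
```

With these settings the learning rate peaks at `1.0e-05` once warmup ends and then decays smoothly, so a checkpoint saved at an intermediate epoch was trained at a still-nonzero learning rate.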