Built with Axolotl

See axolotl config

axolotl version: 0.4.1

base_model: Qwen/Qwen2.5-7B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: Jennny/strict_mc_label
    conversation: qwen-7b-chat
    type: sharegpt
    split: "train"
    train_on_split: "train"

warmup_ratio: 0.05
val_set_size: 0.0
output_dir: ./prm
wandb_project: preference-models
# wandb_entity: domain-generalization
wandb_watch:
wandb_name: "qwen-7b-bs32_lr2e-6_prm"
wandb_log_model:

train_on_inputs: false

save_safetensors: true
#noisy_embedding_alpha: 10.0 # default for sharegpt type
dataset_prepared_path: ~/data/preference-models/last_run_prepared

dataset_processes: 48
#torch_compile: true
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

trust_remote_code: True
adapter:
lora_model_dir:
#lora_r: 32
#lora_alpha: 16
#lora_dropout: 0.05
#lora_target_linear: true
#lora_fan_in_fan_out:

gradient_checkpointing: True

#warmup_ratio: 0.1
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 1
#max_steps: 10
#optimizer: adamw_torch_fused
optimizer: paged_adamw_32bit
#lr_scheduler: constant_with_warmup
lr_scheduler: cosine
learning_rate: 2.0e-6

weight_decay: 0.0
max_grad_norm: 1.0

group_by_length: false
bf16: auto
fp16: false
tf32: true

early_stopping_patience:
local_rank:
logging_steps: 2
xformers_attention:
flash_attention: true

eval_steps:
eval_table_size:
eval_table_max_new_tokens:
#save_steps: 100
save_strategy: "epoch"
save_total_limit: 4
#save_safetensors: false
debug:

ddp: #true
deepspeed: #deepspeed/zero1.json # multi-gpu only

fsdp:
fsdp_config:
special_tokens:
  pad_token: <|end_of_text|>
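
A run with this config is typically launched through axolotl's CLI. The snippet below is a minimal sketch, assuming the YAML above is saved as `prm.yml` and that axolotl 0.4.x plus accelerate are installed; it simply shells out to the `axolotl.cli.train` entry point.

```python
# Minimal launch sketch for the config above.
# Assumptions: the YAML is saved as "prm.yml" in the working directory,
# and `accelerate` and `axolotl` (0.4.x) are installed in the environment.
import subprocess

subprocess.run(
    ["accelerate", "launch", "-m", "axolotl.cli.train", "prm.yml"],
    check=True,  # raise if the training process exits with a non-zero status
)
```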

prm

This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct on the Jennny/strict_mc_label dataset (see the axolotl config above). It achieves the following results on the evaluation set (a brief usage sketch follows the results):

  • Loss: 0.0455
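
For a quick smoke test of the checkpoint, the sketch below loads it with Transformers and runs one chat-formatted generation. The repo id `Jennny/inclusive_mc_label` is assumed from the Hub page this card belongs to (substitute a local path or a different repo id as needed), and the prompt is only a placeholder.

```python
# Usage sketch (assumptions: repo id "Jennny/inclusive_mc_label", a bf16-capable GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Jennny/inclusive_mc_label"  # assumed Hub id; replace if it differs
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Qwen2.5-Instruct checkpoints ship a chat template, so the prompt can be built
# in the same conversational format used during fine-tuning.
messages = [{"role": "user", "content": "Placeholder prompt: label this solution step."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=16)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```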

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32 (see the arithmetic sketch after this list)
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 3
  • num_epochs: 2
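
The reported total_train_batch_size of 32 is simply the product of the per-device batch size, the gradient accumulation steps, and the number of devices, as the short sketch below spells out.

```python
# Effective (total) train batch size implied by the hyperparameters above.
micro_batch_size = 1              # train_batch_size (per device)
gradient_accumulation_steps = 4
num_devices = 8

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
assert total_train_batch_size == 32
```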

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 0.0296 | 1 | 3.5810 |
| 3.6289 | 0.0593 | 2 | 2.9033 |
| 3.6289 | 0.0889 | 3 | 1.3921 |
| 2.1197 | 0.1185 | 4 | 0.4445 |
| 2.1197 | 0.1481 | 5 | 0.2438 |
| 0.3612 | 0.1778 | 6 | 0.1210 |
| 0.3612 | 0.2074 | 7 | 0.0613 |
| 0.0928 | 0.2370 | 8 | 0.1151 |
| 0.0928 | 0.2667 | 9 | 0.0640 |
| 0.0827 | 0.2963 | 10 | 0.0762 |
| 0.0827 | 0.3259 | 11 | 0.0631 |
| 0.0682 | 0.3556 | 12 | 0.0576 |
| 0.0682 | 0.3852 | 13 | 0.0564 |
| 0.0509 | 0.4148 | 14 | 0.0546 |
| 0.0509 | 0.4444 | 15 | 0.0559 |
| 0.0579 | 0.4741 | 16 | 0.0539 |
| 0.0579 | 0.5037 | 17 | 0.0511 |
| 0.0509 | 0.5333 | 18 | 0.0535 |
| 0.0509 | 0.5630 | 19 | 0.0516 |
| 0.0495 | 0.5926 | 20 | 0.0504 |
| 0.0495 | 0.6222 | 21 | 0.0556 |
| 0.0509 | 0.6519 | 22 | 0.0559 |
| 0.0509 | 0.6815 | 23 | 0.0541 |
| 0.0995 | 0.7111 | 24 | 0.0495 |
| 0.0995 | 0.7407 | 25 | 0.0500 |
| 0.0473 | 0.7704 | 26 | 0.0502 |
| 0.0473 | 0.8 | 27 | 0.0503 |
| 0.0486 | 0.8296 | 28 | 0.0494 |
| 0.0486 | 0.8593 | 29 | 0.0492 |
| 0.0502 | 0.8889 | 30 | 0.0488 |
| 0.0502 | 0.9185 | 31 | 0.0493 |
| 0.071 | 0.9481 | 32 | 0.0483 |
| 0.071 | 0.9778 | 33 | 0.0477 |
| 0.0467 | 1.0074 | 34 | 0.0485 |
| 0.0467 | 1.0148 | 35 | 0.0492 |
| 0.0439 | 1.0444 | 36 | 0.0489 |
| 0.0439 | 1.0741 | 37 | 0.0483 |
| 0.0407 | 1.1037 | 38 | 0.0476 |
| 0.0407 | 1.1333 | 39 | 0.0468 |
| 0.0464 | 1.1630 | 40 | 0.0464 |
| 0.0464 | 1.1926 | 41 | 0.0460 |
| 0.0434 | 1.2222 | 42 | 0.0460 |
| 0.0434 | 1.2519 | 43 | 0.0465 |
| 0.0455 | 1.2815 | 44 | 0.0463 |
| 0.0455 | 1.3111 | 45 | 0.0461 |
| 0.048 | 1.3407 | 46 | 0.0460 |
| 0.048 | 1.3704 | 47 | 0.0459 |
| 0.0446 | 1.4 | 48 | 0.0458 |
| 0.0446 | 1.4296 | 49 | 0.0456 |
| 0.0481 | 1.4593 | 50 | 0.0457 |
| 0.0481 | 1.4889 | 51 | 0.0456 |
| 0.0432 | 1.5185 | 52 | 0.0456 |
| 0.0432 | 1.5481 | 53 | 0.0456 |
| 0.0416 | 1.5778 | 54 | 0.0456 |
| 0.0416 | 1.6074 | 55 | 0.0456 |
| 0.0424 | 1.6370 | 56 | 0.0455 |
| 0.0424 | 1.6667 | 57 | 0.0455 |
| 0.044 | 1.6963 | 58 | 0.0456 |
| 0.044 | 1.7259 | 59 | 0.0455 |
| 0.0422 | 1.7556 | 60 | 0.0455 |
| 0.0422 | 1.7852 | 61 | 0.0455 |
| 0.0419 | 1.8148 | 62 | 0.0455 |
| 0.0419 | 1.8444 | 63 | 0.0455 |
| 0.0431 | 1.8741 | 64 | 0.0455 |
| 0.0431 | 1.9037 | 65 | 0.0456 |
| 0.0396 | 1.9333 | 66 | 0.0455 |

Framework versions

  • Transformers 4.43.3
  • PyTorch 2.1.2+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1