Built with Axolotl

See axolotl config

axolotl version: 0.4.1

adapter: lora
base_model: unsloth/tinyllama-chat
bf16: true
chat_template: llama3
dataset_prepared_path: null
datasets:
- data_files:
  - daed85532ae01daa_train_data.json
  ds_type: json
  format: custom
  path: /workspace/input_data/daed85532ae01daa_train_data.json
  type:
    field_input: input
    field_instruction: instruction
    field_output: output
    format: '{instruction} {input}'
    no_input_format: '{instruction}'
    system_format: '{system}'
    system_prompt: ''
debug: null
device_map:
  '': 0,1,2,3,4,5,6,7
early_stopping_patience: 2
eval_max_new_tokens: 128
eval_steps: 400
eval_table_size: null
flash_attention: true
gradient_accumulation_steps: 4
gradient_checkpointing: true
group_by_length: false
hub_model_id: Alphatao/35fb927f-d4dd-4902-b051-fdae57931bbf
hub_repo: null
hub_strategy: null
hub_token: null
learning_rate: 0.0002
load_best_model_at_end: true
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lora_alpha: 32
lora_dropout: 0.05
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 16
lora_target_linear: true
lora_target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- down_proj
- up_proj
lr_scheduler: cosine
max_grad_norm: 1.0
max_steps: 14192
micro_batch_size: 2
mlflow_experiment_name: /tmp/daed85532ae01daa_train_data.json
model_type: AutoModelForCausalLM
num_epochs: 2
optimizer: adamw_bnb_8bit
output_dir: miner_id_24
pad_to_sequence_len: true
resume_from_checkpoint: null
s2_attention: null
sample_packing: false
save_steps: 400
sequence_len: 2048
strict: false
tf32: true
tokenizer_type: AutoTokenizer
train_on_inputs: false
trust_remote_code: true
val_set_size: 0.05
wandb_entity: null
wandb_mode: online
wandb_name: d7c344bd-e406-41ee-ad84-5b799adb7e49
wandb_project: Gradients-On-Demand
wandb_run: your_name
wandb_runid: d7c344bd-e406-41ee-ad84-5b799adb7e49
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
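
The datasets entry above defines a custom prompt format: a record with a non-empty input field is rendered as '{instruction} {input}', a record without one as '{instruction}', and the (empty) system prompt is passed through '{system}'. The sketch below illustrates that assembly for a plain JSON record; build_prompt and the example record are hypothetical illustrations, not part of Axolotl's API.

```python
# Illustrative sketch of the custom format declared in the datasets section above.
# The format strings come from the config; build_prompt and the example record are
# hypothetical and only show how the fields combine into a prompt.

def build_prompt(record: dict, system_prompt: str = "") -> str:
    system = "{system}".format(system=system_prompt)          # system_format: '{system}' (empty here)
    if record.get("input"):                                   # field_input present and non-empty
        prompt = "{instruction} {input}".format(**record)     # format
    else:
        prompt = "{instruction}".format(**record)             # no_input_format
    return system + prompt

example = {
    "instruction": "Summarize the passage.",
    "input": "LoRA adds small low-rank update matrices to frozen base weights.",
    "output": "LoRA fine-tunes a model by training only low-rank matrices.",
}
print(build_prompt(example))
# Summarize the passage. LoRA adds small low-rank update matrices to frozen base weights.
```

Because train_on_inputs is false, only the tokens of the output field contribute to the loss; the rendered prompt portion is masked out.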

35fb927f-d4dd-4902-b051-fdae57931bbf

This model is a fine-tuned version of unsloth/tinyllama-chat on the daed85532ae01daa_train_data.json dataset described in the Axolotl config above. It achieves the following results on the evaluation set:

  • Loss: 1.1144
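
Because this repository contains only a LoRA adapter (adapter: lora, lora_r: 16 in the config above), it is loaded on top of the unsloth/tinyllama-chat base model. Below is a minimal, untested sketch using the peft and transformers libraries listed under Framework versions; the example prompt is arbitrary and simply follows the '{instruction} {input}' shape used during training.

```python
# Hedged sketch: attach the LoRA adapter to the base model and generate.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/tinyllama-chat"
adapter_id = "Alphatao/35fb927f-d4dd-4902-b051-fdae57931bbf"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # bf16, matching the training config
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)

# Arbitrary example prompt in the '{instruction} {input}' format the adapter was trained on.
prompt = "Summarize the passage. LoRA adds small low-rank update matrices to frozen base weights."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```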

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto transformers.TrainingArguments follows the list:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: AdamW (8-bit, via bitsandbytes) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • training_steps: 14192
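
For readers more familiar with the raw transformers API, the hyperparameters above correspond roughly to the TrainingArguments sketch below. This is an approximation for orientation only, not the exact object Axolotl builds internally; output_dir is taken from the config above.

```python
# Approximate transformers-API equivalent of the hyperparameters listed above (sketch only).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="miner_id_24",           # output_dir from the Axolotl config
    learning_rate=2e-4,
    per_device_train_batch_size=2,      # micro_batch_size
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,      # effective train batch size: 2 * 4 = 8
    seed=42,
    max_steps=14192,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    optim="adamw_bnb_8bit",             # 8-bit AdamW from bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    weight_decay=0.0,
    max_grad_norm=1.0,
    bf16=True,
    tf32=True,
    gradient_checkpointing=True,
    eval_strategy="steps",
    eval_steps=400,
    save_steps=400,
    logging_steps=1,
    load_best_model_at_end=True,
)
```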

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.6758        | 0.0001 | 1     | 1.8920          |
| 1.3241        | 0.0413 | 400   | 1.6002          |
| 1.7018        | 0.0826 | 800   | 1.5184          |
| 1.8543        | 0.1239 | 1200  | 1.4654          |
| 1.292         | 0.1652 | 1600  | 1.4236          |
| 1.5167        | 0.2065 | 2000  | 1.3890          |
| 1.3281        | 0.2478 | 2400  | 1.3660          |
| 1.376         | 0.2891 | 2800  | 1.3410          |
| 1.366         | 0.3304 | 3200  | 1.3217          |
| 1.3738        | 0.3717 | 3600  | 1.3012          |
| 1.4379        | 0.4130 | 4000  | 1.2837          |
| 1.2683        | 0.4543 | 4400  | 1.2698          |
| 1.2872        | 0.4956 | 4800  | 1.2540          |
| 1.2           | 0.5369 | 5200  | 1.2423          |
| 1.2733        | 0.5782 | 5600  | 1.2279          |
| 1.5039        | 0.6195 | 6000  | 1.2165          |
| 1.2798        | 0.6608 | 6400  | 1.2074          |
| 1.7364        | 0.7022 | 6800  | 1.1982          |
| 1.0885        | 0.7435 | 7200  | 1.1857          |
| 0.9772        | 0.7848 | 7600  | 1.1745          |
| 1.1519        | 0.8261 | 8000  | 1.1681          |
| 1.2283        | 0.8674 | 8400  | 1.1586          |
| 0.9801        | 0.9087 | 8800  | 1.1494          |
| 1.1179        | 0.9500 | 9200  | 1.1420          |
| 1.069         | 0.9913 | 9600  | 1.1351          |
| 1.1033        | 1.0326 | 10000 | 1.1340          |
| 0.4977        | 1.0739 | 10400 | 1.1299          |
| 0.9901        | 1.1152 | 10800 | 1.1269          |
| 1.3081        | 1.1565 | 11200 | 1.1241          |
| 0.985         | 1.1978 | 11600 | 1.1208          |
| 1.088         | 1.2391 | 12000 | 1.1191          |
| 1.1772        | 1.2804 | 12400 | 1.1167          |
| 1.2013        | 1.3217 | 12800 | 1.1155          |
| 1.0287        | 1.3630 | 13200 | 1.1150          |
| 0.9724        | 1.4043 | 13600 | 1.1145          |
| 0.9901        | 1.4456 | 14000 | 1.1144          |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1