Built with Axolotl

See the axolotl config used for this run (axolotl version: 0.4.1); a sketch of how one training record is rendered under its prompt format follows the config.

adapter: lora
auto_resume_from_checkpoints: false
base_model: fxmarty/tiny-random-GemmaForCausalLM
bf16: auto
chat_template: llama3
dataset_prepared_path: null
dataset_processes: 6
datasets:
- data_files:
  - 8cfdb1f2cec27bcb_train_data.json
  ds_type: json
  format: custom
  path: /workspace/input_data/8cfdb1f2cec27bcb_train_data.json
  type:
    field_input: mzs
    field_instruction: formula
    field_output: smiles
    format: '{instruction} {input}'
    no_input_format: '{instruction}'
    system_format: '{system}'
    system_prompt: ''
debug: null
deepspeed: null
early_stopping_patience: 3
eval_max_new_tokens: 128
eval_steps: 200
eval_table_size: null
evals_per_epoch: null
flash_attention: true
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 2
gradient_checkpointing: true
group_by_length: false
hub_model_id: error577/f59415d5-3f46-42ac-8615-c9ed0b877c86
hub_repo: null
hub_strategy: checkpoint
hub_token: null
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lora_alpha: 64
lora_dropout: 0.1
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 32
lora_target_linear: true
lr_scheduler: cosine
max_grad_norm: 1.0
max_steps: null
micro_batch_size: 5
mlflow_experiment_name: /tmp/8cfdb1f2cec27bcb_train_data.json
model_type: AutoModelForCausalLM
num_epochs: 3
optimizer: adamw_bnb_8bit
output_dir: miner_id_24
pad_to_sequence_len: true
resume_from_checkpoint: null
s2_attention: null
sample_packing: false
save_steps: 200
sequence_len: 256
strict: false
tf32: false
tokenizer_type: AutoTokenizer
train_on_inputs: false
trust_remote_code: true
val_set_size: 0.005
wandb_entity: null
wandb_mode: online
wandb_name: 466b54c0-5c4c-4a4c-8e95-f119cf998ff0
wandb_project: Gradients-On-Demand
wandb_run: your_name
wandb_runid: 466b54c0-5c4c-4a4c-8e95-f119cf998ff0
warmup_steps: 30
weight_decay: 0.0
xformers_attention: null
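
Per the datasets.type block above, each JSON record maps formula to the instruction, mzs to the input, and smiles to the output, joined as '{instruction} {input}'. The following is a minimal sketch of that rendering with hypothetical field values; axolotl's actual templating and tokenization may differ in detail.

# Hedged sketch of how one JSON record is rendered under the custom prompt
# format above. Field values are hypothetical (caffeine), for illustration only.
def render_example(record: dict) -> tuple[str, str]:
    instruction = record["formula"]      # field_instruction
    model_input = record.get("mzs")      # field_input
    output = record["smiles"]            # field_output
    if model_input:                      # format: '{instruction} {input}'
        prompt = f"{instruction} {model_input}"
    else:                                # no_input_format: '{instruction}'
        prompt = instruction
    return prompt, output

example = {
    "formula": "C8H10N4O2",
    "mzs": "195.0877, 138.0662",
    "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
}
prompt, completion = render_example(example)
print(prompt)       # -> "C8H10N4O2 195.0877, 138.0662"
print(completion)   # -> target SMILES string

Since train_on_inputs is false, the loss is computed only on the output (SMILES) tokens, not on the prompt.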

f59415d5-3f46-42ac-8615-c9ed0b877c86

This model is a LoRA adapter fine-tuned from fxmarty/tiny-random-GemmaForCausalLM on the custom JSON dataset listed in the config above (8cfdb1f2cec27bcb_train_data.json). It achieves the following results on the evaluation set:

  • Loss: 12.1446

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 5
  • eval_batch_size: 5
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 10
  • optimizer: 8-bit AdamW (adamw_bnb_8bit) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments (an equivalent setup is sketched after this list)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 30
  • num_epochs: 3
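
An approximately equivalent optimizer and learning-rate schedule could be constructed as sketched below. This is an illustrative sketch only, assuming bitsandbytes and transformers are installed; the actual run was driven by axolotl's trainer, and the total step count here is a placeholder.

import torch
import bitsandbytes as bnb
from transformers import get_cosine_schedule_with_warmup

# Stand-in module; in the real run this is the LoRA-wrapped Gemma model.
model = torch.nn.Linear(8, 8)

# adamw_bnb_8bit with the hyperparameters listed above.
optimizer = bnb.optim.AdamW8bit(
    model.parameters(),
    lr=2e-4,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
)

# Cosine decay with 30 warmup steps; the total step count is a placeholder
# (the results table below stops at step 17400).
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=30,
    num_training_steps=17_400,
)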

Training results

Training Loss Epoch Step Validation Loss
12.4711 0.0000 1 12.4787
12.2458 0.0087 200 12.2547
12.2186 0.0175 400 12.2168
12.2108 0.0262 600 12.2062
12.2047 0.0349 800 12.2000
12.2109 0.0436 1000 12.1962
12.2102 0.0524 1200 12.1915
12.205 0.0611 1400 12.1876
12.2112 0.0698 1600 12.1857
12.2105 0.0785 1800 12.1828
12.201 0.0873 2000 12.1808
12.1987 0.0960 2200 12.1781
12.1744 0.1047 2400 12.1741
12.1621 0.1135 2600 12.1698
12.1653 0.1222 2800 12.1665
12.1946 0.1309 3000 12.1653
12.1671 0.1396 3200 12.1644
12.1668 0.1484 3400 12.1638
12.1695 0.1571 3600 12.1632
12.1686 0.1658 3800 12.1626
12.1645 0.1746 4000 12.1626
12.1577 0.1833 4200 12.1610
12.16 0.1920 4400 12.1604
12.1869 0.2007 4600 12.1599
12.1604 0.2095 4800 12.1592
12.1863 0.2182 5000 12.1584
12.1862 0.2269 5200 12.1578
12.1725 0.2356 5400 12.1575
12.175 0.2444 5600 12.1571
12.176 0.2531 5800 12.1568
12.1537 0.2618 6000 12.1565
12.1648 0.2706 6200 12.1565
12.1619 0.2793 6400 12.1558
12.1525 0.2880 6600 12.1559
12.1509 0.2967 6800 12.1556
12.1541 0.3055 7000 12.1554
12.1641 0.3142 7200 12.1550
12.1652 0.3229 7400 12.1546
12.151 0.3317 7600 12.1545
12.1712 0.3404 7800 12.1547
12.1711 0.3491 8000 12.1544
12.1633 0.3578 8200 12.1543
12.1359 0.3666 8400 12.1541
12.1583 0.3753 8600 12.1532
12.1671 0.3840 8800 12.1532
12.151 0.3927 9000 12.1528
12.18 0.4015 9200 12.1523
12.165 0.4102 9400 12.1521
12.1582 0.4189 9600 12.1520
12.1574 0.4277 9800 12.1520
12.1565 0.4364 10000 12.1517
12.1704 0.4451 10200 12.1513
12.1493 0.4538 10400 12.1509
12.145 0.4626 10600 12.1503
12.1691 0.4713 10800 12.1500
12.1707 0.4800 11000 12.1499
12.1365 0.4888 11200 12.1492
12.1606 0.4975 11400 12.1492
12.1536 0.5062 11600 12.1488
12.1682 0.5149 11800 12.1485
12.1495 0.5237 12000 12.1484
12.153 0.5324 12200 12.1479
12.1401 0.5411 12400 12.1481
12.1544 0.5498 12600 12.1475
12.1771 0.5586 12800 12.1471
12.1821 0.5673 13000 12.1469
12.1471 0.5760 13200 12.1468
12.1544 0.5848 13400 12.1466
12.1588 0.5935 13600 12.1465
12.1316 0.6022 13800 12.1464
12.1473 0.6109 14000 12.1461
12.1784 0.6197 14200 12.1458
12.1317 0.6284 14400 12.1457
12.1707 0.6371 14600 12.1457
12.1673 0.6459 14800 12.1458
12.1294 0.6546 15000 12.1456
12.1368 0.6633 15200 12.1455
12.1495 0.6720 15400 12.1452
12.1463 0.6808 15600 12.1455
12.1472 0.6895 15800 12.1451
12.1705 0.6982 16000 12.1451
12.1373 0.7069 16200 12.1449
12.1503 0.7157 16400 12.1450
12.1322 0.7244 16600 12.1449
12.1579 0.7331 16800 12.1446
12.1375 0.7419 17000 12.1450
12.1522 0.7506 17200 12.1449
12.1554 0.7593 17400 12.1446

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
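
With these versions (or compatible ones) installed, the adapter can be loaded on top of the base model roughly as follows. This is a minimal sketch; given the tiny random base model and the evaluation loss above, the generations are not expected to be chemically meaningful.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "fxmarty/tiny-random-GemmaForCausalLM"
adapter_id = "error577/f59415d5-3f46-42ac-8615-c9ed0b877c86"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

# Attach the LoRA adapter trained in this run.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Prompt mirrors the hypothetical formula/mzs record shown earlier.
inputs = tokenizer("C8H10N4O2 195.0877, 138.0662", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))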