<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`

```yaml
adapter: lora
auto_resume_from_checkpoints: false
base_model: fxmarty/tiny-random-GemmaForCausalLM
bf16: auto
chat_template: llama3
dataset_prepared_path: null
dataset_processes: 6
datasets:
- data_files:
  - 8cfdb1f2cec27bcb_train_data.json
  ds_type: json
  format: custom
  path: /workspace/input_data/8cfdb1f2cec27bcb_train_data.json
  type:
    field_input: mzs
    field_instruction: formula
    field_output: smiles
    format: '{instruction} {input}'
    no_input_format: '{instruction}'
    system_format: '{system}'
    system_prompt: ''
debug: null
deepspeed: null
early_stopping_patience: 3
eval_max_new_tokens: 128
eval_steps: 200
eval_table_size: null
evals_per_epoch: null
flash_attention: true
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 2
gradient_checkpointing: true
group_by_length: false
hub_model_id: error577/f59415d5-3f46-42ac-8615-c9ed0b877c86
hub_repo: null
hub_strategy: checkpoint
hub_token: null
learning_rate: 0.0002
load_in_4bit: false
load_in_8bit: false
local_rank: null
logging_steps: 1
lora_alpha: 64
lora_dropout: 0.1
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 32
lora_target_linear: true
lr_scheduler: cosine
max_grad_norm: 1.0
max_steps: null
micro_batch_size: 5
mlflow_experiment_name: /tmp/8cfdb1f2cec27bcb_train_data.json
model_type: AutoModelForCausalLM
num_epochs: 3
optimizer: adamw_bnb_8bit
output_dir: miner_id_24
pad_to_sequence_len: true
resume_from_checkpoint: null
s2_attention: null
sample_packing: false
save_steps: 200
sequence_len: 256
strict: false
tf32: false
tokenizer_type: AutoTokenizer
train_on_inputs: false
trust_remote_code: true
val_set_size: 0.005
wandb_entity: null
wandb_mode: online
wandb_name: 466b54c0-5c4c-4a4c-8e95-f119cf998ff0
wandb_project: Gradients-On-Demand
wandb_run: your_name
wandb_runid: 466b54c0-5c4c-4a4c-8e95-f119cf998ff0
warmup_steps: 30
weight_decay: 0.0
xformers_attention: null
```

</details>
# f59415d5-3f46-42ac-8615-c9ed0b877c86
This model is a fine-tuned version of [fxmarty/tiny-random-GemmaForCausalLM](https://huggingface.co/fxmarty/tiny-random-GemmaForCausalLM) on a custom JSON dataset (`8cfdb1f2cec27bcb_train_data.json`; see the axolotl config above). It achieves the following results on the evaluation set:
- Loss: 12.1446
## Model description

More information needed

## Intended uses & limitations

More information needed
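That said, a minimal usage sketch can be inferred from the config above (this is an assumption, not an official example): the LoRA adapter loads on top of the base model via transformers + PEFT, and prompts follow the `'{instruction} {input}'` format, where the instruction is a molecular formula and the input a list of m/z values. The prompt below is hypothetical.

```python
# Minimal sketch: load the LoRA adapter on top of the tiny Gemma base model.
# The prompt is a made-up formula + m/z list mirroring the config's
# '{instruction} {input}' format; the trained output field is a SMILES string.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "fxmarty/tiny-random-GemmaForCausalLM"
adapter_id = "error577/f59415d5-3f46-42ac-8615-c9ed0b877c86"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "C9H11NO2 [120.0808, 103.0542, 77.0386]"  # hypothetical input
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```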
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 5
- eval_batch_size: 5
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 10
- optimizer: 8-bit AdamW (`adamw_bnb_8bit`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 30
- num_epochs: 3
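The total train batch size above is simply micro_batch_size (5) × gradient_accumulation_steps (2) = 10. As a minimal sketch (an assumption about the underlying implementation, not part of the generated card), the learning-rate schedule corresponds to transformers' cosine-with-warmup helper:

```python
# Sketch of the LR schedule used here: cosine decay with 30 warmup steps.
# torch.optim.AdamW stands in for the bitsandbytes 8-bit AdamW the run used;
# num_training_steps is taken from the last row of the results table below.
import torch
from transformers import get_cosine_schedule_with_warmup

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for model parameters
optimizer = torch.optim.AdamW(params, lr=2e-4, betas=(0.9, 0.999),
                              eps=1e-8, weight_decay=0.0)
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=30, num_training_steps=17400
)
print(scheduler.get_last_lr())  # starts at 0.0, ramps linearly over 30 steps
```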
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
12.4711 | 0.0000 | 1 | 12.4787 |
12.2458 | 0.0087 | 200 | 12.2547 |
12.2186 | 0.0175 | 400 | 12.2168 |
12.2108 | 0.0262 | 600 | 12.2062 |
12.2047 | 0.0349 | 800 | 12.2000 |
12.2109 | 0.0436 | 1000 | 12.1962 |
12.2102 | 0.0524 | 1200 | 12.1915 |
12.205 | 0.0611 | 1400 | 12.1876 |
12.2112 | 0.0698 | 1600 | 12.1857 |
12.2105 | 0.0785 | 1800 | 12.1828 |
12.201 | 0.0873 | 2000 | 12.1808 |
12.1987 | 0.0960 | 2200 | 12.1781 |
12.1744 | 0.1047 | 2400 | 12.1741 |
12.1621 | 0.1135 | 2600 | 12.1698 |
12.1653 | 0.1222 | 2800 | 12.1665 |
12.1946 | 0.1309 | 3000 | 12.1653 |
12.1671 | 0.1396 | 3200 | 12.1644 |
12.1668 | 0.1484 | 3400 | 12.1638 |
12.1695 | 0.1571 | 3600 | 12.1632 |
12.1686 | 0.1658 | 3800 | 12.1626 |
12.1645 | 0.1746 | 4000 | 12.1626 |
12.1577 | 0.1833 | 4200 | 12.1610 |
12.16 | 0.1920 | 4400 | 12.1604 |
12.1869 | 0.2007 | 4600 | 12.1599 |
12.1604 | 0.2095 | 4800 | 12.1592 |
12.1863 | 0.2182 | 5000 | 12.1584 |
12.1862 | 0.2269 | 5200 | 12.1578 |
12.1725 | 0.2356 | 5400 | 12.1575 |
12.175 | 0.2444 | 5600 | 12.1571 |
12.176 | 0.2531 | 5800 | 12.1568 |
12.1537 | 0.2618 | 6000 | 12.1565 |
12.1648 | 0.2706 | 6200 | 12.1565 |
12.1619 | 0.2793 | 6400 | 12.1558 |
12.1525 | 0.2880 | 6600 | 12.1559 |
12.1509 | 0.2967 | 6800 | 12.1556 |
12.1541 | 0.3055 | 7000 | 12.1554 |
12.1641 | 0.3142 | 7200 | 12.1550 |
12.1652 | 0.3229 | 7400 | 12.1546 |
12.151 | 0.3317 | 7600 | 12.1545 |
12.1712 | 0.3404 | 7800 | 12.1547 |
12.1711 | 0.3491 | 8000 | 12.1544 |
12.1633 | 0.3578 | 8200 | 12.1543 |
12.1359 | 0.3666 | 8400 | 12.1541 |
12.1583 | 0.3753 | 8600 | 12.1532 |
12.1671 | 0.3840 | 8800 | 12.1532 |
12.151 | 0.3927 | 9000 | 12.1528 |
12.18 | 0.4015 | 9200 | 12.1523 |
12.165 | 0.4102 | 9400 | 12.1521 |
12.1582 | 0.4189 | 9600 | 12.1520 |
12.1574 | 0.4277 | 9800 | 12.1520 |
12.1565 | 0.4364 | 10000 | 12.1517 |
12.1704 | 0.4451 | 10200 | 12.1513 |
12.1493 | 0.4538 | 10400 | 12.1509 |
12.145 | 0.4626 | 10600 | 12.1503 |
12.1691 | 0.4713 | 10800 | 12.1500 |
12.1707 | 0.4800 | 11000 | 12.1499 |
12.1365 | 0.4888 | 11200 | 12.1492 |
12.1606 | 0.4975 | 11400 | 12.1492 |
12.1536 | 0.5062 | 11600 | 12.1488 |
12.1682 | 0.5149 | 11800 | 12.1485 |
12.1495 | 0.5237 | 12000 | 12.1484 |
12.153 | 0.5324 | 12200 | 12.1479 |
12.1401 | 0.5411 | 12400 | 12.1481 |
12.1544 | 0.5498 | 12600 | 12.1475 |
12.1771 | 0.5586 | 12800 | 12.1471 |
12.1821 | 0.5673 | 13000 | 12.1469 |
12.1471 | 0.5760 | 13200 | 12.1468 |
12.1544 | 0.5848 | 13400 | 12.1466 |
12.1588 | 0.5935 | 13600 | 12.1465 |
12.1316 | 0.6022 | 13800 | 12.1464 |
12.1473 | 0.6109 | 14000 | 12.1461 |
12.1784 | 0.6197 | 14200 | 12.1458 |
12.1317 | 0.6284 | 14400 | 12.1457 |
12.1707 | 0.6371 | 14600 | 12.1457 |
12.1673 | 0.6459 | 14800 | 12.1458 |
12.1294 | 0.6546 | 15000 | 12.1456 |
12.1368 | 0.6633 | 15200 | 12.1455 |
12.1495 | 0.6720 | 15400 | 12.1452 |
12.1463 | 0.6808 | 15600 | 12.1455 |
12.1472 | 0.6895 | 15800 | 12.1451 |
12.1705 | 0.6982 | 16000 | 12.1451 |
12.1373 | 0.7069 | 16200 | 12.1449 |
12.1503 | 0.7157 | 16400 | 12.1450 |
12.1322 | 0.7244 | 16600 | 12.1449 |
12.1579 | 0.7331 | 16800 | 12.1446 |
12.1375 | 0.7419 | 17000 | 12.1450 |
12.1522 | 0.7506 | 17200 | 12.1449 |
12.1554 | 0.7593 | 17400 | 12.1446 |
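Reading the tail of the table: the best validation loss (12.1446) first appears at step 16800, and with `early_stopping_patience: 3` the run plausibly stopped at step 17400, well short of the configured 3 epochs. A back-of-the-envelope scale check (derived from the last row, not a logged value):

```python
# Scale check from the last table row (assumption: the epoch fraction
# scales linearly with optimizer steps; effective batch size is 10).
step, epoch = 17400, 0.7593
steps_per_epoch = step / epoch                 # ~22916 optimizer steps/epoch
examples_per_epoch = steps_per_epoch * 10      # effective batch size from card
print(f"~{steps_per_epoch:.0f} steps/epoch, ~{examples_per_epoch:.0f} examples")
```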
### Framework versions

- PEFT 0.13.2
- Transformers 4.46.0
- PyTorch 2.5.0+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1