tfa_output_2025_m05_d12_t23h_28m_45s
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.1152
Model description
More information needed
Intended uses & limitations
More information needed
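In lieu of documented usage guidance, below is a minimal inference sketch. It assumes the checkpoint is published on the Hub as brando/tfa_output_2025_m05_d12_t23h_28m_45s, that bf16 weights fit on the available GPU, and that the standard Llama-3-Instruct chat template applies; the prompt is purely illustrative.

```python
# Minimal inference sketch (assumption: checkpoint published on the Hub as
# brando/tfa_output_2025_m05_d12_t23h_28m_45s).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "brando/tfa_output_2025_m05_d12_t23h_28m_45s"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama-3-8B-Instruct checkpoints expect the chat template.
messages = [{"role": "user", "content": "What does a constant-with-warmup LR schedule do?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```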
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-07
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: PagedAdamW (OptimizerNames.PAGED_ADAMW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
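For reference, here is a minimal sketch of how these settings map onto transformers.TrainingArguments. It assumes the Hugging Face Trainer was used; the output_dir value is hypothetical.

```python
# Sketch of a TrainingArguments configuration matching the hyperparameters
# above (assumes the Hugging Face Trainer; output_dir is hypothetical).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tfa_output_2025_m05_d12_t23h_28m_45s",  # hypothetical
    learning_rate=1e-7,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,      # 1 per device x 8 accumulation = 8 total
    optim="paged_adamw_32bit",          # OptimizerNames.PAGED_ADAMW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```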
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
No log | 0 | 0 | 1.1187 |
2.2656 | 0.0101 | 50 | 1.1188 |
2.2587 | 0.0203 | 100 | 1.1188 |
2.0947 | 0.0304 | 150 | 1.1187 |
2.4375 | 0.0406 | 200 | 1.1184 |
2.1723 | 0.0507 | 250 | 1.1182 |
2.1954 | 0.0609 | 300 | 1.1181 |
2.2557 | 0.0710 | 350 | 1.1179 |
2.1674 | 0.0811 | 400 | 1.1177 |
2.2688 | 0.0913 | 450 | 1.1174 |
2.2975 | 0.1014 | 500 | 1.1172 |
2.2598 | 0.1116 | 550 | 1.1171 |
2.1883 | 0.1217 | 600 | 1.1169 |
2.3749 | 0.1318 | 650 | 1.1168 |
2.2632 | 0.1420 | 700 | 1.1166 |
2.2098 | 0.1521 | 750 | 1.1166 |
2.512 | 0.1623 | 800 | 1.1165 |
2.1788 | 0.1724 | 850 | 1.1164 |
2.283 | 0.1826 | 900 | 1.1164 |
2.3025 | 0.1927 | 950 | 1.1165 |
2.3623 | 0.2028 | 1000 | 1.1163 |
2.0487 | 0.2130 | 1050 | 1.1163 |
2.2132 | 0.2231 | 1100 | 1.1162 |
2.4063 | 0.2333 | 1150 | 1.1163 |
2.3643 | 0.2434 | 1200 | 1.1162 |
2.3078 | 0.2535 | 1250 | 1.1162 |
2.3099 | 0.2637 | 1300 | 1.1160 |
2.2832 | 0.2738 | 1350 | 1.1161 |
2.1349 | 0.2840 | 1400 | 1.1161 |
2.2476 | 0.2941 | 1450 | 1.1159 |
2.2239 | 0.3043 | 1500 | 1.1159 |
2.2536 | 0.3144 | 1550 | 1.1159 |
2.4013 | 0.3245 | 1600 | 1.1157 |
2.4099 | 0.3347 | 1650 | 1.1158 |
2.2071 | 0.3448 | 1700 | 1.1159 |
2.2273 | 0.3550 | 1750 | 1.1159 |
2.4407 | 0.3651 | 1800 | 1.1158 |
2.1962 | 0.3753 | 1850 | 1.1159 |
2.4663 | 0.3854 | 1900 | 1.1159 |
2.4407 | 0.3955 | 1950 | 1.1160 |
2.0845 | 0.4057 | 2000 | 1.1158 |
2.5151 | 0.4158 | 2050 | 1.1158 |
2.5408 | 0.4260 | 2100 | 1.1157 |
2.5447 | 0.4361 | 2150 | 1.1156 |
2.2343 | 0.4462 | 2200 | 1.1157 |
2.2359 | 0.4564 | 2250 | 1.1157 |
2.3676 | 0.4665 | 2300 | 1.1157 |
2.3005 | 0.4767 | 2350 | 1.1156 |
2.1009 | 0.4868 | 2400 | 1.1155 |
2.2853 | 0.4970 | 2450 | 1.1156 |
2.0989 | 0.5071 | 2500 | 1.1158 |
2.2403 | 0.5172 | 2550 | 1.1156 |
2.0566 | 0.5274 | 2600 | 1.1157 |
2.2001 | 0.5375 | 2650 | 1.1155 |
2.507 | 0.5477 | 2700 | 1.1155 |
2.3462 | 0.5578 | 2750 | 1.1157 |
2.13 | 0.5680 | 2800 | 1.1155 |
2.393 | 0.5781 | 2850 | 1.1157 |
2.21 | 0.5882 | 2900 | 1.1155 |
2.1797 | 0.5984 | 2950 | 1.1155 |
2.0194 | 0.6085 | 3000 | 1.1156 |
2.2226 | 0.6187 | 3050 | 1.1155 |
2.3258 | 0.6288 | 3100 | 1.1156 |
2.1823 | 0.6389 | 3150 | 1.1155 |
2.0575 | 0.6491 | 3200 | 1.1154 |
2.2928 | 0.6592 | 3250 | 1.1156 |
2.2332 | 0.6694 | 3300 | 1.1154 |
2.2784 | 0.6795 | 3350 | 1.1155 |
2.4014 | 0.6897 | 3400 | 1.1155 |
2.2708 | 0.6998 | 3450 | 1.1155 |
2.2886 | 0.7099 | 3500 | 1.1153 |
2.4274 | 0.7201 | 3550 | 1.1154 |
2.1011 | 0.7302 | 3600 | 1.1154 |
2.2618 | 0.7404 | 3650 | 1.1154 |
2.3452 | 0.7505 | 3700 | 1.1153 |
2.5666 | 0.7606 | 3750 | 1.1153 |
2.3546 | 0.7708 | 3800 | 1.1153 |
2.2997 | 0.7809 | 3850 | 1.1154 |
2.1488 | 0.7911 | 3900 | 1.1152 |
2.2078 | 0.8012 | 3950 | 1.1152 |
2.379 | 0.8114 | 4000 | 1.1154 |
2.2763 | 0.8215 | 4050 | 1.1155 |
2.2836 | 0.8316 | 4100 | 1.1153 |
2.3352 | 0.8418 | 4150 | 1.1153 |
2.4465 | 0.8519 | 4200 | 1.1154 |
2.2012 | 0.8621 | 4250 | 1.1153 |
2.1785 | 0.8722 | 4300 | 1.1151 |
2.1904 | 0.8824 | 4350 | 1.1153 |
2.3697 | 0.8925 | 4400 | 1.1153 |
2.2069 | 0.9026 | 4450 | 1.1152 |
1.9517 | 0.9128 | 4500 | 1.1153 |
2.3188 | 0.9229 | 4550 | 1.1153 |
2.336 | 0.9331 | 4600 | 1.1154 |
1.9878 | 0.9432 | 4650 | 1.1152 |
2.4256 | 0.9533 | 4700 | 1.1152 |
2.3003 | 0.9635 | 4750 | 1.1152 |
2.6227 | 0.9736 | 4800 | 1.1151 |
2.3439 | 0.9838 | 4850 | 1.1153 |
2.118 | 0.9939 | 4900 | 1.1152 |
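A back-of-the-envelope reading of the table: the final logged step (4,900) corresponds to epoch 0.9939, implying about 4,900 / 0.9939 ≈ 4,930 optimizer steps in the full epoch; at the effective batch size of 8, that suggests roughly 4,930 × 8 ≈ 39,400 training examples, and a warmup_ratio of 0.05 would span about 0.05 × 4,930 ≈ 246 steps. These figures are inferred from the logs, not documented. Note that validation loss is essentially flat over the run (1.1187 → 1.1152), consistent with the very small learning rate of 1e-07.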
Framework versions
- Transformers 4.51.3
- PyTorch 2.1.2+cu121
- Datasets 3.6.0
- Tokenizers 0.21.1