# train_cola_1752826681
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the CoLA (Corpus of Linguistic Acceptability) dataset from GLUE. It achieves the following results on the evaluation set:
- Loss: 0.2519 (the best validation loss, reached at epoch 8.5; see the training results below)
- Num input tokens seen: 3,669,168
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch in code follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
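
For orientation, here is a minimal sketch of how these hyperparameters map onto a Transformers + PEFT fine-tuning run. The LoRA settings (`r`, `lora_alpha`), the prompt template, and the preprocessing are assumptions for illustration; they are not recorded in this card.

```python
# Sketch only: LoRA settings and prompt formatting below are assumptions,
# not values recorded in this card. The TrainingArguments mirror the list above.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token

model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
# Assumed LoRA config; r, lora_alpha, and target modules are not documented here.
model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16))

# CoLA from GLUE; the exact prompt template used for training is unknown.
def to_features(example):
    label = "acceptable" if example["label"] == 1 else "unacceptable"
    return tokenizer(f"Sentence: {example['sentence']}\nJudgment: {label}",
                     truncation=True, max_length=256)

dataset = load_dataset("glue", "cola").map(
    to_features, remove_columns=["sentence", "label", "idx"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="train_cola_1752826681",
        learning_rate=5e-5,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        seed=123,
        optim="adamw_torch",        # AdamW, default betas=(0.9, 0.999), eps=1e-08
        lr_scheduler_type="cosine",
        warmup_ratio=0.1,
        num_train_epochs=10.0,
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
# trainer.train()
```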
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 4.1414        | 0.5   | 962   | 4.1504          | 184192            |
| 0.329         | 1.0   | 1924  | 0.3631          | 367320            |
| 0.3353        | 1.5   | 2886  | 0.2873          | 550840            |
| 0.2399        | 2.0   | 3848  | 0.2674          | 734600            |
| 0.2748        | 2.5   | 4810  | 0.2616          | 918600            |
| 0.2456        | 3.0   | 5772  | 0.2645          | 1101216           |
| 0.2411        | 3.5   | 6734  | 0.2572          | 1284288           |
| 0.2552        | 4.0   | 7696  | 0.2542          | 1468552           |
| 0.2165        | 4.5   | 8658  | 0.2530          | 1651528           |
| 0.2468        | 5.0   | 9620  | 0.2570          | 1834816           |
| 0.2579        | 5.5   | 10582 | 0.2528          | 2018016           |
| 0.2628        | 6.0   | 11544 | 0.2549          | 2201584           |
| 0.2469        | 6.5   | 12506 | 0.2566          | 2385200           |
| 0.2354        | 7.0   | 13468 | 0.2551          | 2568288           |
| 0.2696        | 7.5   | 14430 | 0.2539          | 2751584           |
| 0.2419        | 8.0   | 15392 | 0.2536          | 2935056           |
| 0.2689        | 8.5   | 16354 | 0.2519          | 3118000           |
| 0.2314        | 9.0   | 17316 | 0.2529          | 3301760           |
| 0.2543        | 9.5   | 18278 | 0.2531          | 3485344           |
| 0.2502        | 10.0  | 19240 | 0.2531          | 3669168           |
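
For completeness, here is a minimal sketch of loading this adapter on top of the base model with PEFT. The prompt format shown is an assumption, since the training template is not documented in this card.

```python
# A minimal loading sketch; the prompt format is an assumption.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_cola_1752826681"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter
model.eval()

prompt = "Sentence: The boy quickly ran.\nJudgment:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```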
### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.7.1+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1
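
Since behavior can vary across library versions, this convenience snippet (not part of the original card) checks a local environment against the versions listed above:

```python
# Convenience check against the framework versions listed in this card.
import datasets, peft, tokenizers, torch, transformers

for name, module, expected in [
    ("PEFT", peft, "0.15.2"),
    ("Transformers", transformers, "4.51.3"),
    ("PyTorch", torch, "2.7.1+cu126"),
    ("Datasets", datasets, "3.6.0"),
    ("Tokenizers", tokenizers, "0.21.1"),
]:
    status = "OK" if module.__version__ == expected else f"got {module.__version__}"
    print(f"{name}: expected {expected} -> {status}")
```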