abte-restaurants-distilbert-base-uncased

This model is a fine-tuned version of distilbert/distilbert-base-uncased on an unknown dataset. It achieves the following results on the evaluation set at the final epoch (epoch 20, step 300):

Loss: 0.2387
F1-score: 0.8272

Model description

More information needed

Intended uses & limitations

More information needed
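The card does not document the task or the label set. The sketch below assumes, from the model name only, that this is a token-classification model for aspect term extraction on restaurant reviews. `keep_confident`, `extract_aspects`, and the `min_score` threshold are hypothetical helpers introduced for illustration; only the `pipeline` call and its `aggregation_strategy` argument come from the Transformers library.

```python
MODEL_ID = "thainq107/abte-restaurants-distilbert-base-uncased"


def keep_confident(entities, min_score=0.5):
    """Return (text, label) pairs for aggregated entities scoring at least min_score.

    `entities` is expected to be the list of dicts produced by a
    token-classification pipeline with an aggregation strategy, where each
    dict has "word", "entity_group", and "score" keys.
    """
    return [
        (entity["word"], entity["entity_group"])
        for entity in entities
        if entity["score"] >= min_score
    ]


def extract_aspects(text, min_score=0.5):
    """Tag `text` with the model and return the confident spans.

    Assumption: the checkpoint loads as a token-classification model.
    Calling this function downloads the weights from the Hugging Face Hub.
    """
    # Imported lazily so keep_confident can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge word pieces into spans
    )
    return keep_confident(tagger(text), min_score)
```

A call such as `extract_aspects("The pizza was great but the service was slow.")` would return whatever spans the model tags. Since the label names are undocumented, it may be worth inspecting `config.id2label` on the loaded model before relying on the `entity_group` values.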

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 256
eval_batch_size: 256
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 20
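As a rough sketch, the settings listed above can be written as keyword arguments named after `transformers.TrainingArguments` fields, and the linear schedule can be reproduced in a few lines. Several things here are assumptions rather than facts from the card: the `output_dir` value is hypothetical, the batch size is treated as per-device (single-device training), no warmup is assumed because none is listed, and the total of 300 steps is taken from the last row of the training results table.

```python
# Values copied from the hyperparameter list; key names follow
# transformers.TrainingArguments fields.
training_kwargs = {
    "output_dir": "abte-restaurants-distilbert-base-uncased",  # hypothetical path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 256,  # assumes a single device
    "per_device_eval_batch_size": 256,   # assumes a single device
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}

TOTAL_STEPS = 300  # final step in the training results table


def linear_lr(step, base_lr=2e-5, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 75, 150, 225, 300):
        print(step, linear_lr(step))
```

With zero warmup, the rate falls in a straight line from 2e-05 at step 0 to zero at step 300, so it is half the base rate at the midpoint of training (step 150, end of epoch 10).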

Training results

Training Loss	Epoch	Step	Validation Loss	F1-score
0.6621	1.0	15	0.5167	0.0084
0.3639	2.0	30	0.3092	0.5360
0.2349	3.0	45	0.2581	0.6078
0.1794	4.0	60	0.2314	0.6587
0.1444	5.0	75	0.2149	0.7298
0.1102	6.0	90	0.2024	0.7654
0.0903	7.0	105	0.2036	0.7991
0.0760	8.0	120	0.2047	0.8189
0.0642	9.0	135	0.2067	0.8163
0.0543	10.0	150	0.2133	0.8208
0.0493	11.0	165	0.2153	0.8191
0.0426	12.0	180	0.2186	0.8225
0.0403	13.0	195	0.2258	0.8249
0.0374	14.0	210	0.2286	0.8225
0.0348	15.0	225	0.2286	0.8245
0.0318	16.0	240	0.2347	0.8250
0.0305	17.0	255	0.2351	0.8265
0.0296	18.0	270	0.2356	0.8260
0.0292	19.0	285	0.2371	0.8275
0.0285	20.0	300	0.2387	0.8272
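The table can be checked programmatically to see which epoch scores best under each metric and what the step counts imply about the training set. This is a minimal sketch: the rows are copied from the table above, and the size estimate assumes a single device, no gradient accumulation, and that the last partial batch is kept. None of these is stated in the card.

```python
# (epoch, step, validation_loss, f1) copied from the training results table.
ROWS = [
    (1, 15, 0.5167, 0.0084), (2, 30, 0.3092, 0.5360),
    (3, 45, 0.2581, 0.6078), (4, 60, 0.2314, 0.6587),
    (5, 75, 0.2149, 0.7298), (6, 90, 0.2024, 0.7654),
    (7, 105, 0.2036, 0.7991), (8, 120, 0.2047, 0.8189),
    (9, 135, 0.2067, 0.8163), (10, 150, 0.2133, 0.8208),
    (11, 165, 0.2153, 0.8191), (12, 180, 0.2186, 0.8225),
    (13, 195, 0.2258, 0.8249), (14, 210, 0.2286, 0.8225),
    (15, 225, 0.2286, 0.8245), (16, 240, 0.2347, 0.8250),
    (17, 255, 0.2351, 0.8265), (18, 270, 0.2356, 0.8260),
    (19, 285, 0.2371, 0.8275), (20, 300, 0.2387, 0.8272),
]

# Epoch with the highest F1-score and epoch with the lowest validation loss.
best_f1 = max(ROWS, key=lambda row: row[3])
best_loss = min(ROWS, key=lambda row: row[2])


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that give this many steps per epoch.

    Assumes steps_per_epoch == ceil(n_examples / batch_size).
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


if __name__ == "__main__":
    print("best F1:", best_f1)
    print("lowest validation loss:", best_loss)
    print("train size bounds:", train_size_bounds(15, 256))
```

Read this way, validation loss is lowest at epoch 6 (0.2024) and rises slowly afterwards, while F1 keeps improving and peaks at epoch 19 (0.8275), marginally above the final-epoch value of 0.8272. Under the stated assumptions, 15 steps per epoch at batch size 256 corresponds to between 3585 and 3840 training examples.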

Framework versions

Transformers 4.48.2
PyTorch 2.5.1+cu124
Datasets 3.2.0
Tokenizers 0.21.0

thainq107/abte-restaurants-distilbert-base-uncased