
barthez-deft-chimie

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model is one of our preliminary experiments, and it underperforms the models published in the paper (which use MBartHez and HAL/Wiki pre-training plus copy mechanisms).

It achieves the following results on the evaluation set:

  • Loss: 2.0710
  • Rouge1: 31.8947
  • Rouge2: 16.7563
  • RougeL: 23.5428
  • RougeLsum: 23.4918
  • Gen Len: 38.5256

Model description

More information needed

Intended uses & limitations

More information needed
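
The card does not yet document intended uses, but since the base model is a French seq2seq model fine-tuned here for summarization, the checkpoint can be exercised with the standard transformers generation API. A minimal sketch, assuming the hub id moussaKam/barthez-deft-chimie (hypothetical, inferred from the card title and the base model's namespace; substitute the actual id or a local checkpoint path):

```python
# Minimal inference sketch. The hub id below is an assumption, not confirmed
# by the card; replace it with the real id or a local checkpoint path.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "moussaKam/barthez-deft-chimie"  # assumed hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Texte source en français à résumer..."  # French input document
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

# The evaluation Gen Len above is ~38 tokens, so max_length=64 is a sane cap.
summary_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```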

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
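
These values map one-to-one onto transformers' Seq2SeqTrainingArguments. A minimal configuration sketch, assuming a standard Seq2SeqTrainer setup with the model, tokenizer, and datasets defined elsewhere; the output directory is a placeholder:

```python
# Configuration sketch mirroring the hyperparameters above; this is not the
# exact training script used for this run.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="barthez-deft-chimie",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the library defaults,
    # so they do not need to be set explicitly.
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    fp16=True,                         # "Native AMP" mixed precision
    evaluation_strategy="epoch",       # the results table shows one eval per epoch
    predict_with_generate=True,        # required to compute ROUGE at eval time
)
```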

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.8022        | 1.0   | 118  | 2.5491          | 16.8208 | 7.0027  | 13.957  | 14.0479   | 19.1538 |
| 2.9286        | 2.0   | 236  | 2.3074          | 17.5356 | 7.8717  | 14.4874 | 14.5044   | 19.9487 |
| 2.5422        | 3.0   | 354  | 2.2322          | 19.6491 | 9.4156  | 15.9467 | 15.9433   | 19.7051 |
| 2.398         | 4.0   | 472  | 2.1500          | 18.7166 | 9.859   | 15.7535 | 15.8036   | 19.9231 |
| 2.2044        | 5.0   | 590  | 2.1372          | 19.978  | 10.6235 | 16.1348 | 16.1274   | 19.6154 |
| 1.9405        | 6.0   | 708  | 2.0992          | 20.226  | 10.551  | 16.6928 | 16.7211   | 19.9744 |
| 1.8544        | 7.0   | 826  | 2.0841          | 19.8869 | 10.8456 | 16.1072 | 16.097    | 19.8846 |
| 1.7536        | 8.0   | 944  | 2.0791          | 19.3017 | 9.4921  | 16.1541 | 16.2167   | 19.859  |
| 1.6914        | 9.0   | 1062 | 2.0710          | 21.3848 | 10.4088 | 17.1963 | 17.2254   | 19.8846 |
| 1.654         | 10.0  | 1180 | 2.1069          | 22.3811 | 10.7987 | 18.7595 | 18.761    | 19.9231 |
| 1.5899        | 11.0  | 1298 | 2.0919          | 20.8546 | 10.6958 | 16.8637 | 16.9499   | 19.8077 |
| 1.4661        | 12.0  | 1416 | 2.1065          | 22.3677 | 11.7472 | 18.262  | 18.3      | 19.9744 |
| 1.4205        | 13.0  | 1534 | 2.1164          | 20.5845 | 10.7825 | 16.9972 | 17.0216   | 19.9359 |
| 1.3797        | 14.0  | 1652 | 2.1240          | 22.2561 | 11.303  | 17.5064 | 17.5815   | 19.9744 |
| 1.3724        | 15.0  | 1770 | 2.1187          | 23.2825 | 11.912  | 18.5208 | 18.5499   | 19.9359 |
| 1.3404        | 16.0  | 1888 | 2.1394          | 22.1305 | 10.5258 | 17.772  | 17.8202   | 19.9744 |
| 1.2846        | 17.0  | 2006 | 2.1502          | 21.567  | 11.0557 | 17.2562 | 17.2974   | 20.0    |
| 1.2871        | 18.0  | 2124 | 2.1572          | 22.5871 | 11.702  | 18.2906 | 18.3826   | 19.9744 |
| 1.2422        | 19.0  | 2242 | 2.1613          | 23.0935 | 11.6824 | 18.6087 | 18.6777   | 19.9744 |
| 1.2336        | 20.0  | 2360 | 2.1581          | 22.6789 | 11.4363 | 18.1661 | 18.2346   | 19.9487 |
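
For reference, ROUGE scores in the format of the table above can be computed with the rouge metric bundled with the datasets library (pinned below under Framework versions). A generic sketch, not the exact evaluation code behind this run:

```python
# Generic ROUGE computation sketch; predictions/references are toy examples.
from datasets import load_metric

rouge = load_metric("rouge")
predictions = ["le modèle génère ce résumé"]  # decoded model outputs
references = ["le résumé de référence"]       # gold summaries

scores = rouge.compute(predictions=predictions, references=references)
# Report mid f-measures scaled to percentages, matching the table's format.
print({k: round(v.mid.fmeasure * 100, 4) for k, v in scores.items()})
```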

Framework versions

  • Transformers 4.10.2
  • Pytorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3