license: mit
tags:
  - generated_from_trainer
datasets:
  - arcd
model-index:
  - name: rinna-roberta-qa-ar2
    results: []

rinna-roberta-qa-ar2

This model is a fine-tuned version of xlm-roberta-base on the arcd dataset. It achieves the following results on the evaluation set:

  • Loss: 7.3167

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 7e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 170
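The list above implies two pieces of arithmetic: how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and what a linear schedule with 100 warmup steps does to the learning rate. The sketch below works through both in plain Python. It assumes a single training device (consistent with 2 × 16 = 32) and a decay that ends at zero on the final step. The total step count is not stated in this card, so `ASSUMED_TOTAL_STEPS` is a hypothetical value, roughly extrapolated from the logged ratio of 1050 steps at epoch 48.0 over 170 epochs. The function names are illustrative only, not part of any library.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 7e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 16
WARMUP_STEPS = 100

# ASSUMPTION: the card does not state the total number of optimizer steps.
# 3700 is a hypothetical round figure (about 170 epochs * 1050 / 48.0 steps).
ASSUMED_TOTAL_STEPS = 3700


def effective_batch_size(per_step_batch: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update.

    ASSUMPTION: num_devices defaults to 1 (single-device training).
    """
    return per_step_batch * accumulation_steps * num_devices


def linear_schedule_lr(step: int, total_steps: int,
                       base_lr: float = LEARNING_RATE,
                       warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        # Warmup phase: the rate grows in proportion to the step index.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: the rate shrinks in proportion to the steps remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 50, 100, 1900, ASSUMED_TOTAL_STEPS):
        print(step, linear_schedule_lr(step, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at 7e-05 on step 100 and reaches zero on the last step; a different total step count only changes the slope of the decay.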

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.3148        | 6.86   | 150  | 4.5451          |
| 0.2021        | 13.71  | 300  | 4.3560          |
| 0.1134        | 20.57  | 450  | 5.1730          |
| 0.0648        | 27.43  | 600  | 5.0504          |
| 0.0734        | 34.29  | 750  | 5.3601          |
| 0.032         | 41.14  | 900  | 5.4291          |
| 0.0171        | 48.0   | 1050 | 6.9606          |
| 0.0343        | 54.86  | 1200 | 4.9076          |
| 0.0186        | 61.71  | 1350 | 6.7967          |
| 0.0054        | 68.57  | 1500 | 6.0515          |
| 0.0118        | 75.43  | 1650 | 7.0908          |
| 0.0027        | 82.29  | 1800 | 7.5651          |
| 0.0078        | 89.14  | 1950 | 7.3787          |
| 0.0172        | 96.0   | 2100 | 7.7559          |
| 0.0077        | 102.86 | 2250 | 7.1376          |
| 0.0041        | 109.71 | 2400 | 7.3236          |
| 0.0022        | 116.57 | 2550 | 7.3134          |
| 0.0004        | 123.43 | 2700 | 7.2484          |
| 0.0018        | 130.29 | 2850 | 7.1747          |
| 0.0009        | 137.14 | 3000 | 7.4311          |
| 0.0008        | 144.0  | 3150 | 7.5083          |
| 0.0006        | 150.86 | 3300 | 7.4622          |
| 0.0002        | 157.71 | 3450 | 7.3167          |
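The logged results can be read more easily with a little arithmetic. The sketch below copies the rows from the training results in this card and picks out the evaluation point with the lowest validation loss, then compares it with the final logged row. The names `LOG` and `best_checkpoint` are illustrative, and the card does not say whether a checkpoint was actually saved at each evaluation step.

```python
# Rows copied from the training results in this card:
# (training_loss, epoch, step, validation_loss)
LOG = [
    (0.3148, 6.86, 150, 4.5451),
    (0.2021, 13.71, 300, 4.3560),
    (0.1134, 20.57, 450, 5.1730),
    (0.0648, 27.43, 600, 5.0504),
    (0.0734, 34.29, 750, 5.3601),
    (0.032, 41.14, 900, 5.4291),
    (0.0171, 48.0, 1050, 6.9606),
    (0.0343, 54.86, 1200, 4.9076),
    (0.0186, 61.71, 1350, 6.7967),
    (0.0054, 68.57, 1500, 6.0515),
    (0.0118, 75.43, 1650, 7.0908),
    (0.0027, 82.29, 1800, 7.5651),
    (0.0078, 89.14, 1950, 7.3787),
    (0.0172, 96.0, 2100, 7.7559),
    (0.0077, 102.86, 2250, 7.1376),
    (0.0041, 109.71, 2400, 7.3236),
    (0.0022, 116.57, 2550, 7.3134),
    (0.0004, 123.43, 2700, 7.2484),
    (0.0018, 130.29, 2850, 7.1747),
    (0.0009, 137.14, 3000, 7.4311),
    (0.0008, 144.0, 3150, 7.5083),
    (0.0006, 150.86, 3300, 7.4622),
    (0.0002, 157.71, 3450, 7.3167),
]


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


best = best_checkpoint(LOG)
final = LOG[-1]

print(f"lowest validation loss: {best[3]} at step {best[2]} (epoch {best[1]})")
print(f"final logged validation loss: {final[3]} at step {final[2]}")
print(f"training loss at those points: {best[0]} vs {final[0]}")
```

In these logs the lowest validation loss (4.3560) occurs at step 300, while the final row reports 7.3167 with a training loss of 0.0002. That pattern is consistent with overfitting, so an earlier evaluation point may be worth comparing against the final weights if one is available.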

Framework versions

  • Transformers 4.29.2
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3