---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: bert-base-uncased-issues-128
  results: []
---

# bert-base-uncased-issues-128

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased). The training dataset was not recorded by the Trainer (the auto-generated card lists it as `None`).
It achieves the following results on the evaluation set:
- Loss: 1.2512
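
The card does not say which objective the loss measures. As a rough reading aid, and assuming the value is a mean per-token cross-entropy in nats (an assumption, not something the card states), it can be converted to a perplexity:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.2512

# Assumption: the loss is a mean per-token cross-entropy in nats,
# in which case exp(loss) is the corresponding perplexity.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

If the loss was computed differently, this conversion does not apply.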

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
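
The sketch below restates these hyperparameters as a plain dictionary and illustrates what a linear schedule implies for the learning rate over the run. The dictionary keys are what we assume to be the matching `transformers.TrainingArguments` keyword names, the batch size is treated as per-device on a single device, and the warmup length is assumed to be zero because the card does not report one. The total of 4656 steps is taken from the training results table.

```python
# Hyperparameters as listed in this card, keyed by what we assume are the
# matching `transformers.TrainingArguments` keyword names.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 16,
}

TOTAL_STEPS = 4656  # final step in the training results table


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, 2328, 4656):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the listed 5e-05, is roughly halved by step 2328 (end of epoch 8), and reaches zero at the final step.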

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.1019        | 1.0   | 291  | 1.7019          |
| 1.6412        | 2.0   | 582  | 1.4273          |
| 1.4844        | 3.0   | 873  | 1.3947          |
| 1.4006        | 4.0   | 1164 | 1.3698          |
| 1.3382        | 5.0   | 1455 | 1.1941          |
| 1.2822        | 6.0   | 1746 | 1.2781          |
| 1.2393        | 7.0   | 2037 | 1.2650          |
| 1.2009        | 8.0   | 2328 | 1.2082          |
| 1.1657        | 9.0   | 2619 | 1.1776          |
| 1.1394        | 10.0  | 2910 | 1.2050          |
| 1.1276        | 11.0  | 3201 | 1.2067          |
| 1.1051        | 12.0  | 3492 | 1.1630          |
| 1.0814        | 13.0  | 3783 | 1.2529          |
| 1.0757        | 14.0  | 4074 | 1.1699          |
| 1.0630        | 15.0  | 4365 | 1.1113          |
| 1.0637        | 16.0  | 4656 | 1.2512          |
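
The validation loss in the table above is not monotonic, so the final checkpoint is not necessarily the one with the lowest loss. A minimal sketch for reading the table follows; the rows are copied from it, and the estimate of the training set size assumes a single device with no gradient accumulation, which the card does not confirm.

```python
# (epoch, step, validation_loss) rows copied from the table above.
rows = [
    (1, 291, 1.7019), (2, 582, 1.4273), (3, 873, 1.3947), (4, 1164, 1.3698),
    (5, 1455, 1.1941), (6, 1746, 1.2781), (7, 2037, 1.2650), (8, 2328, 1.2082),
    (9, 2619, 1.1776), (10, 2910, 1.2050), (11, 3201, 1.2067), (12, 3492, 1.1630),
    (13, 3783, 1.2529), (14, 4074, 1.1699), (15, 4365, 1.1113), (16, 4656, 1.2512),
]

# Epoch with the lowest validation loss versus the final epoch.
best_epoch, best_step, best_loss = min(rows, key=lambda r: r[2])
final_epoch, final_step, final_loss = rows[-1]
print(f"best:  epoch {best_epoch}, step {best_step}, loss {best_loss}")
print(f"final: epoch {final_epoch}, step {final_step}, loss {final_loss}")

# Optimizer steps per epoch, read off the first row.
steps_per_epoch = rows[0][1]

# Upper bound on the number of training examples, assuming one device,
# train_batch_size=32, and no gradient accumulation. The last batch of an
# epoch may be smaller, so the true count can be slightly lower.
max_train_examples = steps_per_epoch * 32
print(f"steps per epoch: {steps_per_epoch}, at most {max_train_examples} examples")
```

The reported evaluation loss of 1.2512 matches the final row (epoch 16), while the lowest validation loss in the table is at epoch 15.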


### Framework versions

- Transformers 4.11.3
- Pytorch 1.11.0+cu113
- Datasets 1.16.1
- Tokenizers 0.10.1

## Model Recycling

[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=1.50&mnli_lp=nan&20_newsgroup=1.15&ag_news=0.14&amazon_reviews_multi=-0.06&anli=0.80&boolq=2.51&cb=7.05&cola=0.82&copa=9.55&dbpedia=0.44&esnli=0.64&financial_phrasebank=10.97&imdb=-0.14&isear=-0.04&mnli=-0.16&mrpc=1.35&multirc=1.23&poem_sentiment=0.63&qnli=0.53&qqp=-0.54&rotten_tomatoes=0.42&rte=4.64&sst2=0.00&sst_5bins=0.01&stsb=0.57&trec_coarse=0.54&trec_fine=8.67&tweet_ev_emoji=0.40&tweet_ev_emotion=0.45&tweet_ev_hate=0.18&tweet_ev_irony=-0.80&tweet_ev_offensive=-0.25&tweet_ev_sentiment=0.54&wic=-0.87&wnli=1.55&wsc=1.35&yahoo_answers=-0.32&model_name=jmassot%2Fbert-base-uncased-issues-128&base_name=bert-base-uncased) using jmassot/bert-base-uncased-issues-128 as a base model yields an average score of 73.70, compared with 72.20 for bert-base-uncased.

The model is ranked 3rd among all tested models for the bert-base-uncased architecture as of 21/12/2022.

Results:

|   20_newsgroup |   ag_news |   amazon_reviews_multi |   anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |    sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |    wnli |     wsc |   yahoo_answers |
|---------------:|----------:|-----------------------:|-------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
|        84.2007 |   89.7333 |                  65.86 |  47.75 | 71.4679 | 71.4286 | 82.6462 |     59 |      78.6 |   90.34 |                   79.5 | 91.432 | 69.0352 | 83.5639 | 83.3333 |   61.2005 |          67.3077 | 90.4082 | 89.7353 |            85.272 | 64.6209 | 91.9725 |     52.8054 | 86.4351 |          96.6 |          77 |            36.41 |            80.3659 |         53.0303 |          66.9643 |              85.1163 |              70.0179 | 62.3824 | 52.1127 | 63.4615 |              72 |
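
As a sanity check, the average score quoted above can be recomputed from the per-dataset results. The sketch below assumes the reported average is an unweighted mean over the 36 columns of the table; the scores are copied from the row above.

```python
# Per-dataset scores copied from the results table above.
scores = {
    "20_newsgroup": 84.2007, "ag_news": 89.7333, "amazon_reviews_multi": 65.86,
    "anli": 47.75, "boolq": 71.4679, "cb": 71.4286, "cola": 82.6462,
    "copa": 59, "dbpedia": 78.6, "esnli": 90.34, "financial_phrasebank": 79.5,
    "imdb": 91.432, "isear": 69.0352, "mnli": 83.5639, "mrpc": 83.3333,
    "multirc": 61.2005, "poem_sentiment": 67.3077, "qnli": 90.4082,
    "qqp": 89.7353, "rotten_tomatoes": 85.272, "rte": 64.6209, "sst2": 91.9725,
    "sst_5bins": 52.8054, "stsb": 86.4351, "trec_coarse": 96.6, "trec_fine": 77,
    "tweet_ev_emoji": 36.41, "tweet_ev_emotion": 80.3659, "tweet_ev_hate": 53.0303,
    "tweet_ev_irony": 66.9643, "tweet_ev_offensive": 85.1163,
    "tweet_ev_sentiment": 70.0179, "wic": 62.3824, "wnli": 52.1127,
    "wsc": 63.4615, "yahoo_answers": 72,
}

# Unweighted mean over all datasets (assumed to be how the average is defined).
average = sum(scores.values()) / len(scores)

# Gain over the bert-base-uncased average quoted in this card.
BASELINE_AVERAGE = 72.20
gain = average - BASELINE_AVERAGE

print(f"{len(scores)} datasets, average {average:.2f}, gain {gain:+.2f}")
```

The recomputed mean agrees with the quoted 73.70 to two decimal places, which is consistent with the unweighted-mean assumption.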


For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)