---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: bert-base-uncased-issues-128
  results: []
---

# bert-base-uncased-issues-128

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2512

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.1019        | 1.0   | 291  | 1.7019          |
| 1.6412        | 2.0   | 582  | 1.4273          |
| 1.4844        | 3.0   | 873  | 1.3947          |
| 1.4006        | 4.0   | 1164 | 1.3698          |
| 1.3382        | 5.0   | 1455 | 1.1941          |
| 1.2822        | 6.0   | 1746 | 1.2781          |
| 1.2393        | 7.0   | 2037 | 1.2650          |
| 1.2009        | 8.0   | 2328 | 1.2082          |
| 1.1657        | 9.0   | 2619 | 1.1776          |
| 1.1394        | 10.0  | 2910 | 1.2050          |
| 1.1276        | 11.0  | 3201 | 1.2067          |
| 1.1051        | 12.0  | 3492 | 1.1630          |
| 1.0814        | 13.0  | 3783 | 1.2529          |
| 1.0757        | 14.0  | 4074 | 1.1699          |
| 1.063         | 15.0  | 4365 | 1.1113          |
| 1.0637        | 16.0  | 4656 | 1.2512          |

### Framework versions

- Transformers 4.11.3
- Pytorch 1.11.0+cu113
- Datasets 1.16.1
- Tokenizers 0.10.1

## Model Recycling

[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=1.50&mnli_lp=nan&20_newsgroup=1.15&ag_news=0.14&amazon_reviews_multi=-0.06&anli=0.80&boolq=2.51&cb=7.05&cola=0.82&copa=9.55&dbpedia=0.44&esnli=0.64&financial_phrasebank=10.97&imdb=-0.14&isear=-0.04&mnli=-0.16&mrpc=1.35&multirc=1.23&poem_sentiment=0.63&qnli=0.53&qqp=-0.54&rotten_tomatoes=0.42&rte=4.64&sst2=0.00&sst_5bins=0.01&stsb=0.57&trec_coarse=0.54&trec_fine=8.67&tweet_ev_emoji=0.40&tweet_ev_emotion=0.45&tweet_ev_hate=0.18&tweet_ev_irony=-0.80&tweet_ev_offensive=-0.25&tweet_ev_sentiment=0.54&wic=-0.87&wnli=1.55&wsc=1.35&yahoo_answers=-0.32&model_name=jmassot%2Fbert-base-uncased-issues-128&base_name=bert-base-uncased) using jmassot/bert-base-uncased-issues-128 as a base model yields an average score of 73.70, compared to 72.20 for bert-base-uncased.

The model is ranked 3rd among all tested models for the bert-base-uncased architecture as of 21/12/2022.

Results:

|   20_newsgroup |   ag_news |   amazon_reviews_multi |   anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |    sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |    wnli |     wsc |   yahoo_answers |
|---------------:|----------:|-----------------------:|-------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
|        84.2007 |   89.7333 |                  65.86 |  47.75 | 71.4679 | 71.4286 | 82.6462 |     59 |      78.6 |   90.34 |                   79.5 | 91.432 | 69.0352 | 83.5639 | 83.3333 |   61.2005 |          67.3077 | 90.4082 | 89.7353 |            85.272 | 64.6209 | 91.9725 |     52.8054 | 86.4351 |          96.6 |          77 |            36.41 |            80.3659 |         53.0303 |          66.9643 |              85.1163 |              70.0179 | 62.3824 | 52.1127 | 63.4615 |              72 |

For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
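
As a sanity check on the summary figures, the sketch below recomputes the average score from the per-dataset results listed in this card. It assumes the reported average is a plain unweighted mean over the 36 datasets, which the card does not state explicitly. The names `scores` and `BASELINE_AVG` are illustrative and not part of any Model Recycling tooling.

```python
# Per-dataset scores copied from the Results table in this card.
scores = {
    "20_newsgroup": 84.2007, "ag_news": 89.7333, "amazon_reviews_multi": 65.86,
    "anli": 47.75, "boolq": 71.4679, "cb": 71.4286, "cola": 82.6462,
    "copa": 59.0, "dbpedia": 78.6, "esnli": 90.34,
    "financial_phrasebank": 79.5, "imdb": 91.432, "isear": 69.0352,
    "mnli": 83.5639, "mrpc": 83.3333, "multirc": 61.2005,
    "poem_sentiment": 67.3077, "qnli": 90.4082, "qqp": 89.7353,
    "rotten_tomatoes": 85.272, "rte": 64.6209, "sst2": 91.9725,
    "sst_5bins": 52.8054, "stsb": 86.4351, "trec_coarse": 96.6,
    "trec_fine": 77.0, "tweet_ev_emoji": 36.41, "tweet_ev_emotion": 80.3659,
    "tweet_ev_hate": 53.0303, "tweet_ev_irony": 66.9643,
    "tweet_ev_offensive": 85.1163, "tweet_ev_sentiment": 70.0179,
    "wic": 62.3824, "wnli": 52.1127, "wsc": 63.4615, "yahoo_answers": 72.0,
}

# Average reported in this card for the bert-base-uncased baseline.
BASELINE_AVG = 72.20

# Assumption: the headline number is an unweighted mean across datasets.
average = sum(scores.values()) / len(scores)
gain = average - BASELINE_AVG

print(f"datasets: {len(scores)}")
print(f"average score: {average:.2f}")
print(f"gain over baseline: {gain:.2f}")
```

Under that assumption, the recomputed average and gain agree with the 73.70 and 1.50 figures quoted above to two decimal places.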