---
license: mit
tags:
- generated_from_trainer
base_model: neuralmind/bert-base-portuguese-cased
model-index:
- name: output
  results: []
---

# output

This model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6440

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the results table below):
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 15.0
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 1.1985        | 0.22  | 2500   | 1.0940          |
| 1.0937        | 0.44  | 5000   | 1.0033          |
| 1.0675        | 0.66  | 7500   | 0.9753          |
| 1.0565        | 0.87  | 10000  | 0.9801          |
| 1.0244        | 1.09  | 12500  | 0.9526          |
| 0.9943        | 1.31  | 15000  | 0.9298          |
| 0.9799        | 1.53  | 17500  | 0.9035          |
| 0.95          | 1.75  | 20000  | 0.8835          |
| 0.933         | 1.97  | 22500  | 0.8636          |
| 0.9079        | 2.18  | 25000  | 0.8507          |
| 0.8938        | 2.4   | 27500  | 0.8397          |
| 0.8781        | 2.62  | 30000  | 0.8195          |
| 0.8647        | 2.84  | 32500  | 0.8088          |
| 0.8422        | 3.06  | 35000  | 0.7954          |
| 0.831         | 3.28  | 37500  | 0.7871          |
| 0.8173        | 3.5   | 40000  | 0.7721          |
| 0.8072        | 3.71  | 42500  | 0.7611          |
| 0.8011        | 3.93  | 45000  | 0.7532          |
| 0.7828        | 4.15  | 47500  | 0.7431          |
| 0.7691        | 4.37  | 50000  | 0.7367          |
| 0.7659        | 4.59  | 52500  | 0.7292          |
| 0.7606        | 4.81  | 55000  | 0.7245          |
| 0.8082        | 5.02  | 57500  | 0.7696          |
| 0.8114        | 5.24  | 60000  | 0.7695          |
| 0.8022        | 5.46  | 62500  | 0.7613          |
| 0.7986        | 5.68  | 65000  | 0.7558          |
| 0.8018        | 5.9   | 67500  | 0.7478          |
| 0.782         | 6.12  | 70000  | 0.7435          |
| 0.7743        | 6.34  | 72500  | 0.7367          |
| 0.774         | 6.55  | 75000  | 0.7313          |
| 0.7692        | 6.77  | 77500  | 0.7270          |
| 0.7604        | 6.99  | 80000  | 0.7200          |
| 0.7468        | 7.21  | 82500  | 0.7164          |
| 0.7486        | 7.43  | 85000  | 0.7117          |
| 0.7399        | 7.65  | 87500  | 0.7043          |
| 0.7306        | 7.86  | 90000  | 0.6956          |
| 0.7243        | 8.08  | 92500  | 0.6959          |
| 0.7132        | 8.3   | 95000  | 0.6916          |
| 0.71          | 8.52  | 97500  | 0.6853          |
| 0.7128        | 8.74  | 100000 | 0.6855          |
| 0.7088        | 8.96  | 102500 | 0.6809          |
| 0.7002        | 9.18  | 105000 | 0.6784          |
| 0.6953        | 9.39  | 107500 | 0.6737          |
| 0.695         | 9.61  | 110000 | 0.6714          |
| 0.6871        | 9.83  | 112500 | 0.6687          |
| 0.7161        | 10.05 | 115000 | 0.6961          |
| 0.7265        | 10.27 | 117500 | 0.7006          |
| 0.7284        | 10.49 | 120000 | 0.6941          |
| 0.724         | 10.7  | 122500 | 0.6887          |
| 0.7266        | 10.92 | 125000 | 0.6931          |
| 0.7051        | 11.14 | 127500 | 0.6846          |
| 0.7106        | 11.36 | 130000 | 0.6816          |
| 0.7011        | 11.58 | 132500 | 0.6830          |
| 0.6997        | 11.8  | 135000 | 0.6784          |
| 0.6969        | 12.02 | 137500 | 0.6734          |
| 0.6968        | 12.23 | 140000 | 0.6709          |
| 0.6867        | 12.45 | 142500 | 0.6656          |
| 0.6925        | 12.67 | 145000 | 0.6661          |
| 0.6795        | 12.89 | 147500 | 0.6606          |
| 0.6774        | 13.11 | 150000 | 0.6617          |
| 0.6756        | 13.33 | 152500 | 0.6563          |
| 0.6728        | 13.54 | 155000 | 0.6547          |
| 0.6732        | 13.76 | 157500 | 0.6520          |
| 0.6704        | 13.98 | 160000 | 0.6492          |
| 0.6666        | 14.2  | 162500 | 0.6446          |
| 0.6615        | 14.42 | 165000 | 0.6488          |
| 0.6638        | 14.64 | 167500 | 0.6523          |
| 0.6588        | 14.85 | 170000 | 0.6415          |
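For reference, the hyperparameters listed above map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a minimal sketch, not the original training script; the `output_dir` value is assumed, and the effective batch size of 128 assumes a single device.

```python
# Sketch only: mirrors the hyperparameters listed under "Training hyperparameters".
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output",            # assumed; matches the model name in this card
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=8,  # 16 x 8 = 128 total train batch size (single device assumed)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=15.0,
    fp16=True,                      # "Native AMP" mixed precision
)
```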
### Framework versions

- Transformers 4.12.5
- Pytorch 1.10.1+cu113
- Datasets 1.17.0
- Tokenizers 0.10.3

## Citing & Authors

If you use our work, please cite:

```
@incollection{Viegas_2023,
  doi = {10.1007/978-3-031-36805-9_24},
  url = {https://doi.org/10.1007%2F978-3-031-36805-9_24},
  year = 2023,
  publisher = {Springer Nature Switzerland},
  pages = {349--365},
  author = {Charles F. O. Viegas and Bruno C. Costa and Renato P. Ishii},
  title = {{JurisBERT}: A New Approach that~Converts a~Classification Corpus into~an~{STS} One},
  booktitle = {Computational Science and Its Applications {\textendash} {ICCSA} 2023}
}
```
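The intended usage is not documented above. As one possibility, consistent with the semantic textual similarity focus of the cited paper, the checkpoint could be loaded for sentence similarity roughly as sketched below; the repo id is a placeholder and the mean-pooling strategy is an assumption, not a documented inference procedure.

```python
# Sketch only: load the encoder and compare two sentences via mean-pooled embeddings.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "path/to/this-model"  # placeholder, replace with the actual Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = ["O contrato foi rescindido.", "A rescisão contratual foi efetivada."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool token embeddings, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)

# Cosine similarity between the two sentences (illustrative only).
similarity = torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(similarity.item())
```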