## Fairseq Checkpoint

Get the fairseq checkpoint [here](https://drive.proton.me/urls/K1564WDMX4#Ynyx8jgPis4R).
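
Once downloaded, the checkpoint can be loaded with fairseq's standard RoBERTa hub interface. The snippet below is only a sketch: the directory name `portbert-fairseq` and the checkpoint filename `model.pt` are assumptions about how the archive unpacks, and the directory is expected to also contain the dictionary and BPE files fairseq needs.

```python
# Minimal usage sketch for the fairseq checkpoint. Directory and file names
# below are placeholders; adjust them to match the unpacked download.
from fairseq.models.roberta import RobertaModel

# Assumes the archive unpacks into a directory containing the checkpoint
# (here "model.pt") together with the dictionary/BPE files fairseq expects.
portbert = RobertaModel.from_pretrained(
    "portbert-fairseq",          # placeholder: path to the unpacked checkpoint dir
    checkpoint_file="model.pt",  # placeholder: actual filename may differ
)
portbert.eval()  # disable dropout for feature extraction

# Encode a Portuguese sentence and pull out contextual representations.
tokens = portbert.encode("O PortBERT é um modelo de linguagem para o português.")
features = portbert.extract_features(tokens)
print(features.shape)  # (1, num_tokens, hidden_size)
```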

## Citations

If you use PortBERT in your research, please cite the following paper:

```bibtex
@inproceedings{scheible-schmitt-etal-2025-portbert,
    author    = {Scheible-Schmitt, Raphael and He, Henry and Mendes, Armando B.},
    title     = {PortBERT: Navigating the Depths of Portuguese Language Models},
    booktitle = {Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models},
    month     = {September},
    year      = {2025},
    address   = {Varna, Bulgaria},
    publisher = {INCOMA Ltd., Shoumen, BULGARIA},
    pages     = {59--71},
    abstract  = {Transformer models dominate modern NLP, but efficient, language-specific models remain scarce. In Portuguese, most focus on scale or accuracy, often neglecting training and deployment efficiency. In the present work, we introduce PortBERT, a family of RoBERTa-based language models for Portuguese, designed to balance performance and efficiency. Trained from scratch on over 450 GB of deduplicated and filtered mC4 and OSCAR23 from CulturaX using fairseq, PortBERT leverages byte-level BPE tokenization and stable pre-training routines across both GPU and TPU processors. We release two variants, PortBERT base and PortBERT large, and evaluate them on ExtraGLUE, a suite of translated GLUE and SuperGLUE tasks. Both models perform competitively, matching or surpassing existing monolingual and multilingual models. Beyond accuracy, we report training and inference times as well as fine-tuning throughput, providing practical insights into model efficiency. PortBERT thus complements prior work by addressing the underexplored dimension of compute-performance tradeoffs in Portuguese NLP. We release all models on Huggingface and provide fairseq checkpoints to support further research and applications.},
    url       = {https://aclanthology.org/2025.globalnlp-1.8},
    doi       = {10.26615/978-954-452-105-9-008}
}
```

## 📜 License

MIT License