---
library_name: transformers
tags: []
---

# Tetun BERT model

A fine-tune of [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) trained on Tetun data with a masked language modelling objective.

Tetun data used: the [MADLAD](https://huggingface.co/datasets/allenai/MADLAD-400) `tet` clean split (~40k documents).

Trained for 10 epochs with hyperparameters from the [MasakhaNER paper](https://aclanthology.org/2021.tacl-1.66.pdf) (learning rate 5e-5, etc.).
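A minimal sketch of what the fine-tuning setup might look like with the `Trainer` API, for reproducibility. The MADLAD-400 config/split names, batch size, and sequence length below are assumptions not stated in this card; only the base model, 10 epochs, and the 5e-5 learning rate come from the description above.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("FacebookAI/xlm-roberta-large")

# Tetun clean split of MADLAD-400; the exact config/split names are assumptions.
dataset = load_dataset("allenai/madlad-400", "tet", split="clean")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Standard 15% random token masking for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="tetun-bert",
    num_train_epochs=10,            # as stated above
    learning_rate=5e-5,             # from the MasakhaNER recipe
    per_device_train_batch_size=8,  # assumption; not given in this card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```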
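A minimal usage sketch with the `fill-mask` pipeline. The repo ID is a placeholder for this model's Hub path, and the Tetun example sentence is purely illustrative.

```python
from transformers import pipeline

# Placeholder repo ID; replace with this model's actual Hub path.
fill_mask = pipeline("fill-mask", model="your-username/tetun-bert")

# XLM-RoBERTa models use <mask> as the mask token.
for prediction in fill_mask("Dili mak kapitál <mask> nian."):
    print(prediction["token_str"], prediction["score"])
```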