Further info on pretraining
#1
opened by manueltonneau
Hi @Davlan, and thank you very much for this contribution. Could you please provide more info on the pretraining dataset, especially its size? Also, could you say a bit more about how you pretrained the model (is it adaptive finetuning, like AfroXLMR)?
manueltonneau changed discussion title from "Size of pretraining dataset" to "Further info on pretraining"
Finally, is there a paper I can cite if I want to reference this model in a paper? Thank you!
Yes, there is a paper; please cite our MasakhaNER paper: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00416/107614/MasakhaNER-Named-Entity-Recognition-for-African
Table 10 has the information on the monolingual finetuning corpus.
Thank you :)
manueltonneau changed discussion status to closed