nicholasKluge committed on
Commit be8a5d3 · verified · 1 Parent(s): f1487de

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -19,4 +19,6 @@ thumbnail: >-
  <img src="./logo.png" alt="An illustration of a Tucano bird showing vibrant colors like yellow, orange, blue, green, and black." height="400">
  </p>
 
- To stimulate the future of open development of neural text generation in Portuguese, we present both **[GigaVerbo](https://huggingface.co/datasets/TucanoBR/GigaVerbo)**, a concatenation of deduplicated Portuguese text corpora amounting to 200 billion tokens, and **[Tucano](https://huggingface.co/TucanoBR/Tucano-2b4)**, a series of decoder-transformers natively pre-trained in Portuguese. All byproducts of our study, including the source code used for training and evaluation, are openly released on [GitHub](https://github.com/Nkluge-correa/Tucano) and Hugging Face.
+ To stimulate the future of open development of neural text generation in Portuguese, we present both **[GigaVerbo](https://huggingface.co/datasets/TucanoBR/GigaVerbo)**, a concatenation of deduplicated Portuguese text corpora amounting to 200 billion tokens, and **[Tucano](https://huggingface.co/TucanoBR/Tucano-2b4)**, a series of decoder-transformers natively pre-trained in Portuguese. All byproducts of our study, including the source code used for training and evaluation, are openly released on [GitHub](https://github.com/Nkluge-correa/Tucano) and Hugging Face.
+
+ Read our preprint in [arXiv](https://arxiv.org/abs/2411.07854).