nicholasKluge/TeenyTinyLlama-460m-Assin2

@@ -16,11 +16,11 @@ widget:
 - text: "<s>Uma mulher está misturando ovos.<s>A mulher está bebendo.</s>"
   example_title: Exemplo
 ---
-# TeenyTinyLlama-460m-Assin2
+# TeenyTinyLlama-160m-Assin2
 TeenyTinyLlama is a pair of small foundational models trained in Brazilian Portuguese.
-This repository contains a version of [TeenyTinyLlama-460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m) (`TeenyTinyLlama-460m-Assin2`) fine-tuned on the [Assin2](https://huggingface.co/datasets/assin2).
+This repository contains a version of [TeenyTinyLlama-160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) (`TeenyTinyLlama-160m-Assin2`) fine-tuned on the [Assin2](https://huggingface.co/datasets/assin2).
 ## Details
@@ -38,7 +38,7 @@ from transformers import pipeline
 text = "<s>Qual a capital do Brasil?<s>A capital do Brasil é Brasília!</s>"
-classifier = pipeline("text-classification", model="nicholasKluge/TeenyTinyLlama-460m-Assin2")
+classifier = pipeline("text-classification", model="nicholasKluge/TeenyTinyLlama-160m-Assin2")
 classifier(text)
 # >>> [{'label': 'ENTAILED', 'score': 0.9392824769020081}]
@@ -63,13 +63,13 @@ dataset = load_dataset("assin2")
 # Create a `ModelForSequenceClassification`
 model = AutoModelForSequenceClassification.from_pretrained(
-    "nicholasKluge/TeenyTinyLlama-460m",
+    "nicholasKluge/TeenyTinyLlama-160m",
     num_labels=2,
     id2label={0: "UNENTAILED", 1: "ENTAILED"},
     label2id={"UNENTAILED": 0, "ENTAILED": 1}
 )
-tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/TeenyTinyLlama-460m")
+tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/TeenyTinyLlama-160m")
 # Format the dataset
 train = dataset['train'].to_pandas()
@@ -158,7 +158,6 @@ All the shown results are the higher accuracy scores achieved on the respective
 ## Cite as 🤗
 ```latex
 @misc{correa24ttllama,
   title = {TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese},
   author = {Corr{\^e}a, Nicholas Kluge and Falk, Sophia and Fatimah, Shiza and Sen, Aniket and De Oliveira, Nythamar},
@@ -166,6 +165,15 @@ All the shown results are the higher accuracy scores achieved on the respective
   year={2024}
 }
+@misc{correa24ttllama,
+  doi = {10.1016/j.mlwa.2024.100558},
+  url = {https://www.sciencedirect.com/science/article/pii/S2666827024000343},
+  title = {TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese},
+  author = {Corr{\^e}a, Nicholas Kluge and Falk, Sophia and Fatimah, Shiza and Sen, Aniket and De Oliveira, Nythamar},
+  journal={Machine Learning With Applications},
+  publisher = {Springer},
+  year={2024}
+}
 ```
 ## Funding
@@ -174,4 +182,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 ## License
-TeenyTinyLlama-460m-Assin2 is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
+TeenyTinyLlama-160m-Assin2 is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
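
Note on the input format: both the widget example and the `text` string in this diff wrap a sentence pair as `<s>first sentence<s>second sentence</s>`. The README builds these strings by hand. The sketch below shows one way to produce the same layout; `format_pair` is a hypothetical helper (not part of the repository or of `transformers`), and it assumes the template seen in the examples is the one the classifier expects.

```python
def format_pair(premise: str, hypothesis: str) -> str:
    """Wrap a sentence pair in the `<s>...<s>...</s>` layout used by the examples in this diff.

    Hypothetical helper for illustration only: the special-token strings are
    copied from the README examples, not read from the tokenizer.
    """
    return f"<s>{premise}<s>{hypothesis}</s>"


# Rebuild the widget example from the model card's front matter.
text = format_pair("Uma mulher está misturando ovos.", "A mulher está bebendo.")
print(text)
```

The resulting string can be passed to the `classifier` pipeline shown in the diff in place of a hand-written one.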