---
license: apache-2.0
datasets:
- ruanchaves/faquad-nli
language:
- pt
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- textual-entailment
widget:
- text: "<s>Qual a capital do Brasil?<s>A capital do Brasil é Brasília!</s>"
  example_title: Exemplo
- text: "<s>Qual a capital do Brasil?<s>Anões são muito mais legais do que elfos!</s>"
  example_title: Exemplo
---
# TeenyTinyLlama-160m-FaQuAD-NLI

TeenyTinyLlama is a series of small foundation models trained in Brazilian Portuguese.

This repository contains a version of [TeenyTinyLlama-160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) (`TeenyTinyLlama-160m-FaQuAD-NLI`) fine-tuned on the [FaQuAD-NLI dataset](https://huggingface.co/datasets/ruanchaves/faquad-nli).

## Details

- **Number of Epochs:** 3
- **Batch size:** 16
- **Optimizer:** `torch.optim.AdamW` (learning_rate = 4e-5, epsilon = 1e-8), as sketched below
- **GPU:** 1 NVIDIA A100-SXM4-40GB

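For reference, the optimizer row maps onto the following PyTorch call. This is an illustrative sketch only: the `Trainer` in the "Reproducing" section builds its optimizer internally from `TrainingArguments`, and the `params` here are a placeholder.

```python
import torch

# Placeholder parameters just to make the sketch runnable;
# in practice these are the fine-tuned model's parameters.
params = torch.nn.Linear(10, 2).parameters()

# AdamW with the learning rate and epsilon listed above
optimizer = torch.optim.AdamW(params, lr=4e-5, eps=1e-8)
```
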
## Usage

Using `transformers.pipeline`:

```python
from transformers import pipeline

text = "<s>Qual a capital do Brasil?<s>A capital do Brasil é Brasília!</s>"

classifier = pipeline("text-classification", model="nicholasKluge/TeenyTinyLlama-160m-FaQuAD-NLI")
classifier(text)

# >>> [{'label': 'SUITABLE', 'score': 0.9774010181427002}]
```
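
If you prefer to call the model directly rather than through `pipeline`, the same prediction can be made with `AutoModelForSequenceClassification`. A minimal sketch, not part of the original card: the explicit `torch` usage and softmax step are additions for illustration, and the input is built the same way as in the fine-tuning code below.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "nicholasKluge/TeenyTinyLlama-160m-FaQuAD-NLI"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Build the input in the format the model was fine-tuned on:
# question + bos_token + answer + eos_token
question = "Qual a capital do Brasil?"
answer = "A capital do Brasil é Brasília!"
text = question + tokenizer.bos_token + answer + tokenizer.eos_token

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit to its label name
probs = torch.softmax(logits, dim=-1)[0]
prediction = model.config.id2label[int(probs.argmax())]
print(prediction, float(probs.max()))
```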

## Reproducing

To reproduce the fine-tuning process, use the following code snippet:

```python
# FaQuAD-NLI
! pip install transformers datasets evaluate accelerate -q

import evaluate
import numpy as np
from datasets import load_dataset, Dataset, DatasetDict
from transformers import AutoTokenizer, DataCollatorWithPadding
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

# Load the task
dataset = load_dataset("ruanchaves/faquad-nli")

# Create a `ModelForSequenceClassification`
model = AutoModelForSequenceClassification.from_pretrained(
    "nicholasKluge/TeenyTinyLlama-160m",
    num_labels=2,
    id2label={0: "UNSUITABLE", 1: "SUITABLE"},
    label2id={"UNSUITABLE": 0, "SUITABLE": 1}
)

tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/TeenyTinyLlama-160m")

# Format the dataset: question + bos_token + answer + eos_token
train = dataset['train'].to_pandas()
train['text'] = train['question'] + tokenizer.bos_token + train['answer'] + tokenizer.eos_token
train = train[['text', 'label']]
train['label'] = train['label'].astype(int)
train = Dataset.from_pandas(train)

test = dataset['test'].to_pandas()
test['text'] = test['question'] + tokenizer.bos_token + test['answer'] + tokenizer.eos_token
test = test[['text', 'label']]
test['label'] = test['label'].astype(int)
test = Dataset.from_pandas(test)

dataset = DatasetDict({
    "train": train,
    "test": test
})

# Preprocess the dataset
def preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True)

dataset_tokenized = dataset.map(preprocess_function, batched=True)

# Create a simple data collator
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Use accuracy as the evaluation metric
accuracy = evaluate.load("accuracy")

# Function to compute accuracy
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return accuracy.compute(predictions=predictions, references=labels)

# Define training arguments
training_args = TrainingArguments(
    output_dir="checkpoints",
    learning_rate=4e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=True,
    hub_token="your_token_here",
    hub_model_id="username/model-ID"
)

# Define the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset_tokenized["train"],
    eval_dataset=dataset_tokenized["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

# Train!
trainer.train()
```

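Once training finishes, the test-set accuracy reported in the comparison table below can be checked with the same `Trainer`. A short sketch; `eval_accuracy` is the key produced by the `compute_metrics` function above:

```python
# Evaluate the fine-tuned model on the FaQuAD-NLI test split
metrics = trainer.evaluate()
print(f"Test accuracy: {metrics['eval_accuracy']:.4f}")
```
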
## Fine-Tuning Comparisons

| Models | [FaQuAD-NLI](https://huggingface.co/datasets/ruanchaves/faquad-nli) (accuracy %) |
|--------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 93.07 |
| [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased) | 92.26 |
| [Teeny Tiny Llama 460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m) | 91.18 |
| [Teeny Tiny Llama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) | 90.00 |
| [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 86.46 |

## Cite as 🤗

```latex
@misc{nicholas22llama,
  doi = {10.5281/zenodo.6989727},
  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m},
  author = {Nicholas Kluge Corrêa},
  title = {TeenyTinyLlama},
  year = {2023},
  publisher = {HuggingFace},
  journal = {HuggingFace repository},
}
```

## Funding

This repository was built as part of the RAIES ([Rede de Inteligência Artificial Ética e Segura](https://www.raies.org/)) initiative, a project supported by FAPERGS ([Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul](https://fapergs.rs.gov.br/inicial)), Brazil.

## License

TeenyTinyLlama-160m-FaQuAD-NLI is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.