Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ license: apache-2.0
 ---
 
 # Guyanese English Creole to English Translator
-This model utilises T5-base pre-trained model. It was fine tuned using a custom dataset for translation of Guyanese English Creole to English. This model will be updated periodically as more data is compiled. For more on the Caribbean English
+This model utilises T5-base pre-trained model. It was fine tuned using a custom dataset for translation of Guyanese English Creole to English. This model will be updated periodically as more data is compiled. For more on the Caribbean English Creoles checkout the library [Caribe](https://pypi.org/project/Caribe/).
 
 
 
@@ -25,7 +25,7 @@ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
 tokenizer = AutoTokenizer.from_pretrained("KES/GEC-English")
 model = AutoModelForSeq2SeqLM.from_pretrained("KES/GEC-English")
 text = "Ah waan ah phone"
-inputs = tokenizer("
+inputs = tokenizer("guy:"+text, truncation=True, return_tensors='pt')
 output = model.generate(inputs['input_ids'], num_beams=4, max_length=512, early_stopping=True)
 translation=tokenizer.batch_decode(output, skip_special_tokens=True)
 print("".join(translation)) #translation: I want a phone.
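
The changed line prepends the `guy:` prefix to the input text before tokenizing. A minimal sketch of how that step might be wrapped for several sentences at once is given here. The helper names `add_prefix` and `translate` are hypothetical and not part of the README, and the use of `padding=True` with an explicit `attention_mask` is an assumption added for batching, not something the commit does.

```python
PREFIX = "guy:"  # task prefix taken from the updated README line


def add_prefix(texts):
    """Prepend the task prefix to each input sentence (hypothetical helper)."""
    return [PREFIX + t for t in texts]


def translate(texts, model_name="KES/GEC-English"):
    """Translate a list of sentences using the same calls as the README snippet."""
    # Imported lazily so add_prefix can be used without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # padding=True is an assumption here: it lets sentences of different
    # lengths share one batch tensor.
    inputs = tokenizer(
        add_prefix(texts), truncation=True, padding=True, return_tensors="pt"
    )
    output = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        num_beams=4,
        max_length=512,
        early_stopping=True,
    )
    return tokenizer.batch_decode(output, skip_special_tokens=True)
```

Calling `translate(["Ah waan ah phone"])` downloads the model on first use and returns a list with one decoded string per input sentence.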