Update README.md

README.md (CHANGED)
- **Developed by:** Machine Translation Unit at the Barcelona Supercomputing Center (BSC).
- **Languages:** Spanish, French, Italian, Portuguese, Galician, German, English, and Basque.
- **License:** Apache License, Version 2.0

## Model Description

In recent years, Large Language Models (LLMs) have demonstrated exceptional proficiency across a broad spectrum of Natural Language Processing (NLP) tasks, including Machine Translation. However, previous methodologies predominantly relied on iterative processes such as instruction fine-tuning or continual pre-training, leaving unexplored the challenges of training LLMs solely on parallel data. In this work, we introduce Plume (**P**arallel **L**ang**u**age **M**od**e**l), a collection of three 2B LLMs featuring varying vocabulary sizes (32k, 128k, and 256k) trained exclusively on Catalan-centric parallel examples. These models perform comparably to previous encoder-decoder architectures on 16 supervised translation directions and 56 zero-shot ones.

For more details regarding the model architecture, take a look at the paper, which is available on [arXiv]().
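Since the Plume models are decoder-only LLMs trained on parallel data, translation amounts to prompting with a source sentence and letting the model generate the target. Below is a minimal usage sketch with the Hugging Face `transformers` library; the checkpoint ID and the language-tag prompt format are illustrative assumptions, not taken from this README, so check the model card for the actual repository names and prompt template.

```python
# Hypothetical usage sketch: the model ID and the language-tag prompt format
# below are assumptions for illustration only.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "BSC-LT/plume-32k"  # placeholder ID; the 128k and 256k variants would load the same way
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Decoder-only translation: prompt with the source sentence plus language tags
# (assumed format), then generate the target-language continuation.
prompt = "<ca> El dia és assolellat. <es>"  # Catalan source, Spanish target
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=5)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```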