**Pleias-360m-Preview** is an early preview of a 360 million parameter base model trained by Pleias on Common Corpus.

Like all the base and specialized models from Pleias, Pleias-360m-Preview has only been trained on open data that is out of copyright (public domain) or under a permissive license.

## Description
Pleias-360m-Preview is a transformer base model, entirely pretrained from scratch, using an architecture similar to Llama/GPT-NeoX for easier deployment/inference.
Pleias-360m-Preview has demonstrated unusual abilities for multilingual generation in its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin and Portuguese.

Given its size, Pleias-360m-Preview can run on CPU without any compression loss. We provide a first GGUF variant as part of our release.
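For instance, the GGUF build can be loaded on CPU with llama-cpp-python; the sketch below is only illustrative, and the local filename and sampling settings are assumptions rather than part of the release.

```python
# Minimal CPU inference sketch using llama-cpp-python.
# The GGUF filename below is hypothetical: use the file shipped with the release.
from llama_cpp import Llama

llm = Llama(
    model_path="pleias-360m-preview.gguf",  # assumed local path to the GGUF variant
    n_ctx=2048,    # context window; adjust to the model's actual limit
    n_threads=4,   # number of CPU threads to use
)

# As a base model, it takes a plain continuation prompt (no chat template).
out = llm("Common Corpus is", max_tokens=64, temperature=0.7)
print(out["choices"][0]["text"])
```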
## Recommended use
As a base model, Pleias-360m-Preview is only able to run continuation prompts.

Text generation is currently able to support a range of creative writing tasks in multiple languages.
Pleias-360m-Preview has been successfully adapted for continuous pretraining and full fine-tuning on document processing tasks such as RAG, translation or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.
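As a rough illustration of full fine-tuning (as opposed to LoRA), a sketch with the Hugging Face Trainer follows; the repo ID, data file and hyperparameters are placeholders, not the settings used by Pleias.

```python
# Hypothetical full fine-tuning sketch: all parameters are trained, no adapters.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "PleIAs/Pleias-360m-Preview"  # assumed Hugging Face repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
if tokenizer.pad_token is None:          # base tokenizers often lack a pad token
    tokenizer.pad_token = tokenizer.eos_token

# Placeholder corpus: any text dataset with a "text" column works here.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pleias-360m-ft",
                           per_device_train_batch_size=8,
                           num_train_epochs=1, learning_rate=5e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```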
## Inference example
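A minimal continuation-prompt sketch with transformers is shown below; the repo ID and generation settings are assumptions for illustration.

```python
# Minimal continuation-prompt sketch with transformers.
# The repo ID is assumed from the model name; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-360m-Preview"  # assumed Hugging Face repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Base model: feed a plain prefix and let it continue (no chat template).
inputs = tokenizer("The Common Corpus is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True,
                         temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```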
## Training
Pleias-360m-Preview was fully pretrained at Jean Zay on 64 H100s for 46 hours with Nanotron, the pretraining library from Hugging Face. We provide the complete settings as a YAML file as part of our release.