catherinearnett committed on
Commit da7cf60 · verified · 1 Parent(s): 337b550

Update README.md

Files changed (1)
  1. README.md +12 -13
README.md CHANGED
@@ -12,14 +12,13 @@ language:
  - nl
  - pl
  ---
- **
- Pleias-pico-350m-preview** is an early preview of a 350 million parameter base model trained by Pleias on Common Corpus.
 
- Like all the base and specialized models from Pleias, Pleias-350m-Preview has only been trained on open data out of copyright (public domain) or under a permissible license.
 
  ## Description
 
- Pleias-pico-350m-preview is a transformer base model, entirely pretrained from scratch, using an architecture similar to Llama/GPT-Neox for easier deployment/inference.
 
  It includes the following features, that would apply to any responsibly trained variant:
  * Only trained on open data under a permissible license and in compliance with the European AI Act. By design, all Pleias model are unable to output copyrighted content.
@@ -28,21 +27,21 @@ It includes the following features, that would apply to any responsibly trained
  * Extremely low level of toxicity and problematic content.
 
 
- Pleias-pico-350m-preview has demonstrated unusual abilities for multilingual generation in its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin and Portuguese.
 
- Given its size, Pleias-pico-350m-preview can run on CPU without any compression loss. We provide a first GGUF variant as part of our release.
 
  ## Recommended use
- As a base model, Pleias-360m-Preview is only able to run continuation prompts.
 
  Text generation is currently able to support a range of creative writing tasks in multiple European languages. For more consistent results we recommend using a low or null temperature with a slight repetition penalty (1.1-1.2).
 
- Pleias-360m-Preview has been successfully adapted for continuous pretraining and full-fine-tuning on document processing tasks such as RAG, translation or OCR correction. Given the small size of the model we do not recommend fine-tuning methods based on LORA.
 
  ## Examples
- With only 360 million parameters, Pleias-360m-Preview is able to generate consistent texts in multiple European languages.
 
- Thanks to an unprecedented coverage of ancient texts in public domain, Pleias-360m-Preview is even able to generate texts in historical script. Such as a continuation of <i>To be or not to be</i> in original spelling:
 
  <blockquote>
  <b>To be, or not to be, that is the Queſtion:<br/>
@@ -76,18 +75,18 @@ Car il eſt certain que les Princes font moins heureux;<br/>
  Que ceux qui ont eu leur vie par leurs mains confacrée.<br/>
  </blockquote>
 
- Pleias-360m is able to interspesed languages and has good capacities for translation, even in classical languages like Latin:
 
  <blockquote>
  <b>Cicero advocated for a similar idea: Equidem ad</b> pacem, non ut in pace sit; sed ut in pace. Cicero's statement is echoed by other authors of the time, such as Quintilian and Tacitus, who also emphasized the importance of peace over war (Quint., Inst., 1, 4). In contrast to the Roman concept of "pax" or "pacificus", which was often associated with military conquests, Cicero saw peace as an ideal state that could be achieved through peaceful means.
  </blockquote>
 
  ## Training
- Pleias-360m-Preview was fully pretrained at Jean Zay on 64 h100s for 46 hours with Nanotron, the pretraining library from HuggingFace. We provide the complete settings as a yaml file as part of our release.
 
  Training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
 
  ## Update
- Pleias-360m-Preview is currently released as an early preview.
 
  The model will undergo several more round of post-training to enhance reasoning capacities and fine-tunability as well as in anticipation of a generalist instruct version.
 
  - nl
  - pl
  ---
+ **Pleias-pico-350m-Preview** is an early preview of a 350 million parameter base model trained by [Pleias](https://huggingface.co/PleIAs) on [Common Corpus](https://huggingface.co/datasets/PleIAs/common_corpus).

+ Like all the base and specialized models from Pleias, Pleias-pico-350m-Preview has only been trained on open data out of copyright (public domain) or under a permissible license.

  ## Description

+ Pleias-pico-350m-Preview is a transformer base model, entirely pretrained from scratch, using an architecture similar to Llama/GPT-NeoX for easier deployment/inference.

  It includes the following features, which would apply to any responsibly trained variant:
  * Only trained on open data under a permissible license and in compliance with the European AI Act. By design, all Pleias models are unable to output copyrighted content.

  * Extremely low level of toxicity and problematic content.


+ Pleias-pico-350m-Preview has demonstrated unusual abilities for multilingual generation in its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin and Portuguese.

+ Given its size, Pleias-pico-350m-Preview can run on CPU without any compression loss. We provide a first GGUF variant as part of our release.
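
As a rough sketch of CPU inference with the GGUF variant, the snippet below uses llama-cpp-python; the file name and the context length are placeholders, not details taken from the release.

```python
# Minimal CPU-inference sketch using llama-cpp-python (pip install llama-cpp-python).
# The GGUF file name and the 2,048-token context are assumptions, not details from the release.
from llama_cpp import Llama

llm = Llama(model_path="pleias-pico-350m-preview.gguf", n_ctx=2048)

# Base model: plain continuation prompt, near-greedy decoding with a slight repetition penalty.
out = llm(
    "Cicero advocated for a similar idea: Equidem ad",
    max_tokens=128,
    temperature=0.0,
    repeat_penalty=1.1,
)
print(out["choices"][0]["text"])
```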

  ## Recommended use
+ As a base model, Pleias-pico-350m-Preview is only able to run continuation prompts.

  Text generation currently supports a range of creative writing tasks in multiple European languages. For more consistent results, we recommend using a low or zero temperature with a slight repetition penalty (1.1-1.2).
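
As a sketch of these settings with the transformers library, the snippet below runs a continuation prompt with greedy decoding and a 1.1 repetition penalty; the model identifier is assumed to match this repository.

```python
# Continuation-prompt sketch with the recommended settings (greedy decoding, repetition penalty 1.1).
# "PleIAs/Pleias-pico-350m-Preview" is an assumed model id; adjust it to this repository's actual name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-pico-350m-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("To be, or not to be, that is the", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,          # zero-temperature (greedy) decoding
    repetition_penalty=1.1,   # slight penalty, as recommended above
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```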

+ Pleias-pico-350m-Preview has been successfully adapted for continuous pretraining and full fine-tuning on document processing tasks such as RAG, translation or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.
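
A minimal full fine-tuning sketch (all weights updated, no LoRA) with the Hugging Face Trainer might look as follows; the dataset, hyperparameters and model id are illustrative placeholders, not values published by Pleias.

```python
# Illustrative full fine-tuning sketch: every weight is updated, no LoRA adapters.
# Dataset file, batch size and learning rate are placeholders, not settings from the Pleias release.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "PleIAs/Pleias-pico-350m-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical plain-text training file, e.g. OCR-correction examples, one document per line.
dataset = load_dataset("text", data_files={"train": "ocr_correction_train.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="pleias-pico-ocr-correction",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        learning_rate=5e-5,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```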

  ## Examples
+ With only 350 million parameters, Pleias-pico-350m-Preview is able to generate consistent texts in multiple European languages.

+ Thanks to an unprecedented coverage of ancient texts in the public domain, Pleias-pico-350m-Preview is even able to generate texts in historical scripts, such as a continuation of <i>To be or not to be</i> in its original spelling:

  <blockquote>
  <b>To be, or not to be, that is the Queſtion:<br/>

  Que ceux qui ont eu leur vie par leurs mains confacrée.<br/>
  </blockquote>

+ Pleias-pico-350m-Preview is able to intersperse languages and has good capabilities for translation, even in classical languages like Latin:

  <blockquote>
  <b>Cicero advocated for a similar idea: Equidem ad</b> pacem, non ut in pace sit; sed ut in pace. Cicero's statement is echoed by other authors of the time, such as Quintilian and Tacitus, who also emphasized the importance of peace over war (Quint., Inst., 1, 4). In contrast to the Roman concept of "pax" or "pacificus", which was often associated with military conquests, Cicero saw peace as an ideal state that could be achieved through peaceful means.
  </blockquote>

  ## Training
+ Pleias-pico-350m-Preview was fully pretrained at Jean Zay on 64 H100s for 46 hours with Nanotron, the pretraining library from Hugging Face. We provide the complete settings as a YAML file as part of our release.

  The training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
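
For orientation, those figures are mutually consistent if each step processes 1,024 sequences of 2,048 tokens; the 2,048-token context length is our assumption, as it is not stated here.

```python
# Consistency check: assumes the batch counts sequences and a 2,048-token context (not stated above).
steps, sequences_per_step, tokens_per_sequence = 518_000, 1_024, 2_048
print(steps * sequences_per_step * tokens_per_sequence)  # 1086324736000, i.e. 1,086,324,736,000 tokens
```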

  ## Update
+ Pleias-pico-350m-Preview is currently released as an early preview.

  The model will undergo several more rounds of post-training to enhance reasoning capacities and fine-tunability, as well as in anticipation of a generalist instruct version.