Commit 24af366 · Parent: 1b11449 · review readme
README.md CHANGED

@@ -127,14 +127,28 @@ This model was adapted for historical texts and fine-tuned on the [HIPE-2022 dat
 
 ## Model Details
 
-
+### Model Description
+
+- **Developed by:** [Impresso team](https://impresso-project.ch/). [Impresso - Media Monitoring of the Past](https://impresso-project.ch) is an
+interdisciplinary research project that aims to develop and consolidate tools for
+processing and exploring large collections of media archives across modalities, time,
+languages and national borders. The first project (2017-2021) was funded by the Swiss
+National Science Foundation under grant
+No. [CRSII5_173719](http://p3.snf.ch/project-173719) and the second project (2023-2027)
+by the SNSF under grant No. [CRSII5_213585](https://data.snf.ch/grants/grant/213585)
+and the Luxembourg National Research Fund under grant No. 17498891.
+- **Model type:** Stacked BERT-based token classification model for named entity recognition
 - **Languages supported:** multilingual (over 100 languages, optimized for fr, de, en)
-- **
-- **
-
-
+- **License:** [GNU Affero General Public License v3 or later](https://github.com/impresso/impresso-pyindexation/blob/master/LICENSE)
+- **Finetuned from model:** [dbmdz/bert-medium-historic-multilingual-cased](https://huggingface.co/dbmdz/bert-medium-historic-multilingual-cased)
+
+### Model Architecture
+
+- **Architecture:** mBART-based seq2seq with constrained beam search
+
+## Training Details
 
-
+### Training Data
 
 The model was trained on the following datasets:
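For orientation, the updated card describes a BERT-based token-classification model for named entity recognition hosted on the Hugging Face Hub, so it should be loadable with the standard `transformers` pipeline. The sketch below is a minimal, hypothetical usage example: the repository id is a placeholder (the diff never names it), and details such as the label set or any required remote code may differ for the real checkpoint.

```python
from transformers import pipeline

# Placeholder repo id: the diff does not name the Hub repository for this
# model, so substitute the real model id before running.
MODEL_ID = "impresso-project/<this-ner-model>"

# Token-classification pipeline; aggregation_strategy="simple" merges
# word-piece predictions back into whole entity spans.
ner = pipeline("token-classification", model=MODEL_ID, aggregation_strategy="simple")

# French example, since the card lists fr/de/en as the optimized languages.
print(ner("Le Comité international de la Croix-Rouge fut fondé à Genève en 1863."))
```

The architecture bullet also mentions constrained beam search. As a generic illustration of that decoding technique in `transformers` (shown here with a public mBART checkpoint, not this model's weights or actual inference code):

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Public checkpoint used purely to demonstrate the decoding call.
name = "facebook/mbart-large-50"
tok = MBart50TokenizerFast.from_pretrained(name)
model = MBartForConditionalGeneration.from_pretrained(name)

inputs = tok("Geneva is the seat of the ICRC.", return_tensors="pt")

# force_words_ids restricts beam search to hypotheses that contain the given
# token sequence(s); this is what "constrained beam search" refers to.
forced = tok(["Geneva"], add_special_tokens=False).input_ids
out = model.generate(
    **inputs,
    num_beams=5,
    force_words_ids=forced,
    forced_bos_token_id=tok.lang_code_to_id["en_XX"],
    max_new_tokens=40,
)
print(tok.batch_decode(out, skip_special_tokens=True))
```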