Commit 24af366 · Parent: 1b11449 · review readme
README.md CHANGED

@@ -127,14 +127,28 @@ This model was adapted for historical texts and fine-tuned on the [HIPE-2022 dat
 
 ## Model Details
 
-
+### Model Description
+
+- **Developed by:** [Impresso team](https://impresso-project.ch/). [Impresso - Media Monitoring of the Past](https://impresso-project.ch) is an
+interdisciplinary research project that aims to develop and consolidate tools for
+processing and exploring large collections of media archives across modalities, time,
+languages and national borders. The first project (2017-2021) was funded by the Swiss
+National Science Foundation under grant
+No. [CRSII5_173719](http://p3.snf.ch/project-173719) and the second project (2023-2027)
+by the SNSF under grant No. [CRSII5_213585](https://data.snf.ch/grants/grant/213585)
+and the Luxembourg National Research Fund under grant No. 17498891.
+- **Model type:** Stacked BERT-based token classification model for named entity recognition
 - **Languages supported:** multilingual (over 100 languages, optimized for fr, de, en)
-- **
-- **
-
-
+- **License:** [GNU Affero General Public License v3 or later](https://github.com/impresso/impresso-pyindexation/blob/master/LICENSE)
+- **Finetuned from model:** [dbmdz/bert-medium-historic-multilingual-cased](https://huggingface.co/dbmdz/bert-medium-historic-multilingual-cased)
+
+### Model Architecture
+
+- **Architecture:** mBART-based seq2seq with constrained beam search
+
+## Training Details
 
-
+### Training Data
 
 The model was trained on the following datasets:
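For orientation, the updated card describes a BERT-based token-classification model for named entity recognition hosted on the Hugging Face Hub, so it should be loadable with the standard `transformers` pipeline. The sketch below is a minimal, hypothetical usage example: the repository id is a placeholder (the diff never names it), and details such as the label set or any required remote code may differ for the real checkpoint.

```python
from transformers import pipeline

# Placeholder repo id: the diff does not name the Hub repository for this
# model, so substitute the real model id before running.
MODEL_ID = "impresso-project/<this-ner-model>"

# Token-classification pipeline; aggregation_strategy="simple" merges
# word-piece predictions back into whole entity spans.
ner = pipeline("token-classification", model=MODEL_ID, aggregation_strategy="simple")

# French example, since the card lists fr/de/en as the optimized languages.
print(ner("Le Comité international de la Croix-Rouge fut fondé à Genève en 1863."))
```

The architecture bullet also mentions constrained beam search. As a generic illustration of that decoding technique in `transformers` (shown here with a public mBART checkpoint, not this model's weights or actual inference code):

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Public checkpoint used purely to demonstrate the decoding call.
name = "facebook/mbart-large-50"
tok = MBart50TokenizerFast.from_pretrained(name)
model = MBartForConditionalGeneration.from_pretrained(name)

inputs = tok("Geneva is the seat of the ICRC.", return_tensors="pt")

# force_words_ids restricts beam search to hypotheses that contain the given
# token sequence(s); this is what "constrained beam search" refers to.
forced = tok(["Geneva"], add_special_tokens=False).input_ids
out = model.generate(
    **inputs,
    num_beams=5,
    force_words_ids=forced,
    forced_bos_token_id=tok.lang_code_to_id["en_XX"],
    max_new_tokens=40,
)
print(tok.batch_decode(out, skip_special_tokens=True))
```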