Andrianos committed on
Commit
2099d81
·
verified ·
1 Parent(s): fcd450b

Update README.md

  1. README.md +25 -3
README.md CHANGED
@@ -46,7 +46,6 @@ print(embeddings)
46
  ```
47
 
48
 
49
-
50
  ## Evaluation Results
51
 
52
  I will add the model specific evaluation results once the instance is running again.
@@ -60,7 +59,7 @@ The model was trained with the parameters:
60
 
61
  **Loss**:
62
 
63
- `sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
64
  ```
65
  {'scale': 20.0, 'similarity_fct': 'cos_sim'}
66
  ```
@@ -99,6 +98,8 @@ SentenceTransformer(
99
 
100
  #### Cheap Character Noise for OCR-Robust Multilingual Embeddings (introducing paper)
101
 
 
 
102
  ```bibtex
103
  update once available
104
  ```
@@ -113,4 +114,25 @@ update once available
113
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
114
  pages={1393--1412},
115
  year={2024}
116
- }
 
46
  ```
47
 
48
 
 
49
  ## Evaluation Results
50
 
51
  I will add the model specific evaluation results once the instance is running again.
 
59
 
60
  **Loss**:
61
 
62
+ `sentence_transformers.losses.MultipleNegativesRankingLoss` with parameters:
63
  ```
64
  {'scale': 20.0, 'similarity_fct': 'cos_sim'}
65
  ```
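
For readers unfamiliar with the loss named in this hunk, the following is a minimal pure-Python sketch of what a multiple-negatives ranking loss computes with `scale=20.0` and cosine similarity. Each anchor is scored against every positive in the batch, and the matching positive is treated as the correct class in a cross-entropy. The helper names `cos_sim` and `mnr_loss` and the toy vectors are illustrative assumptions, not the `sentence_transformers` implementation and not the training code used for this model.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over in-batch candidates (illustrative sketch).

    positives[i] is the true match for anchors[i]; every other
    positives[j] in the batch serves as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        # Scaled similarities act as the logits of a softmax classifier.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the matching positive.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy batch: orthogonal unit vectors, so pairs are either identical or unrelated.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
```

On this toy batch, `loss_aligned` is close to 0 because each anchor is most similar to its own positive, while `loss_swapped` is close to 20 because each anchor's best match is the wrong candidate. The `scale` factor controls how sharply the softmax separates candidates.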
 
98
 
99
  #### Cheap Character Noise for OCR-Robust Multilingual Embeddings (introducing paper)
100
 
101
+ For details on the adaptation methodology please refer to our paper (published in ACL2025 Findings). If you use our models or methodology, please cite our work.
102
+
103
  ```bibtex
104
  update once available
105
  ```
 
114
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
115
  pages={1393--1412},
116
  year={2024}
117
+ }
118
+ ```
119
+
120
+ ## About Impresso
121
+
122
+ ### Impresso project
123
+
124
+ [Impresso - Media Monitoring of the Past](https://impresso-project.ch) is an interdisciplinary research project that aims to develop and consolidate tools for processing and exploring large collections of media archives across modalities, time, languages and national borders. The first project (2017-2021) was funded by the Swiss National Science Foundation under grant No. [CRSII5_173719](http://p3.snf.ch/project-173719) and the second project (2023-2027) by the SNSF under grant No. [CRSII5_213585](https://data.snf.ch/grants/grant/213585) and the Luxembourg National Research Fund under grant No. 17498891.
125
+
126
+ ### Copyright
127
+
128
+ Copyright (C) 2025 The Impresso team.
129
+
130
+ ### License
131
+
132
+ This program is provided as open source under the [GNU Affero General Public License](https://github.com/impresso/impresso-pyindexation/blob/master/LICENSE) v3 or later.
133
+
134
+ ---
135
+
136
+ <p align="center">
137
+ <img src="https://github.com/impresso/impresso.github.io/blob/master/assets/images/3x1--Yellow-Impresso-Black-on-White--transparent.png?raw=true" width="350" alt="Impresso Project Logo"/>
138
+ </p>