clampert committed on
Commit
8091117
1 Parent(s): 4f0016a

Model card

Files changed (1)
  1. README.md +39 -1
README.md CHANGED
@@ -1 +1,39 @@
- ## Multi-lingual sentiment prediction trained from COVID19-reated tweets
+ ---
+ pipeline_tag: sentiment-analysis
+ language: multilingual
+ license: apache-2.0
+ tags:
+ - "sentiment-analysis"
+ - "multilingual"
+ ---
+
+ # Multi-lingual sentiment prediction trained from COVID19-related tweets
+
+ Repository: [https://github.com/clampert/multilingual-sentiment-analysis/](https://github.com/clampert/multilingual-sentiment-analysis/)
+
+ Model trained on a large-scale (18437530 examples) dataset of
+ multi-lingual tweets that was collected between March 2020
+ and November 2021 using Twitter’s Streaming API with varying
+ COVID19-related keywords. Labels were auto-generated based on
+ the presence of positive and negative emoticons. For details
+ on the dataset, see our IEEE BigData 2021 publication.
+
+ Base model is [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual).
+ It was finetuned for sequence classification with `positive`
+ and `negative` labels for two epochs (48 hours on 8xP100 GPUs).
+
+ ## Citation
+
+ If you use our model in your work, please cite:
+
+ ```
+ @inproceedings{lampert2021overcoming,
+ title={Overcoming Rare-Language Discrimination in Multi-Lingual Sentiment Analysis},
+ author={Jasmin Lampert and Christoph H. Lampert},
+ booktitle={IEEE International Conference on Big Data (BigData)},
+ year={2021},
+ note={Special Session: Machine Learning on Big Data},
+ }
+ ```
+
+ Enjoy!
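
Note on the labeling step: the card says labels were derived from the presence of positive and negative emoticons, but it does not list the emoticons or the exact rule. The sketch below is one plausible reading, not the authors' code. The `POSITIVE` and `NEGATIVE` sets and the `emoticon_label` helper are hypothetical, and dropping tweets with mixed or missing signals is an assumption.

```python
# Hypothetical emoticon lists; the model card does not specify the real ones.
POSITIVE = {":)", ":-)", ":D", "=)"}
NEGATIVE = {":(", ":-(", ":'(", "=("}


def emoticon_label(text):
    """Return 'positive', 'negative', or None if the signal is absent or mixed."""
    has_pos = any(e in text for e in POSITIVE)
    has_neg = any(e in text for e in NEGATIVE)
    if has_pos == has_neg:
        # Either no emoticon at all or both polarities: treat as unlabeled (assumption).
        return None
    return "positive" if has_pos else "negative"


if __name__ == "__main__":
    for tweet in ["stay safe everyone :)", "lockdown again :(", "new case numbers today"]:
        print(repr(tweet), "->", emoticon_label(tweet))
```

A rule of this kind gives noisy labels, but it needs no manual annotation and works the same way in every language, which fits the multilingual setting the card describes.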