---
language:
- ru
tags:
- sentiment analysis
- Russian
---

## MBARTRuSumGazeta-ru-sentiment-RuSentiment

MBARTRuSumGazeta-ru-sentiment-RuSentiment is an [MBARTRuSumGazeta](https://huggingface.co/IlyaGusev/mbart_ru_sum_gazeta) model fine-tuned on the [RuSentiment dataset](https://github.com/text-machine-lab/rusentiment) of general-domain Russian-language posts from the largest Russian social network, VKontakte.

| Model | Score | Rank | SentiRuEval-2016 TC (micro F1) | SentiRuEval-2016 TC (macro F1) | SentiRuEval-2016 TC (F1) | SentiRuEval-2016 Banks (micro F1) | SentiRuEval-2016 Banks (macro F1) | SentiRuEval-2016 Banks (F1) | RuSentiment (weighted F1) | RuSentiment (F1) | KRND (F1) | LINIS Crowd (F1) | RuTweetCorp (F1) | RuReviews (F1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SOTA | n/s | | 76.71 | 66.40 | 70.68 | 67.51 | 69.53 | 74.06 | 78.50 | n/s | 73.63 | 60.51 | 83.68 | 77.44 |
| XLM-RoBERTa-Large | 76.37 | 1 | 82.26 | 76.36 | 79.42 | 76.35 | 76.08 | 80.89 | 78.31 | 75.27 | 75.17 | 60.03 | 88.91 | 78.81 |
| SBERT-Large | 75.43 | 2 | 78.40 | 71.36 | 75.14 | 72.39 | 71.87 | 77.72 | 78.58 | 75.85 | 74.20 | 60.64 | 88.66 | 77.41 |
| MBARTRuSumGazeta | 74.70 | 3 | 76.06 | 68.95 | 73.04 | 72.34 | 71.93 | 77.83 | 76.71 | 73.56 | 74.18 | 60.54 | 87.22 | 77.51 |
| Conversational RuBERT | 74.44 | 4 | 76.69 | 69.09 | 73.11 | 69.44 | 68.68 | 75.56 | 77.31 | 74.40 | 73.10 | 59.95 | 87.86 | 77.78 |
| LaBSE | 74.11 | 5 | 77.00 | 69.19 | 73.55 | 70.34 | 69.83 | 76.38 | 74.94 | 70.84 | 73.20 | 59.52 | 87.89 | 78.47 |
| XLM-RoBERTa-Base | 73.60 | 6 | 76.35 | 69.37 | 73.42 | 68.45 | 67.45 | 74.05 | 74.26 | 70.44 | 71.40 | 60.19 | 87.90 | 78.28 |
| RuBERT | 73.45 | 7 | 74.03 | 66.14 | 70.75 | 66.46 | 66.40 | 73.37 | 75.49 | 71.86 | 72.15 | 60.55 | 86.99 | 77.41 |
| MBART-50-Large-Many-to-Many | 73.15 | 8 | 75.38 | 67.81 | 72.26 | 67.13 | 66.97 | 73.85 | 74.78 | 70.98 | 71.98 | 59.20 | 87.05 | 77.24 |
| SlavicBERT | 71.96 | 9 | 71.45 | 63.03 | 68.44 | 64.32 | 63.99 | 71.31 | 72.13 | 67.57 | 72.54 | 58.70 | 86.43 | 77.16 |
| EnRuDR-BERT | 71.51 | 10 | 72.56 | 64.74 | 69.07 | 61.44 | 60.21 | 68.34 | 74.19 | 69.94 | 69.33 | 56.55 | 87.12 | 77.95 |
| RuDR-BERT | 71.14 | 11 | 72.79 | 64.23 | 68.36 | 61.86 | 60.92 | 68.48 | 74.65 | 70.63 | 68.74 | 54.45 | 87.04 | 77.91 |
| MBART-50-Large | 69.46 | 12 | 70.91 | 62.67 | 67.24 | 61.12 | 60.25 | 68.41 | 72.88 | 68.63 | 70.52 | 46.39 | 86.48 | 77.52 |

The table shows per-task scores and a macro-average of those scores, which determines a model's position on the leaderboard. For datasets with multiple evaluation metrics (e.g., macro F1 and weighted F1 for RuSentiment), we use an unweighted average of the metrics as the score for the task when computing the overall macro-average. The same strategy for comparing models' results was applied in the GLUE benchmark.

## Citation

If you find this repository helpful, feel free to cite our publication:

```
@article{Smetanin2021Deep,
  author = {Sergey Smetanin and Mikhail Komarov},
  title = {Deep transfer learning baselines for sentiment analysis in Russian},
  journal = {Information Processing & Management},
  volume = {58},
  number = {3},
  pages = {102484},
  year = {2021},
  issn = {0306-4573},
  doi = {10.1016/j.ipm.2020.102484}
}
```

Dataset:

```
@inproceedings{rogers2018rusentiment,
  title={RuSentiment: An enriched sentiment analysis dataset for social media in Russian},
  author={Rogers, Anna and Romanov, Alexey and Rumshisky, Anna and Volkova, Svitlana and Gronas, Mikhail and Gribov, Alex},
  booktitle={Proceedings of the 27th international conference on computational linguistics},
  pages={755--763},
  year={2018}
}
```
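
For readers who want to recompute a leaderboard-style score, the averaging rule described for the table can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `leaderboard_score` is hypothetical, and pooling the TC and Banks subsets into a single SentiRuEval-2016 task is an assumption.

```python
from statistics import mean
from typing import Mapping, Sequence


def leaderboard_score(task_metrics: Mapping[str, Sequence[float]]) -> float:
    """Average the metrics within each task, then macro-average over tasks."""
    per_task = [mean(values) for values in task_metrics.values()]
    return mean(per_task)


# Per-metric values from the MBARTRuSumGazeta row of the table.
# Assumption: TC and Banks are pooled into one SentiRuEval-2016 task.
mbart_ru_sum_gazeta = {
    "SentiRuEval-2016": [76.06, 68.95, 73.04, 72.34, 71.93, 77.83],
    "RuSentiment": [76.71, 73.56],
    "KRND": [74.18],
    "LINIS Crowd": [60.54],
    "RuTweetCorp": [87.22],
    "RuReviews": [77.51],
}

score = leaderboard_score(mbart_ru_sum_gazeta)
print(f"{score:.2f}")
```

The result may differ slightly from the published Score column, since the task grouping used here and the two-decimal rounding of the per-metric inputs are assumptions.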