---
language:
- ru
pipeline_tag: sentence-similarity
tags:
- russian
- pretraining
- embeddings
- tiny
- feature-extraction
- sentence-similarity
- sentence-transformers
- transformers
- mteb
datasets:
- IlyaGusev/gazeta
- zloelias/lenta-ru
- HuggingFaceFW/fineweb-2
license: mit
---

A fast BERT model for Russian with an embedding size of 256 and a context length of 512. The model was obtained by sequential distillation of [sergeyzh/rubert-tiny-turbo](https://huggingface.co/sergeyzh/rubert-tiny-turbo) and [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3). It matches [rubert-tiny-turbo](https://huggingface.co/sergeyzh/rubert-tiny-turbo) in quality while running roughly 1.4× faster on CPU and 1.2× faster on GPU.
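The card does not spell out the distillation objective. A common setup for embedding distillation is to regress the student's embeddings onto (a projection of) the teacher's; the sketch below illustrates that idea only and is not the author's actual training code. The dimensions and the fixed random projection `W` are assumptions (BAAI/bge-m3 emits 1024-dimensional embeddings, this student emits 256-dimensional ones):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dims: teacher (e.g. BAAI/bge-m3) emits 1024-dim embeddings,
# the student emits 256-dim; W is a stand-in projection into student space.
W = rng.standard_normal((1024, 256)) / np.sqrt(1024)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def distill_loss(student_emb, teacher_emb):
    """MSE between L2-normalized student and projected teacher embeddings."""
    target = l2_normalize(teacher_emb @ W)
    return float(np.mean((l2_normalize(student_emb) - target) ** 2))

# Toy batch: a student that exactly matches the projected teacher gets zero loss
teacher = rng.standard_normal((8, 1024))
print(distill_loss(teacher @ W, teacher))  # -> 0.0
```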
## Usage

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('sergeyzh/rubert-tiny-lite')

sentences = ["привет мир", "hello world", "здравствуй вселенная"]
embeddings = model.encode(sentences)

print(model.similarity(embeddings, embeddings))
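`model.similarity` defaults to cosine similarity between the embeddings. A minimal NumPy sketch of the same computation, using toy 4-dimensional vectors in place of the model's real 256-dimensional outputs:

```python
import numpy as np

# Toy "embeddings" standing in for the model's 256-dim outputs
embeddings = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Cosine similarity: L2-normalize rows, then take the dot-product matrix
norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
sim = norm @ norm.T

print(np.round(sim, 3))
```

The diagonal is 1.0, near-parallel rows score close to 1, and orthogonal rows score 0, mirroring the scores `model.similarity` returns.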
## Metrics
Model scores on the [encodechka](https://github.com/avidale/encodechka) benchmark:
| model | STS | PI | NLI | SA | TI |
|:-----------------------------------|:---------|:---------|:---------|:---------|:---------|
| BAAI/bge-m3                        | 0.864    | 0.749    | 0.510    | 0.819    | 0.973    |
| intfloat/multilingual-e5-large     | 0.862    | 0.727    | 0.473    | 0.810    | 0.979    |
| **sergeyzh/rubert-tiny-lite**      | 0.839    | 0.712    | 0.488    | 0.788    | 0.949    |
| intfloat/multilingual-e5-base      | 0.835    | 0.704    | 0.459    | 0.796    | 0.964    |
| [sergeyzh/rubert-tiny-turbo](https://huggingface.co/sergeyzh/rubert-tiny-turbo) | 0.828 | 0.722 | 0.476 | 0.787 | 0.955 |
| intfloat/multilingual-e5-small     | 0.822    | 0.714    | 0.457    | 0.758    | 0.957    |
| cointegrated/rubert-tiny2          | 0.750    | 0.651    | 0.417    | 0.737    | 0.937    |
Model scores on the [ruMTEB](https://habr.com/ru/companies/sberdevices/articles/831150/) benchmark:
|Model Name | Metric | rubert-tiny2 | [rubert-tiny-turbo](https://huggingface.co/sergeyzh/rubert-tiny-turbo) | rubert-tiny-lite | multilingual-e5-small | multilingual-e5-base | multilingual-e5-large |
|:----------------------------------|:--------------------|----------------:|------------------:|------------------:|----------------------:|---------------------:|----------------------:|
|CEDRClassification                 | Accuracy            | 0.369           | 0.390             | 0.407             | 0.401                 | 0.423                | **0.448**             |
|GeoreviewClassification            | Accuracy            | 0.396           | 0.414             | 0.423             | 0.447                 | 0.461                | **0.497**             |
|GeoreviewClusteringP2P             | V-measure           | 0.442           | 0.597             | **0.611**         | 0.586                 | 0.545                | 0.605                 |
|HeadlineClassification             | Accuracy            | 0.742           | 0.686             | 0.652             | 0.732                 | 0.757                | **0.758**             |
|InappropriatenessClassification    | Accuracy            | 0.586           | 0.591             | 0.588             | 0.592                 | 0.588                | **0.616**             |
|KinopoiskClassification            | Accuracy            | 0.491           | 0.505             | 0.507             | 0.500                 | 0.509                | **0.566**             |
|RiaNewsRetrieval                   | NDCG@10             | 0.140           | 0.513             | 0.617             | 0.700                 | 0.702                | **0.807**             |
|RuBQReranking                      | MAP@10              | 0.461           | 0.622             | 0.631             | 0.715                 | 0.720                | **0.756**             |
|RuBQRetrieval                      | NDCG@10             | 0.109           | 0.517             | 0.511             | 0.685                 | 0.696                | **0.741**             |
|RuReviewsClassification            | Accuracy            | 0.570           | 0.607             | 0.615             | 0.612                 | 0.630                | **0.653**             |
|RuSTSBenchmarkSTS                  | Pearson correlation | 0.694           | 0.787             | 0.799             | 0.781                 | 0.796                | **0.831**             |
|RuSciBenchGRNTIClassification      | Accuracy            | 0.456           | 0.529             | 0.544             | 0.550                 | 0.563                | **0.582**             |
|RuSciBenchGRNTIClusteringP2P       | V-measure           | 0.414           | 0.481             | 0.510             | 0.511                 | 0.516                | **0.520**             |
|RuSciBenchOECDClassification       | Accuracy            | 0.355           | 0.415             | 0.424             | 0.427                 | 0.423                | **0.445**             |
|RuSciBenchOECDClusteringP2P        | V-measure           | 0.381           | 0.411             | 0.438             | 0.443                 | 0.448                | **0.450**             |
|SensitiveTopicsClassification      | Accuracy            | 0.220           | 0.244             | **0.282**         | 0.228                 | 0.234                | 0.257                 |
|TERRaClassification                | Average Precision   | 0.519           | 0.563             | 0.574             | 0.551                 | 0.550                | **0.584**             |
|Model Name | Metric | rubert-tiny2 | [rubert-tiny-turbo](https://huggingface.co/sergeyzh/rubert-tiny-turbo) | rubert-tiny-lite | multilingual-e5-small | multilingual-e5-base | multilingual-e5-large |
|:----------------------------------|:--------------------|----------------:|------------------:|------------------:|----------------------:|---------------------:|----------------------:|
|Classification                     | Accuracy            | 0.514           | 0.535             | 0.536             | 0.551                 | 0.561                | **0.588**             |
|Clustering                         | V-measure           | 0.412           | 0.496             | 0.520             | 0.513                 | 0.503                | **0.525**             |
|MultiLabelClassification           | Accuracy            | 0.294           | 0.317             | 0.344             | 0.314                 | 0.329                | **0.353**             |
|PairClassification                 | Average Precision   | 0.519           | 0.563             | 0.574             | 0.551                 | 0.550                | **0.584**             |
|Reranking                          | MAP@10              | 0.461           | 0.622             | 0.631             | 0.715                 | 0.720                | **0.756**             |
|Retrieval                          | NDCG@10             | 0.124           | 0.515             | 0.564             | 0.697                 | 0.699                | **0.774**             |
|STS                                | Pearson correlation | 0.694           | 0.787             | 0.799             | 0.781                 | 0.796                | **0.831**             |
|Average                            | Average             | 0.431           | 0.548             | 0.567             | 0.588                 | 0.594                | **0.630**             |