sbintuitions/sarashina-embedding-v2-1b

@@ -18,7 +18,7 @@ inference: false
 **[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v2-1b/blob/main/README_JA.md)**
 "Sarashina-Embedding-v2-1B" is a Japanese text embedding model, based on the Japanese LLM "[Sarashina2.2-1B](https://huggingface.co/sbintuitions/sarashina2.2-1b)".
-We trained this model with multi-stage contrastive learning. We achieved the state-of-the-art average score across 28 datasets in  [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark).(Benchmarked on July 28,2025.)
 This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and other applications.
@@ -76,7 +76,7 @@ print(similarities)
 ```
 ### How to add instructions and prefixes
-For both the query and document sides, use different prefix formats. On the query side, add the prefix `task:` followed by instructions. (*Only for STS tasks: The document side should also use the same format of instruction and prefix as the query side.)
   - Query Side: ```task: {Instruction}\nquery: {Query}```
   - Document Side: ```text: {Document}```
@@ -87,8 +87,8 @@ The table below provides instruction and prefix templates for five main tasks.
 |Task|Query Side|Document Side|
 |:-:|:-|:-|
 |Retrieval<br>Reranking|task: 質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。\nquery: |text: |
-|Clustering|task: 与えられたドキュメントのトピックまたはテーマを特定してください。\nquery: |text: |
-|Classification|task: 与えられたレビューを適切な評価カテゴリに分類してください。\nquery: |text: |
 |STS|task: クエリを与えるので，もっともクエリに意味が似ている一節を探してください。\nquery: |task: クエリを与えるので，もっともクエリに意味が似ている一節を探してください。\nquery: |
 ## Training

 **[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v2-1b/blob/main/README_JA.md)**
 "Sarashina-Embedding-v2-1B" is a Japanese text embedding model, based on the Japanese LLM "[Sarashina2.2-1B](https://huggingface.co/sbintuitions/sarashina2.2-1b)".
+We trained this model with multi-stage contrastive learning. We achieved the state-of-the-art average score across 28 datasets in [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark). (Benchmarked on July 28, 2025.)
 This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and other applications.
 ```
 ### How to add instructions and prefixes
+For both the query and document sides, use different prefix formats. On the query side, add the prefix `task:` followed by instructions. (For the STS task only, both sentences are treated as queries and should be prefixed with the same instruction.)
   - Query Side: ```task: {Instruction}\nquery: {Query}```
   - Document Side: ```text: {Document}```
 |Task|Query Side|Document Side|
 |:-:|:-|:-|
 |Retrieval<br>Reranking|task: 質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。\nquery: |text: |
+|Clustering|task: 与えられたドキュメントのトピックまたはテーマを特定してください。\nquery: | - |
+|Classification|task: 与えられたレビューを適切な評価カテゴリに分類してください。\nquery: | - |
 |STS|task: クエリを与えるので，もっともクエリに意味が似ている一節を探してください。\nquery: |task: クエリを与えるので，もっともクエリに意味が似ている一節を探してください。\nquery: |
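
The prefix templates in the table can be applied with a few small string helpers. The following is a minimal sketch, not part of the model's API: the function and constant names are hypothetical, and it assumes the `\n` in the templates stands for an actual newline character. The instruction strings are copied verbatim from the table and left in Japanese because they are the literal input the templates specify.

```python
# Hypothetical helpers for illustration; these names are not defined by the model or its README.
# Instruction strings are copied verbatim from the template table.
RETRIEVAL_INSTRUCTION = "質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。"
STS_INSTRUCTION = "クエリを与えるので，もっともクエリに意味が似ている一節を探してください。"


def format_query(instruction: str, query: str) -> str:
    """Build a query-side input: task: {Instruction}, a newline, then query: {Query}."""
    # Assumption: the "\n" shown in the templates is a real newline character.
    return f"task: {instruction}\nquery: {query}"


def format_document(document: str) -> str:
    """Build a document-side input: text: {Document}."""
    return f"text: {document}"


def format_sts_pair(sentence_a: str, sentence_b: str) -> tuple[str, str]:
    """For STS, both sentences are treated as queries and share the same instruction."""
    return (
        format_query(STS_INSTRUCTION, sentence_a),
        format_query(STS_INSTRUCTION, sentence_b),
    )


if __name__ == "__main__":
    print(format_query(RETRIEVAL_INSTRUCTION, "Sarashinaのテキスト埋め込みモデルはありますか?"))
    print(format_document("更級日記は平安時代中期に書かれた回想録です。"))
```

The formatted strings are what would then be passed to the embedding model's encode step. For Clustering and Classification, where the table shows `-` on the document side, only `format_query` with the corresponding instruction would apply.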
 ## Training