fix minor issues
README.md
CHANGED
@@ -18,7 +18,7 @@ inference: false
|
|
18 |
**[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v2-1b/blob/main/README_JA.md)**
|
19 |
|
20 |
"Sarashina-Embedding-v2-1B" is a Japanese text embedding model, based on the Japanese LLM "[Sarashina2.2-1B](https://huggingface.co/sbintuitions/sarashina2.2-1b)".
|
21 |
-
We trained this model with multi-stage contrastive learning. We achieved the state-of-the-art average score across 28 datasets in [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark).(Benchmarked on July 28,2025.)
|
22 |
|
23 |
This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and other applications.
|
24 |
|
@@ -76,7 +76,7 @@ print(similarities)
|
|
76 |
```
|
77 |
### How to add instructions and prefixes
|
78 |
|
79 |
-
For both the query and document sides, use different prefix formats. On the query side, add the prefix `task:` followed by instructions. (
|
80 |
|
81 |
- Query Side: ```task: {Instruction}\nquery: {Query}```
|
82 |
- Document Side: ```text: {Document}```
|
@@ -87,8 +87,8 @@ The table below provides instruction and prefix templates for five main tasks.
|
|
87 |
|Task|Query Side|Document Side|
|
88 |
|:-:|:-|:-|
|
89 |
|Retrieval<br>Reranking|task: 質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。\nquery: |text: |
|
90 |
-
|Clustering|task: 与えられたドキュメントのトピックまたはテーマを特定してください。\nquery: |
|
91 |
-
|Classification|task: 与えられたレビューを適切な評価カテゴリに分類してください。\nquery: |
|
92 |
|STS|task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |
|
93 |
|
94 |
## Training
|
|
|
18 |
**[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v2-1b/blob/main/README_JA.md)**
|
19 |
|
20 |
"Sarashina-Embedding-v2-1B" is a Japanese text embedding model, based on the Japanese LLM "[Sarashina2.2-1B](https://huggingface.co/sbintuitions/sarashina2.2-1b)".
|
21 |
+
We trained this model with multi-stage contrastive learning. It achieved the state-of-the-art average score across 28 datasets in [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark). (Benchmarked on July 28, 2025.)
|
22 |
|
23 |
This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and other applications.
|
24 |
|
|
|
76 |
```
|
77 |
### How to add instructions and prefixes
|
78 |
|
79 |
+
The query and document sides use different prefix formats. On the query side, add the prefix `task:` followed by instructions. (For the STS task only, both sentences are treated as queries and should be prefixed with the same instruction.)
|
80 |
|
81 |
- Query Side: ```task: {Instruction}\nquery: {Query}```
|
82 |
- Document Side: ```text: {Document}```
|
|
|
87 |
|Task|Query Side|Document Side|
|
88 |
|:-:|:-|:-|
|
89 |
|Retrieval<br>Reranking|task: 質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。\nquery: |text: |
|
90 |
+
|Clustering|task: 与えられたドキュメントのトピックまたはテーマを特定してください。\nquery: | - |
|
91 |
+
|Classification|task: 与えられたレビューを適切な評価カテゴリに分類してください。\nquery: | - |
|
92 |
|STS|task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |
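
The templates in the table can be assembled with plain string formatting. The following is a minimal sketch, not part of the original README: the helper names `format_query` and `format_document` and the dictionary keys are hypothetical, and it assumes that `\n` in the templates stands for a real newline character. Reranking is assumed to share the retrieval instruction, as the table groups them in one row.

```python
# Instruction strings copied verbatim from the table; the keys are
# hypothetical labels chosen for this sketch.
INSTRUCTIONS = {
    "retrieval": "質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。",
    "clustering": "与えられたドキュメントのトピックまたはテーマを特定してください。",
    "classification": "与えられたレビューを適切な評価カテゴリに分類してください。",
    "sts": "クエリを与えるので,もっともクエリに意味が似ている一節を探してください。",
}

# Tasks whose "Document Side" column is "-" in the table.
QUERY_ONLY_TASKS = {"clustering", "classification"}


def format_query(task: str, query: str) -> str:
    """Build the query-side input: the task instruction, a newline, then the query."""
    return f"task: {INSTRUCTIONS[task]}\nquery: {query}"


def format_document(task: str, document: str) -> str:
    """Build the document-side input for the given task."""
    if task in QUERY_ONLY_TASKS:
        # The table lists no document-side template for these tasks.
        raise ValueError(f"task {task!r} has no document side")
    if task == "sts":
        # For STS both sentences are treated as queries with the same instruction.
        return format_query(task, document)
    return f"text: {document}"
```

The strings returned by these helpers would then be passed to the model's encoding call in place of the raw text.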
|
93 |
|
94 |
## Training
|