Sraym committed on
Commit
5211f3c
·
verified ·
1 Parent(s): cf855b4

fix minor issues

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -18,7 +18,7 @@ inference: false
18
  **[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v2-1b/blob/main/README_JA.md)**
19
 
20
  "Sarashina-Embedding-v2-1B" is a Japanese text embedding model, based on the Japanese LLM "[Sarashina2.2-1B](https://huggingface.co/sbintuitions/sarashina2.2-1b)".
21
- We trained this model with multi-stage contrastive learning. We achieved the state-of-the-art average score across 28 datasets in [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark).(Benchmarked on July 28,2025.)
22
 
23
  This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and other applications.
24
 
@@ -76,7 +76,7 @@ print(similarities)
76
  ```
77
  ### How to add instructions and prefixes
78
 
79
- For both the query and document sides, use different prefix formats. On the query side, add the prefix `task:` followed by instructions. (*Only for STS tasks: The document side should also use the same format of instruction and prefix as the query side.)
80
 
81
  - Query Side: ```task: {Instruction}\nquery: {Query}```
82
  - Document Side: ```text: {Document}```
@@ -87,8 +87,8 @@ The table below provides instruction and prefix templates for five main tasks.
87
  |Task|Query Side|Document Side|
88
  |:-:|:-|:-|
89
  |Retrieval<br>Reranking|task: 質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。\nquery: |text: |
90
- |Clustering|task: 与えられたドキュメントのトピックまたはテーマを特定してください。\nquery: |text: |
91
- |Classification|task: 与えられたレビューを適切な評価カテゴリに分類してください。\nquery: |text: |
92
  |STS|task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |
93
 
94
  ## Training
 
18
  **[日本語のREADME/Japanese README](https://huggingface.co/sbintuitions/sarashina-embedding-v2-1b/blob/main/README_JA.md)**
19
 
20
  "Sarashina-Embedding-v2-1B" is a Japanese text embedding model, based on the Japanese LLM "[Sarashina2.2-1B](https://huggingface.co/sbintuitions/sarashina2.2-1b)".
21
+ We trained this model with multi-stage contrastive learning. We achieved the state-of-the-art average score across 28 datasets in [JMTEB](https://huggingface.co/datasets/sbintuitions/JMTEB) (Japanese Massive Text Embedding Benchmark). (Benchmarked on July 28, 2025.)
22
 
23
  This model maps sentences & paragraphs to a 1792-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and other applications.
24
 
 
76
  ```
77
  ### How to add instructions and prefixes
78
 
79
+ The query and document sides use different prefix formats. On the query side, add the prefix `task:` followed by an instruction. (For STS tasks only, both sentences are treated as queries and should be prefixed with the same instruction.)
80
 
81
  - Query Side: ```task: {Instruction}\nquery: {Query}```
82
  - Document Side: ```text: {Document}```
 
87
  |Task|Query Side|Document Side|
88
  |:-:|:-|:-|
89
  |Retrieval<br>Reranking|task: 質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。\nquery: |text: |
90
+ |Clustering|task: 与えられたドキュメントのトピックまたはテーマを特定してください。\nquery: | - |
91
+ |Classification|task: 与えられたレビューを適切な評価カテゴリに分類してください。\nquery: | - |
92
  |STS|task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |task: クエリを与えるので,もっともクエリに意味が似ている一節を探してください。\nquery: |
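
The templates in the table can be applied with plain string formatting before the text is passed to the model. The following is a minimal sketch, not part of the model's API: the helper names `format_query`, `format_document`, and `format_sts_pair` are hypothetical, and it assumes the `\n` in the templates stands for a real newline character.

```python
# Hypothetical helpers: they only build the input strings described in the
# table above and do not load or call the model.

# Instruction strings copied from the Retrieval/Reranking and STS rows.
RETRIEVAL_INSTRUCTION = "質問を与えるので、その質問に答えるのに役立つ関連文書を検索してください。"
STS_INSTRUCTION = "クエリを与えるので,もっともクエリに意味が似ている一節を探してください。"


def format_query(instruction: str, query: str) -> str:
    """Query side: 'task: {instruction}', a newline, then 'query: {query}'."""
    # Assumption: the '\n' in the template is an actual newline.
    return f"task: {instruction}\nquery: {query}"


def format_document(document: str) -> str:
    """Document side: 'text: {document}'."""
    return f"text: {document}"


def format_sts_pair(sentence_a: str, sentence_b: str) -> tuple[str, str]:
    """STS: both sentences are treated as queries with the same instruction."""
    return (
        format_query(STS_INSTRUCTION, sentence_a),
        format_query(STS_INSTRUCTION, sentence_b),
    )


if __name__ == "__main__":
    print(format_query(RETRIEVAL_INSTRUCTION, "更級日記の作者は誰ですか?"))
    print(format_document("更級日記は菅原孝標女によって書かれた。"))
```

The formatted strings would then be passed to whatever encoding call is already in use. For Clustering and Classification, the table lists no document-side prefix, so only `format_query` with the corresponding instruction applies there.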
93
 
94
  ## Training