Add new SentenceTransformer model
- README.md +81 -80
- model.safetensors +1 -1

README.md CHANGED

@@ -4,7 +4,7 @@ tags:
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
-- dataset_size:
+- dataset_size:10356
 - loss:MultipleNegativesRankingLoss
 base_model: intfloat/multilingual-e5-large
 widget:
@@ -76,76 +76,76 @@ model-index:
 type: unknown
 metrics:
 - type: cosine_accuracy@1
-value: 0.
+value: 0.9073359073359073
 name: Cosine Accuracy@1
 - type: cosine_accuracy@2
-value: 0.
+value: 0.9739382239382239
 name: Cosine Accuracy@2
 - type: cosine_accuracy@5
-value: 0.
+value: 0.9942084942084942
 name: Cosine Accuracy@5
 - type: cosine_accuracy@10
-value: 0.
+value: 0.999034749034749
 name: Cosine Accuracy@10
 - type: cosine_accuracy@100
-value: 0
+value: 1.0
 name: Cosine Accuracy@100
 - type: cosine_precision@1
-value: 0.
+value: 0.9073359073359073
 name: Cosine Precision@1
 - type: cosine_precision@2
-value: 0.
+value: 0.48696911196911197
 name: Cosine Precision@2
 - type: cosine_precision@5
-value: 0.
+value: 0.19884169884169883
 name: Cosine Precision@5
 - type: cosine_precision@10
-value: 0.
+value: 0.0999034749034749
 name: Cosine Precision@10
 - type: cosine_precision@100
-value: 0.
+value: 0.010000000000000002
 name: Cosine Precision@100
 - type: cosine_recall@1
-value: 0.
+value: 0.9073359073359073
 name: Cosine Recall@1
 - type: cosine_recall@2
-value: 0.
+value: 0.9739382239382239
 name: Cosine Recall@2
 - type: cosine_recall@5
-value: 0.
+value: 0.9942084942084942
 name: Cosine Recall@5
 - type: cosine_recall@10
-value: 0.
+value: 0.999034749034749
 name: Cosine Recall@10
 - type: cosine_recall@100
-value: 0
+value: 1.0
 name: Cosine Recall@100
 - type: cosine_ndcg@10
-value: 0.
+value: 0.9601842774877813
 name: Cosine Ndcg@10
 - type: cosine_mrr@1
-value: 0.
+value: 0.9073359073359073
 name: Cosine Mrr@1
 - type: cosine_mrr@2
-value: 0.
+value: 0.9406370656370656
 name: Cosine Mrr@2
 - type: cosine_mrr@5
-value: 0.
+value: 0.9462837837837839
 name: Cosine Mrr@5
 - type: cosine_mrr@10
-value: 0.
+value: 0.946988570202856
 name: Cosine Mrr@10
 - type: cosine_mrr@100
-value: 0.
+value: 0.9470763202906061
 name: Cosine Mrr@100
 - type: cosine_map@100
-value: 0.
+value: 0.9470763202906061
 name: Cosine Map@100
 ---
 
 # SentenceTransformer based on intfloat/multilingual-e5-large
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large)
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
@@ -155,8 +155,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [i
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 1024 dimensions
 - **Similarity Function:** Cosine Similarity
-- **Training Dataset:**
-    - [word_embedding](https://huggingface.co/datasets/meandyou200175/word_embedding)
+<!-- - **Training Dataset:** Unknown -->
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
 
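
For context, the card above describes a standard sentence-transformers checkpoint: 1024-dimensional embeddings, 512-token inputs, cosine similarity. A minimal usage sketch, assuming sentence-transformers v3+ is installed; the repository id below is a placeholder, since the actual id is not part of this diff:

```python
# Minimal usage sketch (assumes sentence-transformers v3+).
# "your-username/your-finetuned-e5" is a placeholder repository id.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("your-username/your-finetuned-e5")

queries = ["how do I reset my password?"]
passages = [
    "To reset your password, open Settings and choose 'Reset password'.",
    "Our office is closed on public holidays.",
]

# encode() returns one 1024-dimensional vector per input string
query_emb = model.encode(queries)
passage_emb = model.encode(passages)

# Cosine similarity, the similarity function declared in the model card
scores = model.similarity(query_emb, passage_emb)  # shape: (1, 2)
print(scores)
```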
@@ -242,28 +241,28 @@ You can finetune this model on your own dataset.
 
 | Metric | Value |
 |:---------------------|:-----------|
-| cosine_accuracy@1 | 0.
-| cosine_accuracy@2 | 0.
-| cosine_accuracy@5 | 0.
-| cosine_accuracy@10 | 0.
-| cosine_accuracy@100 | 0
-| cosine_precision@1 | 0.
-| cosine_precision@2 | 0.
-| cosine_precision@5 | 0.
-| cosine_precision@10 | 0.
-| cosine_precision@100 | 0.
-| cosine_recall@1 | 0.
-| cosine_recall@2 | 0.
-| cosine_recall@5 | 0.
-| cosine_recall@10 | 0.
-| cosine_recall@100 | 0
-| **cosine_ndcg@10** | **0.
-| cosine_mrr@1 | 0.
-| cosine_mrr@2 | 0.
-| cosine_mrr@5 | 0.
-| cosine_mrr@10 | 0.
-| cosine_mrr@100 | 0.
-| cosine_map@100 | 0.
+| cosine_accuracy@1 | 0.9073 |
+| cosine_accuracy@2 | 0.9739 |
+| cosine_accuracy@5 | 0.9942 |
+| cosine_accuracy@10 | 0.999 |
+| cosine_accuracy@100 | 1.0 |
+| cosine_precision@1 | 0.9073 |
+| cosine_precision@2 | 0.487 |
+| cosine_precision@5 | 0.1988 |
+| cosine_precision@10 | 0.0999 |
+| cosine_precision@100 | 0.01 |
+| cosine_recall@1 | 0.9073 |
+| cosine_recall@2 | 0.9739 |
+| cosine_recall@5 | 0.9942 |
+| cosine_recall@10 | 0.999 |
+| cosine_recall@100 | 1.0 |
+| **cosine_ndcg@10** | **0.9602** |
+| cosine_mrr@1 | 0.9073 |
+| cosine_mrr@2 | 0.9406 |
+| cosine_mrr@5 | 0.9463 |
+| cosine_mrr@10 | 0.947 |
+| cosine_mrr@100 | 0.9471 |
+| cosine_map@100 | 0.9471 |
 
 <!--
 ## Bias, Risks and Limitations
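
The metric names in the table above (accuracy@k, precision@k, recall@k, MRR, NDCG and MAP under cosine similarity) are the ones reported by sentence-transformers' InformationRetrievalEvaluator. A sketch of how numbers of this shape can be produced on toy data; the actual evaluation queries and corpus behind this card are not shown in the diff:

```python
# Sketch: computing cosine_* retrieval metrics with InformationRetrievalEvaluator.
# The queries/corpus/relevant_docs below are illustrative toy data.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("your-username/your-finetuned-e5")  # placeholder id

queries = {"q1": "how do I reset my password?"}  # query id -> text
corpus = {
    "d1": "Open Settings and choose 'Reset password'.",
    "d2": "Our office is closed on public holidays.",
}  # doc id -> text
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant doc ids

evaluator = InformationRetrievalEvaluator(
    queries,
    corpus,
    relevant_docs,
    accuracy_at_k=[1, 2],          # the card reports k up to 100 on a larger corpus
    precision_recall_at_k=[1, 2],
    mrr_at_k=[2],
    ndcg_at_k=[2],
    map_at_k=[2],
)
results = evaluator(model)  # recent versions return a dict of metric name -> value
print(results)
```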
@@ -281,10 +280,9 @@ You can finetune this model on your own dataset.
 
 ### Training Dataset
 
-####
+#### Unnamed Dataset
 
-*
-* Size: 9,316 training samples
+* Size: 10,356 training samples
 * Columns: <code>query</code> and <code>positive</code>
 * Approximate statistics based on the first 1000 samples:
 | | query | positive |
@@ -467,35 +465,38 @@ You can finetune this model on your own dataset.
 | Epoch | Step | Training Loss | Validation Loss | cosine_ndcg@10 |
 |:------:|:----:|:-------------:|:---------------:|:--------------:|
 | -1 | -1 | - | - | 0.7166 |
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-
-| 2.
-| 2.
-| 2.
-| 2.
-| 2.
-
-
-| 3.
-| 3.
-| 3.
-| 3.
-
-
-| 4.
-| 4.
-| 4.
-| 4.
+| 0.1543 | 100 | 0.9191 | - | - |
+| 0.3086 | 200 | 0.1876 | - | - |
+| 0.4630 | 300 | 0.1547 | - | - |
+| 0.6173 | 400 | 0.1556 | - | - |
+| 0.7716 | 500 | 0.179 | - | - |
+| 0.9259 | 600 | 0.1234 | - | - |
+| 1.0802 | 700 | 0.087 | - | - |
+| 1.2346 | 800 | 0.0576 | - | - |
+| 1.3889 | 900 | 0.0564 | - | - |
+| 1.5432 | 1000 | 0.0583 | 0.0271 | 0.9198 |
+| 1.6975 | 1100 | 0.0764 | - | - |
+| 1.8519 | 1200 | 0.0493 | - | - |
+| 2.0062 | 1300 | 0.0481 | - | - |
+| 2.1605 | 1400 | 0.0222 | - | - |
+| 2.3148 | 1500 | 0.0234 | - | - |
+| 2.4691 | 1600 | 0.0283 | - | - |
+| 2.6235 | 1700 | 0.0236 | - | - |
+| 2.7778 | 1800 | 0.026 | - | - |
+| 2.9321 | 1900 | 0.0217 | - | - |
+| 3.0864 | 2000 | 0.0193 | 0.0061 | 0.9534 |
+| 3.2407 | 2100 | 0.0135 | - | - |
+| 3.3951 | 2200 | 0.0162 | - | - |
+| 3.5494 | 2300 | 0.0109 | - | - |
+| 3.7037 | 2400 | 0.0107 | - | - |
+| 3.8580 | 2500 | 0.0105 | - | - |
+| 4.0123 | 2600 | 0.0095 | - | - |
+| 4.1667 | 2700 | 0.0146 | - | - |
+| 4.3210 | 2800 | 0.0102 | - | - |
+| 4.4753 | 2900 | 0.0108 | - | - |
+| 4.6296 | 3000 | 0.01 | 0.0061 | 0.9602 |
+| 4.7840 | 3100 | 0.008 | - | - |
+| 4.9383 | 3200 | 0.0117 | - | - |
 
 
 ### Framework Versions
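
The training log above corresponds to a MultipleNegativesRankingLoss run over query/positive pairs for roughly five epochs. A hedged sketch of how such a run is typically set up with the sentence-transformers v3 Trainer API; the dataset rows, batch size and other hyperparameters here are illustrative, since the exact values are not part of the hunks shown:

```python
# Sketch: MultipleNegativesRankingLoss finetuning of intfloat/multilingual-e5-large
# on (query, positive) pairs, as described in the card. Hyperparameters are illustrative.
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("intfloat/multilingual-e5-large")

# The card reports 10,356 training samples with "query" and "positive" columns.
train_dataset = Dataset.from_dict({
    "query": ["how do I reset my password?"],
    "positive": ["Open Settings and choose 'Reset password'."],
})

loss = MultipleNegativesRankingLoss(model)  # ranks the paired positive above in-batch negatives

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",
    num_train_epochs=5,              # the log above runs to roughly epoch 5
    per_device_train_batch_size=16,  # illustrative; not stated in the hunks shown
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```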

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8a156fd71ddd697aac4e8c39e7714f28d7852c7d782f6b21c821d0f433937f07
 size 2239607176
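
model.safetensors itself is tracked with Git LFS, so the committed file is a pointer that records the blob's SHA-256 and byte size. A small sketch for checking a locally downloaded copy against those values; the local path is illustrative:

```python
# Sketch: verify a downloaded model.safetensors against the LFS pointer above.
import hashlib
import os

path = "model.safetensors"  # illustrative local path
expected_oid = "8a156fd71ddd697aac4e8c39e7714f28d7852c7d782f6b21c821d0f433937f07"
expected_size = 2239607176

h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
        h.update(chunk)

print("size matches:  ", os.path.getsize(path) == expected_size)
print("sha256 matches:", h.hexdigest() == expected_oid)
```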