Add new SentenceTransformer model
- 1_Dense/model.safetensors +1 -1
- README.md +46 -44
- model.safetensors +1 -1
1_Dense/model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 131160
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8f43f95060742094f81fc04580f6668f39a89e0dfd952b4d9bab0b8ee94e7ef5
|
3 |
size 131160
|
README.md
CHANGED
@@ -6,7 +6,7 @@ tags:
|
|
6 |
- sentence-similarity
|
7 |
- feature-extraction
|
8 |
- generated_from_trainer
|
9 |
-
- dataset_size:
|
10 |
- loss:Contrastive
|
11 |
base_model: jhu-clsp/ettin-encoder-17m
|
12 |
pipeline_tag: sentence-similarity
|
@@ -24,7 +24,7 @@ model-index:
|
|
24 |
type: unknown
|
25 |
metrics:
|
26 |
- type: accuracy
|
27 |
-
value: 0.
|
28 |
name: Accuracy
|
29 |
---
|
30 |
|
@@ -216,9 +216,9 @@ You can finetune this model on your own dataset.
|
|
216 |
|
217 |
* Evaluated with <code>pylate.evaluation.colbert_triplet.ColBERTTripletEvaluator</code>
|
218 |
|
219 |
-
| Metric | Value
|
220 |
-
|
221 |
-
| **accuracy** | **0.
|
222 |
|
223 |
<!--
|
224 |
## Bias, Risks and Limitations
|
@@ -239,19 +239,19 @@ You can finetune this model on your own dataset.
|
|
239 |
#### Unnamed Dataset
|
240 |
|
241 |
|
242 |
-
* Size:
|
243 |
* Columns: <code>query</code>, <code>positive</code>, <code>negative_1</code>, <code>negative_2</code>, and <code>negative_3</code>
|
244 |
* Approximate statistics based on the first 1000 samples:
|
245 |
| | query | positive | negative_1 | negative_2 | negative_3 |
|
246 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
|
247 |
| type | string | string | string | string | string |
|
248 |
-
| details | <ul><li>min:
|
249 |
* Samples:
|
250 |
-
| query
|
251 |
-
|
252 |
-
| <code>
|
253 |
-
| <code>
|
254 |
-
| <code>average
|
255 |
* Loss: <code>pylate.losses.contrastive.Contrastive</code>
|
256 |
|
257 |
### Evaluation Dataset
|
@@ -262,29 +262,30 @@ You can finetune this model on your own dataset.
|
|
262 |
* Size: 20,000 evaluation samples
|
263 |
* Columns: <code>query</code>, <code>positive</code>, <code>negative_1</code>, <code>negative_2</code>, and <code>negative_3</code>
|
264 |
* Approximate statistics based on the first 1000 samples:
|
265 |
-
| | query | positive | negative_1
|
266 |
-
|
267 |
-
| type | string | string | string
|
268 |
-
| details | <ul><li>min:
|
269 |
* Samples:
|
270 |
-
| query
|
271 |
-
|
272 |
-
| <code>
|
273 |
-
| <code>
|
274 |
-
| <code>
|
275 |
* Loss: <code>pylate.losses.contrastive.Contrastive</code>
|
276 |
|
277 |
### Training Hyperparameters
|
278 |
#### Non-Default Hyperparameters
|
279 |
|
280 |
- `eval_strategy`: epoch
|
281 |
-
- `per_device_train_batch_size`:
|
282 |
-
- `per_device_eval_batch_size`:
|
283 |
-
- `learning_rate`:
|
284 |
- `num_train_epochs`: 4
|
285 |
- `lr_scheduler_type`: cosine
|
286 |
- `warmup_ratio`: 0.05
|
287 |
- `fp16`: True
|
|
|
288 |
- `push_to_hub`: True
|
289 |
|
290 |
#### All Hyperparameters
|
@@ -294,14 +295,14 @@ You can finetune this model on your own dataset.
|
|
294 |
- `do_predict`: False
|
295 |
- `eval_strategy`: epoch
|
296 |
- `prediction_loss_only`: True
|
297 |
-
- `per_device_train_batch_size`:
|
298 |
-
- `per_device_eval_batch_size`:
|
299 |
- `per_gpu_train_batch_size`: None
|
300 |
- `per_gpu_eval_batch_size`: None
|
301 |
- `gradient_accumulation_steps`: 1
|
302 |
- `eval_accumulation_steps`: None
|
303 |
- `torch_empty_cache_steps`: None
|
304 |
-
- `learning_rate`:
|
305 |
- `weight_decay`: 0.0
|
306 |
- `adam_beta1`: 0.9
|
307 |
- `adam_beta2`: 0.999
|
@@ -347,7 +348,7 @@ You can finetune this model on your own dataset.
|
|
347 |
- `disable_tqdm`: False
|
348 |
- `remove_unused_columns`: True
|
349 |
- `label_names`: None
|
350 |
-
- `load_best_model_at_end`:
|
351 |
- `ignore_data_skip`: False
|
352 |
- `fsdp`: []
|
353 |
- `fsdp_min_num_params`: 0
|
@@ -409,22 +410,23 @@ You can finetune this model on your own dataset.
|
|
409 |
</details>
|
410 |
|
411 |
### Training Logs
|
412 |
-
| Epoch
|
413 |
-
|
414 |
-
| 1.0
|
415 |
-
| 0
|
416 |
-
| 1.0
|
417 |
-
| 2.0
|
418 |
-
| 0
|
419 |
-
| 2.0
|
420 |
-
| 3.0
|
421 |
-
| 0
|
422 |
-
| 3.0
|
423 |
-
| 4.0
|
424 |
-
| 0
|
425 |
-
| 4.0
|
426 |
-
| 0
|
427 |
-
|
|
|
428 |
|
429 |
### Framework Versions
|
430 |
- Python: 3.11.11
|
|
|
6 |
- sentence-similarity
|
7 |
- feature-extraction
|
8 |
- generated_from_trainer
|
9 |
+
- dataset_size:972246
|
10 |
- loss:Contrastive
|
11 |
base_model: jhu-clsp/ettin-encoder-17m
|
12 |
pipeline_tag: sentence-similarity
|
|
|
24 |
type: unknown
|
25 |
metrics:
|
26 |
- type: accuracy
|
27 |
+
value: 0.7980000376701355
|
28 |
name: Accuracy
|
29 |
---
|
30 |
|
|
|
216 |
|
217 |
* Evaluated with <code>pylate.evaluation.colbert_triplet.ColBERTTripletEvaluator</code>
|
218 |
|
219 |
+
| Metric | Value |
|
220 |
+
|:-------------|:----------|
|
221 |
+
| **accuracy** | **0.798** |
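A minimal sketch of how a triplet accuracy like the one above could be computed. It assumes the usual ColBERT late-interaction (MaxSim) score and that accuracy is the fraction of triplets where the positive outscores the negative. The function names and the toy 2-d embeddings are hypothetical and not taken from `pylate`. The sketch uses one negative per triplet, and how the evaluator treats the three negative columns is not shown in this diff.

```python
def maxsim(query_tokens, doc_tokens):
    """Late-interaction score: for each query token, take its best dot
    product against any document token, then sum over query tokens."""
    return sum(
        max(sum(q * d for q, d in zip(q_vec, d_vec)) for d_vec in doc_tokens)
        for q_vec in query_tokens
    )


def triplet_accuracy(triplets):
    """Fraction of (query, positive, negative) triplets in which the
    positive document scores strictly higher than the negative one."""
    hits = sum(maxsim(q, pos) > maxsim(q, neg) for q, pos, neg in triplets)
    return hits / len(triplets)


# Toy token embeddings, made up purely for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
close_doc = [[0.9, 0.1], [0.1, 0.9]]
far_doc = [[-1.0, 0.0], [0.0, -1.0]]

# The first triplet is ranked correctly, the second has its roles swapped.
toy = [(query, close_doc, far_doc), (query, far_doc, close_doc)]
print(triplet_accuracy(toy))
```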
|
222 |
|
223 |
<!--
|
224 |
## Bias, Risks and Limitations
|
|
|
239 |
#### Unnamed Dataset
|
240 |
|
241 |
|
242 |
+
* Size: 972,246 training samples
|
243 |
* Columns: <code>query</code>, <code>positive</code>, <code>negative_1</code>, <code>negative_2</code>, and <code>negative_3</code>
|
244 |
* Approximate statistics based on the first 1000 samples:
|
245 |
| | query | positive | negative_1 | negative_2 | negative_3 |
|
246 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
|
247 |
| type | string | string | string | string | string |
|
248 |
+
| details | <ul><li>min: 5 tokens</li><li>mean: 10.04 tokens</li><li>max: 22 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 31.92 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 31.92 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 31.92 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 22 tokens</li><li>mean: 31.94 tokens</li><li>max: 32 tokens</li></ul> |
|
249 |
* Samples:
|
250 |
+
| query | positive | negative_1 | negative_2 | negative_3 |
|
251 |
+
|:------|:---------|:-----------|:-----------|:-----------|
|
252 |
+
| <code>what diseases does sugar cause</code> | <code>Eating too much sugar raises your risk for gaining weight and the health problems that are associated with being overweight. You are more likely to suffer diabetes, heart disease, high blood pressure, cancer and many other health conditions when you indulge your sweet tooth too often.Table sugar isn’t the only culprit when it comes to sugar.ou are more likely to suffer diabetes, heart disease, high blood pressure, cancer and many other health conditions when you indulge your sweet tooth too often. Table sugar isn’t the only culprit when it comes to sugar.</code> | <code>Sugar and Heart Disease. Lately, we’ve seen epidemiological data suggesting that increased intake of sugar-sweetened beverages increases the risk for metabolic syndrome, type 2 diabetes, coronary heart disease, and stroke (5).ugar and Heart Disease. Lately, we’ve seen epidemiological data suggesting that increased intake of sugar-sweetened beverages increases the risk for metabolic syndrome, type 2 diabetes, coronary heart disease, and stroke (5).</code> | <code>High intake of sugar and refined carbohydrates is associated with increased risk of diabetes, metabolic syndrome, non-alcoholic fatty liver disease, lipid disorders and high blood pressure.ugar and Heart Disease. Lately, we’ve seen epidemiological data suggesting that increased intake of sugar-sweetened beverages increases the risk for metabolic syndrome, type 2 diabetes, coronary heart disease, and stroke (5).</code> | <code>Sugar Disease is a problem that manifests in different ways in different individuals, of different ages and of different genetic susceptibility-but its three cardinal forms are:he glycemic index: key to diet for Sugar Disease. Some have proposed that persons with variants of Sugar Disease follow a diet that rigidly excludes carbohydrates, concentrating instead on meat and vegetables. In my opinion this is rarely necessary and results in dietary imbalances.</code> |
|
253 |
+
| <code>what diseases does sugar cause</code> | <code>Eating too much sugar raises your risk for gaining weight and the health problems that are associated with being overweight. You are more likely to suffer diabetes, heart disease, high blood pressure, cancer and many other health conditions when you indulge your sweet tooth too often.Table sugar isn’t the only culprit when it comes to sugar.ou are more likely to suffer diabetes, heart disease, high blood pressure, cancer and many other health conditions when you indulge your sweet tooth too often. Table sugar isn’t the only culprit when it comes to sugar.</code> | <code>Another mechanism whereby sugar consumption may increase the risk of cardiovascular disease is through its effects on blood pressure. It is well known that high blood pressure increases the risk for cardiovascular disease.ugar and Heart Disease. Lately, we’ve seen epidemiological data suggesting that increased intake of sugar-sweetened beverages increases the risk for metabolic syndrome, type 2 diabetes, coronary heart disease, and stroke (5).</code> | <code>The term Sugar Disease is a convenient catch-all for a host of modern conditions that result from an unbridled intake of sugar or refined carbohydrates coupled with a sedentary lifestyle.he glycemic index: key to diet for Sugar Disease. Some have proposed that persons with variants of Sugar Disease follow a diet that rigidly excludes carbohydrates, concentrating instead on meat and vegetables. In my opinion this is rarely necessary and results in dietary imbalances.</code> | <code>Brown sugar is just sucrose with molasses – same basic composition. Glucose, or blood sugar, is the sugar that circulates in your blood. Fructose, or fruit sugar, is found in plants and honey. It’s the fructose in sugar that causes the problem, as you will see.That doesn’t mean you shouldn’t eat whole fruit; whole fruit contains fiber that slows digestion.It does mean that fruit juice poses a danger.t’s not a fact that sugar causes cancer, and Lustig does not become an absolute authority by virtue of his lecture going viral. It’s not the way medical science works, or should work: web virality implies popularity, not truth.</code> |
|
254 |
+
| <code>average cost per square foot to build a house</code> | <code>Generally, turnkey costs will start at around $70 a square foot for a starter home in ideal conditions. An average level of finish will be more like $90-100. At $110-120 we are building a custom finish.</code> | <code>Home prices of $55, $66, $72, $80, $84, $92, $110, $118, and $328 per square foot combine to produce an average of $112 per square foot, which is probably a reasonable figure for many areas of the country. However, the difference between the lowest figure and the highest is very substantial.</code> | <code>An average commercial steel building costs between $16 and $20 per square foot, including building package (I-Beams, purlins, girts etc.) , delivery, foundation and the cost of construction.</code> | <code>At first glance, the average home building cost per square foot seems extremely high. People who do a lot of home improvement jobs are usually the first ones to question, because they know the cost of materials. The fact is that the materials are about 25-33 percent of the cost of a house.</code> |
|
255 |
* Loss: <code>pylate.losses.contrastive.Contrastive</code>
|
256 |
|
257 |
### Evaluation Dataset
|
|
|
262 |
* Size: 20,000 evaluation samples
|
263 |
* Columns: <code>query</code>, <code>positive</code>, <code>negative_1</code>, <code>negative_2</code>, and <code>negative_3</code>
|
264 |
* Approximate statistics based on the first 1000 samples:
|
265 |
+
| | query | positive | negative_1 | negative_2 | negative_3 |
|
266 |
+
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
|
267 |
+
| type | string | string | string | string | string |
|
268 |
+
| details | <ul><li>min: 5 tokens</li><li>mean: 10.12 tokens</li><li>max: 20 tokens</li></ul> | <ul><li>min: 28 tokens</li><li>mean: 31.99 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 31.9 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 31.95 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 31.92 tokens</li><li>max: 32 tokens</li></ul> |
|
269 |
* Samples:
|
270 |
+
| query | positive | negative_1 | negative_2 | negative_3 |
|
271 |
+
|:--------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
272 |
+
| <code>what are the dimensions of a regulation nba backboard?</code> | <code>What is the size of a basketball backboard? Per NBA regulations, the dimensions of a basketball backboard are 6 feet wide by 3 ½ feet high. The backboard is marked with a 2-inch white rectangle that is centered behind the ring; the rectangle's outer dimensions are 24 inches wide by 18 inches high. The basket itself consists of a metal ring with an 18-inch inner diameter and a white cord net that is 15 to 18 inches long.</code> | <code>The Backboard and Rim: The regulation distance from the ground to the top of the rim is 10 feet for all levels of play. Regulation backboards are 6 feet wide (72 inches) by 42 inches tall. All basketball rims (hoops) are 18 inches in diameter. The inner square on the backboard is 24 inches wide by 18 inches tall. All line markings on the floor are 2 inches wide and can vary in color.</code> | <code>72 Backboard (Regulation) 72 x 42 backboard systems are the best choice if you're a serious basketball enthusiast or a former high school or college competitor. 1 The official, regulation size used in high school, college, and NBA competition. Ideal system for a dedicated backyard court or a large 3 car garage driveway.</code> | <code>Each basket ring shall be securely attached to the backboard with its upper edge 10' above and parallel to the floor and equidistant from the vertical edges of the board. The nearest point of the inside edge of the ring shall be 6 from the plane of the face of the board.</code> |
|
273 |
+
| <code>what are the dimensions of a regulation nba backboard?</code> | <code>What is the size of a basketball backboard? Per NBA regulations, the dimensions of a basketball backboard are 6 feet wide by 3 ½ feet high. The backboard is marked with a 2-inch white rectangle that is centered behind the ring; the rectangle's outer dimensions are 24 inches wide by 18 inches high. The basket itself consists of a metal ring with an 18-inch inner diameter and a white cord net that is 15 to 18 inches long.</code> | <code>72 Backboard (Regulation) 72 x 42 backboard systems are the best choice if you're a serious basketball enthusiast or a former high school or college competitor. The official, regulation size used in high school, college, and NBA competition.</code> | <code>What are the standard measurements for a basketball backboard? The standard size for a basketball backboard is 6 feet wide and 3 1/2 feet high. The surface should be flat and usually there is a 2-inch thick rectangle above the rim that is 24 inches wide and 18 inches high.</code> | <code>The areas identified by the lane space markings are 2 by 8 inches and the neutral zone marks are 12 by 8. c. A free throw line shall be drawn (2 wide) across each of the circles indicated in the court diagram. It shall be parallel to the end line and shall be 15' from the plane of the face of the backboard.</code> |
|
274 |
+
| <code>what is a usb receiver</code> | <code>The Logitech Unifying receiver is a miniaturised dedicated USB wireless receiver which permits up to 6 devices such as mice and keyboards (headphones are not compatible), which must be made by Logitech and of compatible design, to be linked to the same computer using 2.4 GHz band radio communication in a way very similar to, but incompatible with ...</code> | <code>Wireless USB. Wireless USB is a short-range, high-bandwidth wireless radio communication protocol created by the Wireless USB Promoter Group which intends to further increase the availability of general USB-based technologies. It is maintained by the WiMedia Alliance and (as of 2009) the current revision is 1.0, which was approved in 2005.</code> | <code>Replacement USB RF receiver for current Air Mouse Elite and Air Mouse... Replacement USB RF receiver for current Air Mouse Elite and Air Mouse GO Plus products. Enables a 100-foot (30-meter) wireless range. Before ordering for GO Plus, confirm that your mouse unit is labeled AS04130. Consider purchasing two to ensure uninterrupted Air Mouse operation.</code> | <code>VicTsing MM057 2.4G Wireless Portable Mobile Mouse Optical Mice with USB Receiver, 5 Adjustable DPI Levels, 6... by VicTsing. $ 9 99 $19.99Prime. Get it by Tomorrow, Apr 21. 50% off item with purchase of 1 items. 15% off item with purchase of 1 items. See Details.</code> |
|
275 |
* Loss: <code>pylate.losses.contrastive.Contrastive</code>
|
276 |
|
277 |
### Training Hyperparameters
|
278 |
#### Non-Default Hyperparameters
|
279 |
|
280 |
- `eval_strategy`: epoch
|
281 |
+
- `per_device_train_batch_size`: 64
|
282 |
+
- `per_device_eval_batch_size`: 64
|
283 |
+
- `learning_rate`: 8e-06
|
284 |
- `num_train_epochs`: 4
|
285 |
- `lr_scheduler_type`: cosine
|
286 |
- `warmup_ratio`: 0.05
|
287 |
- `fp16`: True
|
288 |
+
- `load_best_model_at_end`: True
|
289 |
- `push_to_hub`: True
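To make the interaction of `learning_rate`, `warmup_ratio`, and `lr_scheduler_type` concrete, here is a small sketch of a linear-warmup, half-cosine-decay schedule. It assumes the common form of this schedule and is not the trainer's actual code. The function name `lr_at` is hypothetical, and the total of 60768 steps is taken from the final row of the training logs.

```python
import math


def lr_at(step, total_steps, peak_lr=8e-06, warmup_ratio=0.05):
    """Learning rate at a given optimizer step, assuming linear warmup
    to peak_lr followed by a single half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 60768  # final step reported in the training logs
for step in (0, 1000, 15192, 30384, 45576, 60768):
    print(step, lr_at(step, total_steps))
```

Under these assumptions the rate peaks shortly after training starts and has already decayed substantially by the later epochs.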
|
290 |
|
291 |
#### All Hyperparameters
|
|
|
295 |
- `do_predict`: False
|
296 |
- `eval_strategy`: epoch
|
297 |
- `prediction_loss_only`: True
|
298 |
+
- `per_device_train_batch_size`: 64
|
299 |
+
- `per_device_eval_batch_size`: 64
|
300 |
- `per_gpu_train_batch_size`: None
|
301 |
- `per_gpu_eval_batch_size`: None
|
302 |
- `gradient_accumulation_steps`: 1
|
303 |
- `eval_accumulation_steps`: None
|
304 |
- `torch_empty_cache_steps`: None
|
305 |
+
- `learning_rate`: 8e-06
|
306 |
- `weight_decay`: 0.0
|
307 |
- `adam_beta1`: 0.9
|
308 |
- `adam_beta2`: 0.999
|
|
|
348 |
- `disable_tqdm`: False
|
349 |
- `remove_unused_columns`: True
|
350 |
- `label_names`: None
|
351 |
+
- `load_best_model_at_end`: True
|
352 |
- `ignore_data_skip`: False
|
353 |
- `fsdp`: []
|
354 |
- `fsdp_min_num_params`: 0
|
|
|
410 |
</details>
|
411 |
|
412 |
### Training Logs
|
413 |
+
| Epoch | Step | Training Loss | Validation Loss | accuracy |
|
414 |
+
|:-------:|:---------:|:-------------:|:---------------:|:--------:|
|
415 |
+
| 1.0 | 15192 | 1.3895 | - | - |
|
416 |
+
| 0 | 0 | - | - | 0.7882 |
|
417 |
+
| 1.0 | 15192 | - | 1.0425 | - |
|
418 |
+
| 2.0 | 30384 | 0.9597 | - | - |
|
419 |
+
| 0 | 0 | - | - | 0.7972 |
|
420 |
+
| 2.0 | 30384 | - | 1.0071 | - |
|
421 |
+
| **3.0** | **45576** | **0.8756** | **-** | **-** |
|
422 |
+
| 0 | 0 | - | - | 0.7979 |
|
423 |
+
| **3.0** | **45576** | **-** | **1.0083** | **-** |
|
424 |
+
| 4.0 | 60768 | 0.8355 | - | - |
|
425 |
+
| 0 | 0 | - | - | 0.7978 |
|
426 |
+
| 4.0 | 60768 | - | 1.0145 | - |
|
427 |
+
| 0 | 0 | - | - | 0.7980 |
|
428 |
+
|
429 |
+
* The bold row denotes the saved checkpoint.
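The step counts in the table can be cross-checked against the dataset size and batch size. The sketch below assumes a single device, so one optimizer step consumes one batch of 64 with `gradient_accumulation_steps` at 1. The variable names are illustrative.

```python
import math

train_samples = 972_246  # "Size: 972,246 training samples"
batch_size = 64          # per_device_train_batch_size
epochs = 4               # num_train_epochs

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

# Step index at the end of each epoch, for comparison with the "Step" column.
epoch_end_steps = [steps_per_epoch * e for e in range(1, epochs + 1)]
print(steps_per_epoch, total_steps, epoch_end_steps)
```

The computed values agree with the `Step` column, which is consistent with the single-device assumption.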
|
430 |
|
431 |
### Framework Versions
|
432 |
- Python: 3.11.11
|
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 67195976
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1429758048822d9518adc2293d213d9bef2d7e19de275343a7c9db9f34291a5b
|
3 |
size 67195976
|
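Both `model.safetensors` entries in this commit are Git LFS pointer files, not the weights themselves: three `key value` lines giving the spec version, the content hash, and the byte size. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size."""
    # Each non-empty line is "<key> <value>", split on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New-side pointer for model.safetensors, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1429758048822d9518adc2293d213d9bef2d7e19de275343a7c9db9f34291a5b
size 67195976
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])
```

Because the size is 67195976 bytes on both sides of the diff, only the `oid` line distinguishes the old weights from the new ones.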