---
base_model: sentence-transformers/paraphrase-MiniLM-L3-v2
datasets: []
language: []
library_name: sentence-transformers
metrics:
- cosine_accuracy
- dot_accuracy
- manhattan_accuracy
- euclidean_accuracy
- max_accuracy
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:2000
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: However, this Court will determine that there was sufficient evidence
    to sustain the jury's verdict if the evidence was "of such quality and weight
    that, having in mind the beyond a reasonable doubt burden of proof standard, reasonable
    fair-minded men in the exercise of impartial judgment might reach different conclusions
    on every element of the offense."
  sentences:
  - This Court will determine if there was enough evidence to support the jury's verdict
    by considering whether reasonable people could have reached different conclusions
    based on the evidence presented.
  - The VA psychiatrist believed that the Veteran was likely to have PTSD as a direct
    result of the attack on him during his military service in Korea.
  - The Veteran started seeing a mental health specialist at the VA on a regular basis.
- source_sentence: Under such circumstances, VA is required to prove by clear and
    unmistakable evidence that a disease or injury manifesting in service both preexisted
    service and was not aggravated by service.
  sentences:
  - The independent mental health expert offered a comprehensive account of the Veteran's
    mental health issues, service-related impairments, and previous psychiatric and
    medical treatment experiences.
  - At the trial, the prosecution failed to provide a search warrant, which was not
    explained or justified.
  - In order to establish that a disease or injury did not arise from service, VA
    must provide clear and convincing evidence that the condition existed prior to
    military service and was not exacerbated by service.
- source_sentence: Evidence of behavior changes following the claimed assault is one
    type of relevant evidence that may be found in these sources.
  sentences:
  - The independent medical clinician comprehensively documented the impact of the
    Veteran's alleged condition on their functional abilities.
  - A range of behavioral indicators, including alterations in demeanor, speech patterns,
    and physical reactions, can serve as valuable evidence in support of allegations
    of assault.
  - He claims that his mental health issues, which have been diagnosed as various
    psychiatric disorders, are a result of the trauma he experienced during his deployment
    to a combat zone in Vietnam while stationed in Japan in 1974.
- source_sentence: The court held Apple had not made the requisite showing of likelihood
    of success on the merits because it “concluded that there is some doubt as to
    the copyrightability of the programs described in this litigation.”
  sentences:
  - The trial court committed a series of errors in this case, including failing to
    instruct the jury on an essential element of felonious damage to computers, denying
    the defendant's motion to dismiss, and entering judgment on a fatally flawed indictment.
  - The court determined that Apple had not provided sufficient evidence to demonstrate
    a likelihood of success on the merits, as it had "raised some doubts about the
    copyrightability of the programs in question."
  - The Veteran believes that she should be granted service connection for chronic
    PTSD or other psychiatric disorder because she has been diagnosed with chronic
    PTSD as a result of several stressful events that occurred during her periods
    of active duty and active duty for training with the Army National Guard.
- source_sentence: In contrast, the scope of punishable conduct under the instant
    statute is limited by the individual's specified intent to "haras[s]" by communicating
    a "threat" so as to "engage in a knowing and willful course of conduct" directed
    at the victim that "alarms, torments, or terrorizes" the victim.
  sentences:
  - The scope of punishable conduct under the statute is limited to the individual's
    intent to harass by communicating a threat so as to engage in a knowing and willful
    course of conduct directed at the victim that alarms, torments, or terrorizes
    the victim.
  - The Veteran has been diagnosed with both major depressive disorder and PTSD.
  - The trial court's decision on an anti-SLAPP motion is subject to de novo review.
model-index:
- name: SentenceTransformer based on sentence-transformers/paraphrase-MiniLM-L3-v2
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: all nli dev
      type: all-nli-dev
    metrics:
    - type: cosine_accuracy
      value: 1.0
      name: Cosine Accuracy
    - type: dot_accuracy
      value: 0.0
      name: Dot Accuracy
    - type: manhattan_accuracy
      value: 1.0
      name: Manhattan Accuracy
    - type: euclidean_accuracy
      value: 1.0
      name: Euclidean Accuracy
    - type: max_accuracy
      value: 1.0
      name: Max Accuracy
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: all nli test
      type: all-nli-test
    metrics:
    - type: cosine_accuracy
      value: 1.0
      name: Cosine Accuracy
    - type: dot_accuracy
      value: 0.0
      name: Dot Accuracy
    - type: manhattan_accuracy
      value: 1.0
      name: Manhattan Accuracy
    - type: euclidean_accuracy
      value: 1.0
      name: Euclidean Accuracy
    - type: max_accuracy
      value: 1.0
      name: Max Accuracy
---

# SentenceTransformer based on sentence-transformers/paraphrase-MiniLM-L3-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-MiniLM-L3-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L3-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/paraphrase-MiniLM-L3-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L3-v2) <!-- at revision 54825a6a5a83f5d98d318ba2a11bfd31eb906f06 -->
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
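
The two modules above encode the input with the MiniLM transformer and then mean-pool the token embeddings, weighted by the attention mask. As a rough illustration of that computation, the sketch below reimplements it with the plain `transformers` API; it assumes the repository also exposes the underlying Hugging Face checkpoint (as sentence-transformers repositories normally do) and is not a drop-in replacement for the packaged modules.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "justArmenian/legal_paraphrase"  # same repository as in the usage example below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["The court denied the motion.", "The motion was denied by the court."]
batch = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state      # (batch, seq_len, 384)

# Attention-mask-aware mean pooling, as configured in the Pooling module above
mask = batch["attention_mask"].unsqueeze(-1).float()         # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)                              # torch.Size([2, 384])
```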

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("justArmenian/legal_paraphrase")
# Run inference
sentences = [
    'In contrast, the scope of punishable conduct under the instant statute is limited by the individual\'s specified intent to "haras[s]" by communicating a "threat" so as to "engage in a knowing and willful course of conduct" directed at the victim that "alarms, torments, or terrorizes" the victim.',
    "The scope of punishable conduct under the statute is limited to the individual's intent to harass by communicating a threat so as to engage in a knowing and willful course of conduct directed at the victim that alarms, torments, or terrorizes the victim.",
    'The Veteran has been diagnosed with both major depressive disorder and PTSD.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
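
Beyond pairwise similarity over a fixed list, the same embeddings can rank candidate sentences against a query, e.g. for paraphrase retrieval over legal text. A minimal sketch follows; the query and candidate sentences are illustrative and not drawn from the training data.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("justArmenian/legal_paraphrase")

# Illustrative query and candidates (not taken from the training data)
query = "The Board found the medical opinion inadequate and remanded the claim."
candidates = [
    "Because the medical opinion was deemed insufficient, the Board sent the claim back.",
    "The Veteran was discharged from active duty in 1972.",
    "The trial court's evidentiary ruling is reviewed for abuse of discretion.",
]

query_emb = model.encode([query])        # (1, 384)
cand_embs = model.encode(candidates)     # (3, 384)

# model.similarity returns a (1, len(candidates)) tensor of cosine similarities
scores = model.similarity(query_emb, cand_embs)
best = scores.argmax().item()
print(candidates[best], float(scores[0, best]))
```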

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Triplet
* Dataset: `all-nli-dev`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric             | Value   |
|:-------------------|:--------|
| cosine_accuracy    | 1.0     |
| dot_accuracy       | 0.0     |
| manhattan_accuracy | 1.0     |
| euclidean_accuracy | 1.0     |
| **max_accuracy**   | **1.0** |

#### Triplet
* Dataset: `all-nli-test`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric             | Value   |
|:-------------------|:--------|
| cosine_accuracy    | 1.0     |
| dot_accuracy       | 0.0     |
| manhattan_accuracy | 1.0     |
| euclidean_accuracy | 1.0     |
| **max_accuracy**   | **1.0** |
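
The tables above report the four distance-based accuracies plus their maximum, as produced by `TripletEvaluator` on the dev and test splits. They can be recomputed along the lines of the sketch below; the triplets shown are illustrative placeholders, since the held-out splits are not bundled with this card.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("justArmenian/legal_paraphrase")

# Illustrative triplets; the actual dev/test splits are not published with this card
anchors = ["The court denied the motion to suppress the evidence."]
positives = ["The motion to suppress the evidence was rejected by the court."]
negatives = ["The Veteran was first treated for a back condition in 1972."]

evaluator = TripletEvaluator(
    anchors=anchors,
    positives=positives,
    negatives=negatives,
    name="all-nli-dev",
)
results = evaluator(model)
print(results)  # e.g. {'all-nli-dev_cosine_accuracy': 1.0, ...}
```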


<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset


* Size: 2,000 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor                                                                             | positive                                                                          | negative                                                                          |
  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                            | string                                                                            |
  | details | <ul><li>min: 8 tokens</li><li>mean: 36.01 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 31.41 tokens</li><li>max: 99 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 31.39 tokens</li><li>max: 99 tokens</li></ul> |
* Samples:
  | anchor                                                                                                                                                                                                                                                         | positive                                                                                                                                                                                                                    | negative                                                                                                                                                                                                                    |
  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>The weight of the competent and probative medical opinions of record is against finding that the Veteran has a current diagnosis of PTSD or any other chronic acquired psychiatric disorder which is related to her military service.</code>             | <code>The weight of the credible and persuasive medical evidence on record suggests that the Veteran does not currently suffer from PTSD or any other chronic psychiatric condition related to her military service.</code> | <code>It is evident that an unauthorized physical intrusion would have been deemed a "search" under the Fourth Amendment when it was originally formulated.</code>                                                          |
  | <code>We have no doubt that such a physical intrusion would have been considered a “search” within the meaning of the Fourth Amendment when it was adopted.</code>                                                                                             | <code>It is evident that an unauthorized physical intrusion would have been deemed a "search" under the Fourth Amendment when it was originally formulated.</code>                                                          | <code>In June 1972, the Veteran's condition was assessed by the Army Medical Board, which concluded that the Veteran's back condition made him unfit for active service, leading to his discharge from the military.</code> |
  | <code>Later in June 1972, the Veteran's condition was evaluated by the Army Medical Board, where it was determined that the Veteran's back condition rendered him physically unfit for active service, and he was subsequently discharged from service.</code> | <code>In June 1972, the Veteran's condition was assessed by the Army Medical Board, which concluded that the Veteran's back condition made him unfit for active service, leading to his discharge from the military.</code> | <code>The court has granted a petition for a writ of certiorari to review a decision made by the Court of Appeal of California, Fourth Appellate District, Division One.</code>                                             |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
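
For reference, a comparable run can be set up with `SentenceTransformerTrainer`. The sketch below is a minimal illustration rather than the exact training script: the `anchor`/`positive`/`negative` column names match the dataset description above, the example rows and `output_dir` are placeholders, and the hyperparameters mirror the non-default values listed under Training Hyperparameters below.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/paraphrase-MiniLM-L3-v2")

# Illustrative (anchor, positive, negative) rows; the real training set has 2,000 of them
train_dataset = Dataset.from_dict({
    "anchor": ["We have no doubt that such a physical intrusion would have been considered a 'search'."],
    "positive": ["An unauthorized physical intrusion would have been deemed a 'search'."],
    "negative": ["The Veteran was discharged from active service in June 1972."],
})

loss = MultipleNegativesRankingLoss(model)  # defaults: scale=20.0, cosine similarity

args = SentenceTransformerTrainingArguments(
    output_dir="legal_paraphrase",              # placeholder output directory
    num_train_epochs=1,
    per_device_train_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,                                  # requires a GPU; drop on CPU
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```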

### Evaluation Dataset

#### Unnamed Dataset


* Size: 500 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor                                                                             | positive                                                                          | negative                                                                          |
  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                            | string                                                                            |
  | details | <ul><li>min: 8 tokens</li><li>mean: 35.69 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 32.11 tokens</li><li>max: 77 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 32.12 tokens</li><li>max: 77 tokens</li></ul> |
* Samples:
  | anchor                                                                                                                                                                                                                                                                                                                                                                             | positive                                                                                                                                                                                                             | negative                                                                                                                                                                                                             |
  |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>(Virginia v. Black, supra, 538 U.S. at p. 347.)</code>                                                                                                                                                                                                                                                                                                                       | <code>The Black Court asserted that the "vagueness doctrine is a safeguard against the arbitrary exercise of power by government officials."</code>                                                                  | <code>This Court will determine if there was enough evidence to support the jury's verdict by considering whether reasonable people could have reached different conclusions based on the evidence presented.</code> |
  | <code>However, this Court will determine that there was sufficient evidence to sustain the jury's verdict if the evidence was "of such quality and weight that, having in mind the beyond a reasonable doubt burden of proof standard, reasonable fair-minded men in the exercise of impartial judgment might reach different conclusions on every element of the offense."</code> | <code>This Court will determine if there was enough evidence to support the jury's verdict by considering whether reasonable people could have reached different conclusions based on the evidence presented.</code> | <code>The VA psychiatrist believed that the Veteran was likely to have PTSD as a direct result of the attack on him during his military service in Korea.</code>                                                     |
  | <code>This VA psychiatrist opined that the Veteran had PTSD more likely than not to be the direct result of the attack on him during service in Korea.</code>                                                                                                                                                                                                                      | <code>The VA psychiatrist believed that the Veteran was likely to have PTSD as a direct result of the attack on him during his military service in Korea.</code>                                                     | <code>She noted that the Veteran's greatest source of stress was caring for their adult child without any assistance.</code>                                                                                         |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch | Step | Training Loss | Validation Loss | all-nli-dev_max_accuracy | all-nli-test_max_accuracy |
|:-----:|:----:|:-------------:|:---------------:|:------------------------:|:-------------------------:|
| 0     | 0    | -             | -      | 1.0                      | -                         |
| 0.08  | 10   | 0.1402        | 0.0759 | 1.0                      | -                         |
| 0.16  | 20   | 0.0873        | 0.0726 | 1.0                      | -                         |
| 0.24  | 30   | 0.0992        | 0.0677 | 1.0                      | -                         |
| 0.32  | 40   | 0.0759        | 0.0651 | 1.0                      | -                         |
| 0.4   | 50   | 0.0355        | 0.0652 | 1.0                      | -                         |
| 0.48  | 60   | 0.0814        | 0.0666 | 1.0                      | -                         |
| 0.56  | 70   | 0.0353        | 0.0677 | 1.0                      | -                         |
| 0.64  | 80   | 0.1404        | 0.0677 | 1.0                      | -                         |
| 0.72  | 90   | 0.0336        | 0.0664 | 1.0                      | -                         |
| 0.8   | 100  | 0.0559        | 0.0661 | 1.0                      | -                         |
| 0.88  | 110  | 0.0484        | 0.0654 | 1.0                      | -                         |
| 0.96  | 120  | 0.0522        | 0.0650 | 1.0                      | -                         |
| 1.0   | 125  | -             | -      | -                        | 1.0                       |
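
With 2,000 training samples and a per-device batch size of 16, one epoch corresponds to 125 optimizer steps, matching the final row above; a `warmup_ratio` of 0.1 therefore amounts to roughly the first 12–13 steps of linear warmup.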


### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->