multilingual-e5-large Legal Matryoshka
This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-large on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. Because it was trained with MatryoshkaLoss, its embeddings were also optimized for truncation to 768, 512, 256, 128, or 64 dimensions.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: intfloat/multilingual-e5-large
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: json
- Language: en
- License: apache-2.0
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
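The Pooling and Normalize modules listed above reduce per-token vectors to a single unit-length sentence vector. A minimal pure-Python sketch of that idea follows; the function names `mean_pool` and `l2_normalize` and the toy 3-dimensional token vectors are illustrative assumptions, not the library's implementation.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_vector = l2_normalize(mean_pool(tokens, mask))
```

Because the output has unit length, the cosine similarity of two such vectors equals their dot product.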
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
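from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("IoannisKat1/multilingual-e5-large-legal-matryoshka")
# Run inference on one query and two candidate passages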
sentences = [
'How long is the period that can be added due to the complexity of the subject-matter?',
'1.In order to ensure the correct and consistent application of this Regulation in individual cases, the Board shall adopt a binding decision in the following cases: (a) where, in a case referred to in Article 60(4), a supervisory authority concerned has raised a relevant and reasoned objection to a draft decision of the lead authority or the lead authority has rejected such an objection as being not relevant or reasoned. The binding decision shall concern all the matters which are the subject of the relevant and reasoned objection, in particular whether there is an infringement of this Regulation; 4.5.2016 L 119/74 (b) where there are conflicting views on which of the supervisory authorities concerned is competent for the main establishment; (c) where a competent supervisory authority does not request the opinion of the Board in the cases referred to in Article 64(1), or does not follow the opinion of the Board issued under Article 64. In that case, any supervisory authority concerned or the Commission may communicate the matter to the Board.\n2.The decision referred to in paragraph 1 shall be adopted within one month from the referral of the subject-matter by a two-thirds majority of the members of the Board. That period may be extended by a further month on account of the complexity of the subject-matter. The decision referred to in paragraph 1 shall be reasoned and addressed to the lead supervisory authority and all the supervisory authorities concerned and binding on them.\n3.Where the Board has been unable to adopt a decision within the periods referred to in paragraph 2, it shall adopt its decision within two weeks following the expiration of the second month referred to in paragraph 2 by a simple majority of the members of the Board. Where the members of the Board are split, the decision shall by adopted by the vote of its Chair.\n4.The supervisory authorities concerned shall not adopt a decision on the subject matter submitted to the Board under paragraph 1 during the periods referred to in paragraphs 2 and 3\n5.The Chair of the Board shall notify, without undue delay, the decision referred to in paragraph 1 to the supervisory authorities concerned. It shall inform the Commission thereof. The decision shall be published on the website of the Board without delay after the supervisory authority has notified the final decision referred to in paragraph 6\n6.The lead supervisory authority or, as the case may be, the supervisory authority with which the complaint has been lodged shall adopt its final decision on the basis of the decision referred to in paragraph 1 of this Article, without undue delay and at the latest by one month after the Board has notified its decision. The lead supervisory authority or, as the case may be, the supervisory authority with which the complaint has been lodged, shall inform the Board of the date when its final decision is notified respectively to the controller or the processor and to the data subject. The final decision of the supervisory authorities concerned shall be adopted under the terms of Article 60(7), (8) and (9). The final decision shall refer to the decision referred to in paragraph 1 of this Article and shall specify that the decision referred to in that paragraph will be published on the website of the Board in accordance with paragraph 5 of this Article. The final decision shall attach the decision referred to in paragraph 1 of this Article.',
"Each supervisory authority not acting as the lead supervisory authority should be competent to handle local cases where the controller or processor is established in more than one Member State, but the subject matter of the specific processing concerns only processing carried out in a single Member State and involves only data subjects in that single Member State, for example, where the subject matter concerns the processing of employees' personal data in the specific employment context of a Member State. In such cases, the supervisory authority should inform the lead supervisory authority without delay about the matter. After being informed, the lead supervisory authority should decide, whether it will handle the case pursuant to the provision on cooperation between the lead supervisory authority and other supervisory authorities concerned (‘one-stop-shop mechanism’), or whether the supervisory authority which informed it should handle the case at local level. When deciding whether it will handle the case, the lead supervisory authority should take into account whether there is an establishment of the controller or processor in the Member State of the supervisory authority which informed it in order to ensure effective enforcement of a decision vis-à-vis the controller or processor. Where the lead supervisory authority decides to handle the case, the supervisory authority which informed it should have the 4.5.2016 L 119/23 Official Journal of the European Union EN possibility to submit a draft for a decision, of which the lead supervisory authority should take utmost account when preparing its draft decision in that one-stop-shop mechanism.",
]
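embeddings = model.encode(sentences)
print(embeddings.shape)  # 3 texts x 1024 dimensions

# Pairwise cosine similarities between all embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # 3 x 3 similarity matrix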
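Because the model was trained with MatryoshkaLoss, a shorter prefix of each embedding can serve as a smaller embedding. The sketch below shows the underlying operation (keep the first k values, then re-normalize) on made-up 4-dimensional vectors; `truncate_embedding` and `cosine` are hypothetical helper names. In Sentence Transformers itself, the `truncate_dim` argument of `SentenceTransformer` is the usual way to request truncated output.

```python
import math

def truncate_embedding(embedding, dim):
    """Keep the first `dim` values and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors standing in for a query and a passage embedding.
query = [0.5, 0.5, 0.5, 0.5]
passage = [0.5, 0.5, -0.5, 0.5]

full_score = cosine(query, passage)
short_score = cosine(truncate_embedding(query, 2), truncate_embedding(passage, 2))
```

Scores computed on a prefix can differ from full-size scores, so it is worth checking retrieval quality at the chosen dimension before relying on it.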
Evaluation
Metrics
Information Retrieval (dim_1024)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.4798 |
| cosine_accuracy@3 | 0.5101 |
| cosine_accuracy@5 | 0.5429 |
| cosine_accuracy@10 | 0.5833 |
| cosine_precision@1 | 0.4798 |
| cosine_precision@3 | 0.463 |
| cosine_precision@5 | 0.4343 |
| cosine_precision@10 | 0.3833 |
| cosine_recall@1 | 0.0927 |
| cosine_recall@3 | 0.2289 |
| cosine_recall@5 | 0.3077 |
| cosine_recall@10 | 0.4266 |
| cosine_ndcg@10 | 0.5293 |
| cosine_mrr@10 | 0.5031 |
| cosine_map@100 | 0.5845 |
Information Retrieval (dim_768)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.4848 |
| cosine_accuracy@3 | 0.5101 |
| cosine_accuracy@5 | 0.5404 |
| cosine_accuracy@10 | 0.5884 |
| cosine_precision@1 | 0.4848 |
| cosine_precision@3 | 0.4663 |
| cosine_precision@5 | 0.4348 |
| cosine_precision@10 | 0.3843 |
| cosine_recall@1 | 0.0934 |
| cosine_recall@3 | 0.2314 |
| cosine_recall@5 | 0.3103 |
| cosine_recall@10 | 0.4291 |
| cosine_ndcg@10 | 0.5319 |
| cosine_mrr@10 | 0.507 |
| cosine_map@100 | 0.5864 |
Information Retrieval (dim_512)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.4874 |
| cosine_accuracy@3 | 0.5126 |
| cosine_accuracy@5 | 0.5354 |
| cosine_accuracy@10 | 0.5758 |
| cosine_precision@1 | 0.4874 |
| cosine_precision@3 | 0.468 |
| cosine_precision@5 | 0.4338 |
| cosine_precision@10 | 0.3768 |
| cosine_recall@1 | 0.0944 |
| cosine_recall@3 | 0.2337 |
| cosine_recall@5 | 0.3123 |
| cosine_recall@10 | 0.4242 |
| cosine_ndcg@10 | 0.5283 |
| cosine_mrr@10 | 0.507 |
| cosine_map@100 | 0.5822 |
Information Retrieval (dim_256)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.4798 |
| cosine_accuracy@3 | 0.5076 |
| cosine_accuracy@5 | 0.5404 |
| cosine_accuracy@10 | 0.5884 |
| cosine_precision@1 | 0.4798 |
| cosine_precision@3 | 0.4621 |
| cosine_precision@5 | 0.4379 |
| cosine_precision@10 | 0.3912 |
| cosine_recall@1 | 0.0886 |
| cosine_recall@3 | 0.2158 |
| cosine_recall@5 | 0.2945 |
| cosine_recall@10 | 0.4202 |
| cosine_ndcg@10 | 0.5299 |
| cosine_mrr@10 | 0.5031 |
| cosine_map@100 | 0.5755 |
Information Retrieval (dim_128)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.4798 |
| cosine_accuracy@3 | 0.4975 |
| cosine_accuracy@5 | 0.5328 |
| cosine_accuracy@10 | 0.5732 |
| cosine_precision@1 | 0.4798 |
| cosine_precision@3 | 0.4596 |
| cosine_precision@5 | 0.4318 |
| cosine_precision@10 | 0.379 |
| cosine_recall@1 | 0.0901 |
| cosine_recall@3 | 0.2197 |
| cosine_recall@5 | 0.298 |
| cosine_recall@10 | 0.4068 |
| cosine_ndcg@10 | 0.5199 |
| cosine_mrr@10 | 0.4994 |
| cosine_map@100 | 0.5701 |
Information Retrieval (dim_64)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.4419 |
| cosine_accuracy@3 | 0.4672 |
| cosine_accuracy@5 | 0.5076 |
| cosine_accuracy@10 | 0.5631 |
| cosine_precision@1 | 0.4419 |
| cosine_precision@3 | 0.4251 |
| cosine_precision@5 | 0.4005 |
| cosine_precision@10 | 0.3588 |
| cosine_recall@1 | 0.0834 |
| cosine_recall@3 | 0.2079 |
| cosine_recall@5 | 0.2837 |
| cosine_recall@10 | 0.3959 |
| cosine_ndcg@10 | 0.4938 |
| cosine_mrr@10 | 0.4664 |
| cosine_map@100 | 0.5457 |
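The metric names above follow standard information-retrieval definitions. The sketch below computes them for a single query with a hypothetical ranking and relevance set (the document IDs are invented); it may help explain why recall@1 can sit far below accuracy@1 when a query has several relevant passages. The reported values are averages over all evaluation queries.

```python
import math

def accuracy_at_k(ranked, relevant, k):
    """1.0 if any of the top-k results is relevant, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k slots filled by relevant documents."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant result within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1) if doc in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal

# Hypothetical single query: five retrieved IDs, four relevant IDs.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d3", "d1", "d5", "d8"}

scores = {
    "accuracy@1": accuracy_at_k(ranked, relevant, 1),
    "precision@3": precision_at_k(ranked, relevant, 3),
    "recall@1": recall_at_k(ranked, relevant, 1),
    "mrr@10": mrr_at_k(ranked, relevant, 10),
    "ndcg@5": ndcg_at_k(ranked, relevant, 5),
}
```

In this toy case the top hit is relevant, so accuracy@1 is 1.0, yet recall@1 is only 0.25 because three other relevant passages remain unretrieved at rank 1.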
Training Details
Training Dataset
json
- Dataset: json
- Size: 1,580 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
| --- | --- | --- |
| type | string | string |
| details | min: 7 tokens, mean: 17.31 tokens, max: 39 tokens | min: 27 tokens, mean: 381.69 tokens, max: 512 tokens |
- Samples:
| anchor | positive |
| --- | --- |
| What is required when reproducing or using a work according to Article 60 of this Law? | Any person who, in contravention of the provisions of this law or of the provisions of lawfully ratified multilateral international conventions on the protection of copyright, unlawfully makes a fixation of a work or of copies, reproduces them directly or indirectly, temporarily or permanently in any form, in whole or in part, translates, adapts, alters or transforms them, or distributes them to the public by sale or other means, or possesses with the intent of distributing them, rents, performs in public, broadcasts by radio or television or any other means, communicates to the public works or copies by any means, imports copies of a work illegally produced abroad without the consent of the author and, in general, exploits works, reproductions or copies being the object of copyright or acts against the moral right of the author to decide freely on the publication and the presentation of his work to the public without additions or deletions, shall be liable to imprisonment of no less t... |
| Who electronically sent the request? | Court (Civil/Criminal): Civil<br>Provisions:<br>Time of commission of the act:<br>Outcome (not guilty, guilty):<br>Rationale:<br>Facts: The plaintiff holds credit card number ............ with the defendant banking corporation. Based on the application for alternative networks dated 19/7/2015 with number ......... submitted at a branch of the defendant, he was granted access to the electronic banking service (e-banking) to conduct banking transactions (debit, credit, updates, payments) remotely. On 30/11/2020, the plaintiff fell victim to electronic fraud through the "phishing" method, whereby an unknown perpetrator managed to withdraw a total amount of €3,121.75 from the aforementioned credit card. Specifically, the plaintiff received an email at 1:35 PM on 29/11/2020 from sender ...... with address ........, informing him that due to an impending system change, he needed to verify the mobile phone number linked to the credit card, urging him to complete the verification... |
| What right should not imply the erasure of personal data needed for a contract? | To further strengthen the control over his or her own data, where the processing of personal data is carried out by automated means, the data subject should also be allowed to receive personal data concerning him or her which he or she has provided to a controller in a structured, commonly used, machine-readable and interoperable format, and to transmit it to another controller. Data controllers should be encouraged to develop interoperable formats that enable data portability. That right should apply where the data subject provided the personal data on the basis of his or her consent or the processing is necessary for the performance of a contract. It should not apply where processing is based on a legal ground other than consent or contract. By its very nature, that right should not be exercised against controllers processing personal data in the exercise of their public duties. It should therefore not apply where the processing of the personal data is necessary for compliance with a... |
- Loss: MatryoshkaLoss with these parameters:
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [1024, 768, 512, 256, 128, 64],
      "matryoshka_weights": [1, 1, 1, 1, 1, 1],
      "n_dims_per_step": -1
  }
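As a rough illustration of what this configuration computes, the sketch below applies an in-batch-negatives ranking loss (the idea behind MultipleNegativesRankingLoss) to each truncated prefix of the embeddings and sums the weighted results. It is a simplified pure-Python sketch, not the library code: the helper names, the toy 4-dimensional embeddings, the shortened dimension list, and the similarity scale of 20 are assumptions chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, pos) for pos in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(value - peak) for value in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over embedding prefixes of each size."""
    return sum(
        weight * ranking_loss([a[:dim] for a in anchors], [p[:dim] for p in positives])
        for dim, weight in zip(dims, weights)
    )

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.1, 0.0, 0.2], [0.0, 1.0, 0.3, 0.0]]
positives = [[0.9, 0.0, 0.1, 0.1], [0.1, 0.8, 0.2, 0.0]]

loss_matched = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
loss_swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
```

The loss is lower when each anchor is paired with its own positive at every prefix length, which is what pushes the leading dimensions to carry most of the ranking signal.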
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- gradient_accumulation_steps: 2
- learning_rate: 2e-05
- num_train_epochs: 15
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
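The learning_rate, lr_scheduler_type, and warmup_ratio settings together describe a linear warmup followed by cosine decay. The sketch below reproduces that shape in plain Python; `lr_at_step` is a hypothetical helper, and the totals of 1485 optimizer steps and 149 warmup steps are assumptions (15 epochs at the 99 steps per epoch seen in the training logs, with the warmup count rounded up), not values stated in this card.

```python
import math

def lr_at_step(step, peak_lr=2e-5, total_steps=1485, warmup_steps=149):
    """Linear warmup to peak_lr, then half-cosine decay towards zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Sample the schedule at the start, mid-warmup, peak, mid-decay, and end.
schedule = [lr_at_step(step) for step in (0, 75, 149, 800, 1485)]
```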
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 2
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 15
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
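The batch-related values above can be combined into an effective batch size and a step count per epoch. The arithmetic below assumes a single training device and that the final partial batch is kept (dataloader_drop_last is False); under those assumptions the result agrees with the 99 steps per epoch shown in the training logs.

```python
import math

train_samples = 1580               # from the Training Dataset section
per_device_train_batch_size = 8
gradient_accumulation_steps = 2
num_train_epochs = 15
num_devices = 1                    # assumption: single-device training

# Samples contributing to each optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices
# Forward/backward passes per epoch, keeping the last partial batch.
batches_per_epoch = math.ceil(train_samples / (per_device_train_batch_size * num_devices))
# Optimizer updates per epoch and over the configured number of epochs.
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
planned_total_steps = steps_per_epoch * num_train_epochs

print(effective_batch_size, batches_per_epoch, steps_per_epoch, planned_total_steps)
# prints: 16 198 99 1485
```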
Training Logs
| Epoch | Step | Training Loss | dim_1024_cosine_ndcg@10 | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.1010 | 10 | 15.4718 | - | - | - | - | - | - |
| 0.2020 | 20 | 14.6553 | - | - | - | - | - | - |
| 0.3030 | 30 | 13.0692 | - | - | - | - | - | - |
| 0.4040 | 40 | 11.9374 | - | - | - | - | - | - |
| 0.5051 | 50 | 10.391 | - | - | - | - | - | - |
| 0.6061 | 60 | 9.4382 | - | - | - | - | - | - |
| 0.7071 | 70 | 10.3454 | - | - | - | - | - | - |
| 0.8081 | 80 | 8.2018 | - | - | - | - | - | - |
| 0.9091 | 90 | 8.5864 | - | - | - | - | - | - |
| 1.0 | 99 | - | 0.4460 | 0.4412 | 0.4274 | 0.4195 | 0.3848 | 0.3259 |
| 1.0101 | 100 | 8.6625 | - | - | - | - | - | - |
| 1.1111 | 110 | 4.7502 | - | - | - | - | - | - |
| 1.2121 | 120 | 6.2492 | - | - | - | - | - | - |
| 1.3131 | 130 | 4.8127 | - | - | - | - | - | - |
| 1.4141 | 140 | 6.1704 | - | - | - | - | - | - |
| 1.5152 | 150 | 6.6274 | - | - | - | - | - | - |
| 1.6162 | 160 | 5.399 | - | - | - | - | - | - |
| 1.7172 | 170 | 7.426 | - | - | - | - | - | - |
| 1.8182 | 180 | 7.1311 | - | - | - | - | - | - |
| 1.9192 | 190 | 5.946 | - | - | - | - | - | - |
| 2.0 | 198 | - | 0.4016 | 0.4013 | 0.4014 | 0.4000 | 0.3593 | 0.3004 |
| 2.0202 | 200 | 6.2856 | - | - | - | - | - | - |
| 2.1212 | 210 | 3.5988 | - | - | - | - | - | - |
| 2.2222 | 220 | 4.2901 | - | - | - | - | - | - |
| 2.3232 | 230 | 3.1813 | - | - | - | - | - | - |
| 2.4242 | 240 | 3.5008 | - | - | - | - | - | - |
| 2.5253 | 250 | 5.1129 | - | - | - | - | - | - |
| 2.6263 | 260 | 4.2109 | - | - | - | - | - | - |
| 2.7273 | 270 | 3.6912 | - | - | - | - | - | - |
| 2.8283 | 280 | 3.3291 | - | - | - | - | - | - |
| 2.9293 | 290 | 4.4076 | - | - | - | - | - | - |
| 3.0 | 297 | - | 0.4627 | 0.4581 | 0.4402 | 0.4497 | 0.4050 | 0.3634 |
| 3.0303 | 300 | 4.7613 | - | - | - | - | - | - |
| 3.1313 | 310 | 2.8859 | - | - | - | - | - | - |
| 3.2323 | 320 | 2.3837 | - | - | - | - | - | - |
| 3.3333 | 330 | 2.3918 | - | - | - | - | - | - |
| 3.4343 | 340 | 3.2392 | - | - | - | - | - | - |
| 3.5354 | 350 | 3.0618 | - | - | - | - | - | - |
| 3.6364 | 360 | 3.4849 | - | - | - | - | - | - |
| 3.7374 | 370 | 2.9544 | - | - | - | - | - | - |
| 3.8384 | 380 | 3.7552 | - | - | - | - | - | - |
| 3.9394 | 390 | 2.8886 | - | - | - | - | - | - |
| 4.0 | 396 | - | 0.4759 | 0.4672 | 0.4716 | 0.4775 | 0.4500 | 0.4110 |
| 4.0404 | 400 | 2.6625 | - | - | - | - | - | - |
| 4.1414 | 410 | 1.4966 | - | - | - | - | - | - |
| 4.2424 | 420 | 1.7481 | - | - | - | - | - | - |
| 4.3434 | 430 | 1.854 | - | - | - | - | - | - |
| 4.4444 | 440 | 2.7405 | - | - | - | - | - | - |
| 4.5455 | 450 | 2.1245 | - | - | - | - | - | - |
| 4.6465 | 460 | 2.2464 | - | - | - | - | - | - |
| 4.7475 | 470 | 3.2942 | - | - | - | - | - | - |
| 4.8485 | 480 | 3.0603 | - | - | - | - | - | - |
| 4.9495 | 490 | 2.2054 | - | - | - | - | - | - |
| 5.0 | 495 | - | 0.4999 | 0.4997 | 0.4861 | 0.4865 | 0.4489 | 0.4270 |
| 5.0505 | 500 | 2.0539 | - | - | - | - | - | - |
| 5.1515 | 510 | 1.715 | - | - | - | - | - | - |
| 5.2525 | 520 | 1.214 | - | - | - | - | - | - |
| 5.3535 | 530 | 1.7161 | - | - | - | - | - | - |
| 5.4545 | 540 | 1.8882 | - | - | - | - | - | - |
| 5.5556 | 550 | 2.0362 | - | - | - | - | - | - |
| 5.6566 | 560 | 1.7486 | - | - | - | - | - | - |
| 5.7576 | 570 | 2.1613 | - | - | - | - | - | - |
| 5.8586 | 580 | 1.4125 | - | - | - | - | - | - |
| 5.9596 | 590 | 2.3426 | - | - | - | - | - | - |
| 6.0 | 594 | - | 0.5096 | 0.5053 | 0.4923 | 0.5008 | 0.4707 | 0.4483 |
| 6.0606 | 600 | 1.3931 | - | - | - | - | - | - |
| 6.1616 | 610 | 1.5685 | - | - | - | - | - | - |
| 6.2626 | 620 | 0.6951 | - | - | - | - | - | - |
| 6.3636 | 630 | 2.4788 | - | - | - | - | - | - |
| 6.4646 | 640 | 1.8925 | - | - | - | - | - | - |
| 6.5657 | 650 | 1.3563 | - | - | - | - | - | - |
| 6.6667 | 660 | 0.8888 | - | - | - | - | - | - |
| 6.7677 | 670 | 1.1035 | - | - | - | - | - | - |
| 6.8687 | 680 | 1.2746 | - | - | - | - | - | - |
| 6.9697 | 690 | 1.3126 | - | - | - | - | - | - |
| 7.0 | 693 | - | 0.5047 | 0.5048 | 0.4941 | 0.4856 | 0.4623 | 0.4286 |
| 7.0707 | 700 | 1.1118 | - | - | - | - | - | - |
| 7.1717 | 710 | 0.5125 | - | - | - | - | - | - |
| 7.2727 | 720 | 1.1709 | - | - | - | - | - | - |
| 7.3737 | 730 | 1.4081 | - | - | - | - | - | - |
| 7.4747 | 740 | 1.2162 | - | - | - | - | - | - |
| 7.5758 | 750 | 1.5649 | - | - | - | - | - | - |
| 7.6768 | 760 | 1.1022 | - | - | - | - | - | - |
| 7.7778 | 770 | 1.1308 | - | - | - | - | - | - |
| 7.8788 | 780 | 1.4802 | - | - | - | - | - | - |
| 7.9798 | 790 | 0.789 | - | - | - | - | - | - |
| 8.0 | 792 | - | 0.5137 | 0.5091 | 0.5177 | 0.5103 | 0.4781 | 0.4697 |
| 8.0808 | 800 | 0.9737 | - | - | - | - | - | - |
| 8.1818 | 810 | 1.1912 | - | - | - | - | - | - |
| 8.2828 | 820 | 0.3296 | - | - | - | - | - | - |
| 8.3838 | 830 | 0.8433 | - | - | - | - | - | - |
| 8.4848 | 840 | 0.7217 | - | - | - | - | - | - |
| 8.5859 | 850 | 1.002 | - | - | - | - | - | - |
| 8.6869 | 860 | 1.0049 | - | - | - | - | - | - |
| 8.7879 | 870 | 0.2311 | - | - | - | - | - | - |
| 8.8889 | 880 | 0.3288 | - | - | - | - | - | - |
| 8.9899 | 890 | 0.4877 | - | - | - | - | - | - |
| 9.0 | 891 | - | 0.5116 | 0.5005 | 0.5073 | 0.5034 | 0.4832 | 0.4694 |
| 9.0909 | 900 | 0.9198 | - | - | - | - | - | - |
| 9.1919 | 910 | 0.3942 | - | - | - | - | - | - |
| 9.2929 | 920 | 0.6227 | - | - | - | - | - | - |
| 9.3939 | 930 | 0.4507 | - | - | - | - | - | - |
| 9.4949 | 940 | 0.5001 | - | - | - | - | - | - |
| 9.5960 | 950 | 0.747 | - | - | - | - | - | - |
| 9.6970 | 960 | 1.1764 | - | - | - | - | - | - |
| 9.7980 | 970 | 1.0748 | - | - | - | - | - | - |
| 9.8990 | 980 | 0.3899 | - | - | - | - | - | - |
| 10.0 | 990 | 0.3206 | 0.5143 | 0.5097 | 0.5149 | 0.5156 | 0.5041 | 0.4818 |
| 10.1010 | 1000 | 0.4675 | - | - | - | - | - | - |
| 10.2020 | 1010 | 0.8518 | - | - | - | - | - | - |
| 10.3030 | 1020 | 0.3375 | - | - | - | - | - | - |
| 10.4040 | 1030 | 0.386 | - | - | - | - | - | - |
| 10.5051 | 1040 | 0.5168 | - | - | - | - | - | - |
| 10.6061 | 1050 | 1.2228 | - | - | - | - | - | - |
| 10.7071 | 1060 | 0.4282 | - | - | - | - | - | - |
| 10.8081 | 1070 | 0.282 | - | - | - | - | - | - |
| 10.9091 | 1080 | 0.9158 | - | - | - | - | - | - |
| 11.0 | 1089 | - | 0.5390 | 0.5374 | 0.5394 | 0.5309 | 0.5189 | 0.4965 |
| 11.0101 | 1090 | 0.1981 | - | - | - | - | - | - |
| 11.1111 | 1100 | 0.2708 | - | - | - | - | - | - |
| 11.2121 | 1110 | 0.36 | - | - | - | - | - | - |
| 11.3131 | 1120 | 0.8882 | - | - | - | - | - | - |
| 11.4141 | 1130 | 0.3434 | - | - | - | - | - | - |
| 11.5152 | 1140 | 0.2293 | - | - | - | - | - | - |
| 11.6162 | 1150 | 0.6078 | - | - | - | - | - | - |
| 11.7172 | 1160 | 1.0283 | - | - | - | - | - | - |
| 11.8182 | 1170 | 1.1603 | - | - | - | - | - | - |
| 11.9192 | 1180 | 0.8614 | - | - | - | - | - | - |
| **12.0** | **1188** | **-** | **0.5293** | **0.5319** | **0.5283** | **0.5299** | **0.5199** | **0.4938** |
| 12.0202 | 1190 | 0.3831 | - | - | - | - | - | - |
| 12.1212 | 1200 | 0.1211 | - | - | - | - | - | - |
| 12.2222 | 1210 | 0.5642 | - | - | - | - | - | - |
| 12.3232 | 1220 | 0.8418 | - | - | - | - | - | - |
| 12.4242 | 1230 | 0.3555 | - | - | - | - | - | - |
| 12.5253 | 1240 | 0.1138 | - | - | - | - | - | - |
| 12.6263 | 1250 | 0.1858 | - | - | - | - | - | - |
| 12.7273 | 1260 | 0.3527 | - | - | - | - | - | - |
| 12.8283 | 1270 | 0.9225 | - | - | - | - | - | - |
| 12.9293 | 1280 | 0.3991 | - | - | - | - | - | - |
| 13.0 | 1287 | - | 0.5228 | 0.5254 | 0.5311 | 0.5268 | 0.5189 | 0.4959 |
| 13.0303 | 1290 | 0.1794 | - | - | - | - | - | - |
| 13.1313 | 1300 | 0.2153 | - | - | - | - | - | - |
| 13.2323 | 1310 | 0.14 | - | - | - | - | - | - |
| 13.3333 | 1320 | 0.5314 | - | - | - | - | - | - |
| 13.4343 | 1330 | 0.2421 | - | - | - | - | - | - |
| 13.5354 | 1340 | 0.2911 | - | - | - | - | - | - |
| 13.6364 | 1350 | 0.4683 | - | - | - | - | - | - |
| 13.7374 | 1360 | 0.3907 | - | - | - | - | - | - |
| 13.8384 | 1370 | 0.7038 | - | - | - | - | - | - |
| 13.9394 | 1380 | 0.0922 | - | - | - | - | - | - |
| 14.0 | 1386 | - | 0.5197 | 0.5209 | 0.5271 | 0.5236 | 0.5143 | 0.5012 |
| -1 | -1 | - | 0.5293 | 0.5319 | 0.5283 | 0.5299 | 0.5199 | 0.4938 |
- The bold row denotes the saved checkpoint.
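One way to read the final row of the training logs is as a size versus quality trade-off. The sketch below takes the six NDCG@10 values from that row and expresses each as a fraction of the 1024-dimension score; the variable names are illustrative.

```python
# NDCG@10 per embedding size, copied from the last row of the training logs.
ndcg_at_10 = {1024: 0.5293, 768: 0.5319, 512: 0.5283, 256: 0.5299, 128: 0.5199, 64: 0.4938}

baseline = ndcg_at_10[1024]
# Share of the full-size score that each truncated size retains.
retention = {dim: score / baseline for dim, score in ndcg_at_10.items()}

for dim, ratio in retention.items():
    print(f"{dim:>4} dims: {ratio:.1%} of the 1024-dim NDCG@10")
```

On this evaluation set, the 64-dimensional prefix keeps roughly 93% of the full-size NDCG@10 while being 16 times smaller.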
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.51.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.8.1
- Datasets: 4.0.0
- Tokenizers: 0.21.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}