ModernBERT Embed base Legal Matryoshka

This is a sentence-transformers model fine-tuned from distilbert/distilbert-base-uncased on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss at 768, 512, 256, 128, and 64 dimensions, so embeddings can also be truncated to those sizes.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: distilbert/distilbert-base-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
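
The Pooling module above enables only pooling_mode_mean_tokens, so a sentence embedding is the average of the token vectors at non-padding positions. The following is a minimal pure-Python sketch of that operation, not the library's implementation; the function name mean_pool and the toy 3-dimensional vectors are illustrative assumptions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over positions where attention_mask is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy example: three token positions, the last one is padding.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padded position is ignored, so only the first two token vectors contribute to the result.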

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IoannisKat1/distilbert-base-uncased-legal-matryoshka")
# Run inference
sentences = [
    'What should Directive 2002/58/EC be amended accordingly to clarify?',
    'This Regulation should apply to all matters concerning the protection of fundamental rights and freedoms vis-à- vis the processing of personal data which are not subject to specific obligations with the same objective set out in Directive 2002/58/EC of the European Parliament and of the Council (2), including the obligations on the controller and the rights of natural persons. In order to clarify the relationship between this Regulation and Directive 2002/58/EC, that Directive should be amended accordingly. Once this Regulation is adopted, Directive 2002/58/EC should be reviewed in particular in order to ensure consistency with this Regulation, 4.5.2016 L 119/31 Official Journal of the European Union EN',
    '**Court (Civil/Criminal): Civil**  \n**Provisions:**  \n**Time of commission of the act:**  \n**Outcome (not guilty, guilty):**  \n**Reasoning:** Partially accepts the lawsuit.  \n**Facts:** The plaintiff, who works as a lawyer, maintains a savings account with the defendant banking corporation under account number GR.............. Pursuant to a contract dated June 11, 2010, established in Thessaloniki between the defendant and the plaintiff, the plaintiff was granted access to the electronic banking system (e-banking) to conduct banking transactions remotely. On October 10, 2020, the plaintiff fell victim to electronic fraud through the "phishing" method, whereby an unknown perpetrator managed to extract and transfer €3,000.00 from the plaintiff’s account to another account of the same bank. Specifically, on that day at 6:51 a.m., the plaintiff received an email from the sender ".........", with the address ..........., informing him that his debit card had been suspended and that online payments and cash withdrawals could not be made until the issue was resolved. The email urged him to confirm his details within the next 72 hours by following a link titled "card activation."  \nThe plaintiff read the above email on his mobile phone around 8:00 a.m., and believing it came from the defendant, he followed the instructions and accessed a website that was identical (a clone) to that of the defendant. On this page, he was asked to enter his login credentials to connect to the service, which he did, and he was subsequently asked to input his debit card details for the alleged activation, which he also provided. Then, to complete the process, a number was sent to his mobile phone at 8:07 a.m. from the sender ........, which he entered, and two minutes later he received a message from the same sender in English stating that the quick access code had been activated on his mobile. '
    'A few minutes later, at 8:18 a.m., he received an email from the defendant informing him of the transfer of €3,000.00 from his account to account number GR ........... held at the same bank, with the beneficiary\'s details being .......... As soon as the plaintiff read this, he immediately called the defendant\'s call center and canceled his debit card, the access codes for the service ......., and locked the application .......... At the same time, he verbally submitted a request to dispute and cancel the contested transaction, and in a subsequent phone call, he also canceled his credit card. On the same day, he also sent an email to the defendant informing them in writing of the above and requesting the cancellation of the transaction and the return of the amount of €3,000.00 to his account, as this transfer was not made by him but by an unknown perpetrator through electronic fraud and was not approved by him. It should also be noted that the plaintiff, as the sole beneficiary according to the aforementioned contract for using the defendant\'s Internet Banking service, never received any update via SMS or the VIBER application from the bank regarding the transaction details before its completion, nor did he receive a one-time code (OTP) to approve the contested transaction. He subsequently filed a complaint against unknown persons at the Cyber Crime Division for the crime of fraud. The defendant sent an email to the plaintiff on October 16, 2020, informing him that his request had been forwarded to the appropriate department of the bank for investigation, stating that the bank would never send him an email or SMS asking him to enter his personal data and that as of October 7, 2020, there was a notice posted for its customers regarding malicious attempts to steal personal data in the "Our News" section on ....... '
    'A month after the disputed incident, on November 10, 2020, an amount of €2,296.82 was transferred to the plaintiff\'s account from the account to which the fraudulent credit had been made. The plaintiff immediately sent an email to the defendant asking to be informed whether this transfer was a return of part of the amount that had been illegally withdrawn from his account and requested the return of the remaining amount of €703.18. In its response dated January 13, 2021, the defendant confirmed that the aforementioned amount indeed came from the account to which the fraudulent credit had been made, following a freeze of that account initiated by the defendant during the investigation of the incident, but refused to return the remaining amount, claiming it bore no responsibility for the leak of the personal codes to third parties, according to the terms of the service contract established between them.  \nFrom the entirety of the evidence presented to the court, there is no indication of the authenticity of the contested transaction, as the plaintiff did not give his consent for the execution of the transfer of the amount of €3,000.00, especially in light of the provision in Article 72 paragraph 2 of Law 4537/2018 stating that the mere use of the Internet Banking service by the plaintiff does not necessarily constitute sufficient evidence that the payer approved the payment action. '
    'Specifically, it was proven that the contested transaction was not carried out following a strong identification of the plaintiff – the sole beneficiary of the account – and his approval, as the latter may have entered his personal codes on the counterfeit website; however, he was never informed, before the completion of the contested transaction, of the amount that would be transferred from his account to a third-party account, nor did he receive on his mobile phone, either via SMS or through the VIBER application or any other means, the one-time code - extra PIN for its completion, which he was required to enter to approve the contested transaction (payment action) and thus complete his identification, a fact that was not countered by any evidence from the defendant. Furthermore, it is noted that the defendant\'s claims that it bears no responsibility under the terms of the banking services contract, whereby it is not liable for any damage to its customer in cases of unauthorized use of their personal access codes to the Internet Banking service, are to be rejected as fundamentally unfounded. '
    'This is because the aforementioned contractual terms are invalid according to the provision of Article 103 of Law 4537/2018, as they contradict the provisions of Articles 71, 73, and 92 of the same Law, which provide for the provider\'s universal liability and its exemption only for unusual and unforeseen circumstances that are beyond the control of the party invoking them and whose consequences could not have been avoided despite all efforts to the contrary; these provisions establish mandatory law in favor of users, as according to Article 103 of Law 4537/2018, payment service providers are prohibited from deviating from the provisions to the detriment of payment service users, unless the possibility of deviation is explicitly provided and they can decide to offer only more favorable terms to payment service users; the aforementioned contractual terms do not constitute more favorable terms but rather disadvantageous terms for the payment service user. In this case, however, the defendant did not prove the authenticity of the transaction and its approval by the plaintiff and did not invoke, nor did any unusual and unforeseen circumstances beyond its control, the consequences of which could not have been avoided despite all efforts to the contrary, come to light. Therefore, the contested transaction transferring the amount of €3,000.00 is considered, in the absence of demonstrable consent from the plaintiff, unapproved according to the provisions of Article 64 of Law 4537/2018, and the defendant\'s contrary claims are rejected, especially since the plaintiff proceeded, according to Article 71 paragraph 1 of Law 4537/2018, without undue delay to notify the defendant regarding the contested unapproved payment action. '
    'Consequently, the defendant is liable for compensating the plaintiff for the positive damage he suffered under Article 73 of Law 4537/2018 and is obliged to pay him the requested amount of €703.18, while the plaintiff’s fault in the occurrence of this damage cannot be established, as he entered his personal details in an online environment that was a faithful imitation of that of the defendant, as evidenced by the comparison of the screenshots of the fake website and the real website provided by the plaintiff, a fact that he could not have known while being fully convinced that he was transacting with the defendant. Furthermore, the defendant’s liability to compensate the plaintiff is based on the provision of Article 8 of Law 2251/1994, which applies in this case, as the plaintiff\'s damage resulted from inadequate fulfillment of its obligations in the context of providing its services, but also on the provision of Article 914 of the Civil Code in the sense of omission on its part of unlawfully and culpably imposed actions. '
    'In this case, given that during the relevant period there had been a multitude of similar incidents of fraud against the defendant\'s customers, the latter, as a service provider to the consumer public and bearing transactional obligations of care and security towards them, displayed gross negligence regarding the security provided for electronic transaction services, which was compromised by the fraudulent theft of funds, as it did not comply with all required high-security measures for executing the contested transaction, failing to implement the strict customer identification verification process and to check the authenticity of the account to which the funds were sent, thus not assuming the suspicious nature of the transaction, did not adopt comprehensive and improved protective measures to fully protect its customers against malicious attacks and online fraud and to prevent the infiltration of unauthorized third parties, nor did it fulfill its obligations to inform, accurately inform, and warn its consumers - customers, as it failed to adequately inform them of attempts to steal their personal data through the sending of informative emails or SMS, while merely posting in a section rather than on a central banner (as it later did) does not constitute adequate information such that it meets the requirement of protecting its customers and the increased safeguarding of their interests. Although the plaintiff acted promptly and informed the defendant on the same day about the contested incident, the defendant did not act as promptly regarding the investigation of the incident and the freezing of the account that held the fraudulent credit to prevent the plaintiff\'s loss, but only returned part of the funds to the plaintiff a month later. '
    'This behavior, beyond being culpable due to gross negligence, was also unlawful, as it would have been illegal even without the contractual relationship, as contrary to the provisions of Law 4537/2018 and Law 2251/1994, regarding the lack of security of the services that the consumer is legitimately entitled to expect, as well as the building of trust that is essential in banking transactions, elements that it was obligated to provide within the sphere of the services offered, and contrary to the principles of good faith and commercial ethics, as crystallized in the provision of Article 288 of the Civil Code, as well as the general duty imposed by Article 914 of the Civil Code not to cause harm to another culpably. This resulted not only in positive damage to the plaintiff but also in causing him moral harm consisting of his mental distress and the disruption, agitation, and sorrow he experienced, for which he must be awarded financial compensation. Taking into account all the general circumstances of the case, the extent of the plaintiff\'s damage, the severity of the defendant\'s fault, the mental distress suffered by the plaintiff, the insecurity he felt regarding his deposits, the sorrow he experienced, and the stress caused by his financial loss, which occurred during the pandemic period when his earnings from his professional activity had significantly decreased, as well as the financial and social situation of the parties, it is the court\'s opinion that he should be granted, as financial compensation for his moral harm, an amount of €250.00, which is deemed reasonable and fair. Therefore, the total monetary amount that the plaintiff is entitled to for his positive damage and financial compensation for the moral harm suffered amounts to a total of (€703.18 + €250.00) = €953.18.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
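
Because the model was trained with MatryoshkaLoss, the leading dimensions of each embedding are intended to remain useful on their own. The sketch below shows the underlying operation in pure Python: keep the first values, rescale to unit length, then compare with cosine similarity. The helper names truncate_and_normalize and cosine, and the toy 4-dimensional vectors standing in for real 768-dimensional embeddings, are illustrative assumptions.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values, then rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head))
    if norm == 0.0:
        return head
    return [v / norm for v in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors; real embeddings from this model have 768 values.
query = [3.0, 4.0, 1.0, 2.0]
passage = [6.0, 8.0, -5.0, 0.5]

short_query = truncate_and_normalize(query, 2)
short_passage = truncate_and_normalize(passage, 2)
print(short_query)
print(cosine(short_query, short_passage))
```

With the library itself, the same shortening should be available by passing truncate_dim (for example truncate_dim=256) when constructing SentenceTransformer. Smaller sizes trade some retrieval quality for lower memory and faster search.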

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.3965
cosine_accuracy@3 0.4369
cosine_accuracy@5 0.4848
cosine_accuracy@10 0.5581
cosine_precision@1 0.3965
cosine_precision@3 0.3838
cosine_precision@5 0.3692
cosine_precision@10 0.3437
cosine_recall@1 0.0783
cosine_recall@3 0.1923
cosine_recall@5 0.2688
cosine_recall@10 0.3973
cosine_ndcg@10 0.4672
cosine_mrr@10 0.4307
cosine_map@100 0.5179

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.3965
cosine_accuracy@3 0.4343
cosine_accuracy@5 0.4697
cosine_accuracy@10 0.5404
cosine_precision@1 0.3965
cosine_precision@3 0.3847
cosine_precision@5 0.3667
cosine_precision@10 0.3381
cosine_recall@1 0.0759
cosine_recall@3 0.1893
cosine_recall@5 0.2597
cosine_recall@10 0.3888
cosine_ndcg@10 0.4602
cosine_mrr@10 0.4271
cosine_map@100 0.5111

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.399
cosine_accuracy@3 0.4394
cosine_accuracy@5 0.4848
cosine_accuracy@10 0.548
cosine_precision@1 0.399
cosine_precision@3 0.3897
cosine_precision@5 0.3747
cosine_precision@10 0.3455
cosine_recall@1 0.0761
cosine_recall@3 0.1886
cosine_recall@5 0.2658
cosine_recall@10 0.3962
cosine_ndcg@10 0.469
cosine_mrr@10 0.4324
cosine_map@100 0.5116

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.4015
cosine_accuracy@3 0.4242
cosine_accuracy@5 0.4646
cosine_accuracy@10 0.5328
cosine_precision@1 0.4015
cosine_precision@3 0.3855
cosine_precision@5 0.3657
cosine_precision@10 0.3389
cosine_recall@1 0.0744
cosine_recall@3 0.1846
cosine_recall@5 0.2542
cosine_recall@10 0.3745
cosine_ndcg@10 0.4565
cosine_mrr@10 0.4277
cosine_map@100 0.5058

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.351
cosine_accuracy@3 0.3838
cosine_accuracy@5 0.4343
cosine_accuracy@10 0.5025
cosine_precision@1 0.351
cosine_precision@3 0.3409
cosine_precision@5 0.3338
cosine_precision@10 0.3235
cosine_recall@1 0.0611
cosine_recall@3 0.1526
cosine_recall@5 0.2174
cosine_recall@10 0.3484
cosine_ndcg@10 0.4198
cosine_mrr@10 0.3824
cosine_map@100 0.4643
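
The tables above report standard ranking metrics averaged over all evaluation queries. As a rough illustration of how the per-query values are defined, the sketch below computes accuracy@k, precision@k, recall@k, and MRR@k for a single query. The function name rank_metrics and the document ids are hypothetical, and this is not the evaluator's own code.

```python
def rank_metrics(ranked_ids, relevant_ids, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id in relevant_ids for doc_id in top_k]
    num_hits = sum(hits)
    # Reciprocal rank of the first relevant document within the top k.
    mrr = 0.0
    for rank, is_hit in enumerate(hits, start=1):
        if is_hit:
            mrr = 1.0 / rank
            break
    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids),
        "mrr": mrr,
    }


# Hypothetical query with four relevant passages; the retriever ranks five.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d5", "d8"}
metrics = rank_metrics(ranked, relevant, k=3)
print(metrics)
```

Recall divides by the number of relevant passages, so a query with several relevant passages cannot reach high recall at small k. That is one plausible reading of why recall@1 in the tables is much lower than accuracy@1.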

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 1,580 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min tokens  mean tokens  max tokens
    anchor    string  7           15.42        37
    positive  string  25          362.52       512
  • Samples:
    anchor: When does this Regulation provide for the possibility for Member States to restrict certain obligations and rights by law?
    positive: The protection of natural persons with regard to the processing of personal data by competent authorities for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, including the safeguarding against and the prevention of threats to public security and the free movement of such data, is the subject of a specific Union legal act. This Regulation should not, therefore, apply to processing activities for those purposes. However, personal data processed by public authorities under this Regulation should, when used for those purposes, be governed by a more specific Union legal act, namely Directive (EU) 2016/680 of the European Parliament and of the Council (1). Member States may entrust competent authorities within the meaning of Directive (EU) 2016/680 with tasks which are not necessarily carried out for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution...
    anchor: What can make a supervisory authority concerned?
    positive: 1) 'personal data' means any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
    (2) ‘processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction;
    (3) ‘restriction of processing’ means the marking of stored personal data with the aim of limiting their processin...
    anchor: According to Article 338(2) TFEU, what should European statistics comply with?
    positive: The confidential information which the Union and national statistical authorities collect for the production of official European and official national statistics should be protected. European statistics should be developed, produced and disseminated in accordance with the statistical principles as set out in Article 338(2) TFEU, while national statistics should also comply with Member State law. Regulation (EC) No 223/2009 of the European Parliament and of the Council (2) provides further specifications on statistical confidentiality for European statistics.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
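
The parameters above say that a MultipleNegativesRankingLoss is applied at each of the five embedding sizes and the results are added with equal weights. The sketch below is a simplified pure-Python illustration of that idea, not the library's implementation. The function names are hypothetical, and the scale of 20 with cosine scoring is assumed to match the library's defaults.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine scores.

    positives[i] is the target for anchors[i]; every other positive in the
    batch acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        peak = max(scores)
        log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_sum - scores[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over prefix-truncated embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        short_anchors = [a[:dim] for a in anchors]
        short_positives = [p[:dim] for p in positives]
        total += weight * in_batch_ranking_loss(short_anchors, short_positives)
    return total


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
positives = [[1.0, 0.1, 0.0, 0.0], [0.1, 1.0, 0.0, 0.0]]
loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

Because every truncated size contributes its own term, the reported training loss is a sum over five sizes rather than a single ranking loss.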
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 15
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
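
To make the schedule implied by these settings concrete, the sketch below works through the arithmetic using the dataset size and batch size listed elsewhere in this card. It assumes single-device training, a run scheduled for all 15 epochs, and the usual linear-warmup then cosine-decay shape; the function name lr_at is hypothetical.

```python
import math

# Values taken from the card; single-device training is an assumption.
num_samples = 1580
per_device_batch_size = 8
gradient_accumulation_steps = 2
num_train_epochs = 15
warmup_ratio = 0.1
peak_lr = 2e-05

batches_per_epoch = math.ceil(num_samples / per_device_batch_size)
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step):
    """Linear warmup followed by cosine decay to zero (assumed schedule shape)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

The computed 99 optimizer steps per epoch agrees with the Training Logs, where epoch 1.0 is reached at step 99.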

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.1010 10 16.4911 - - - - -
0.2020 20 16.4179 - - - - -
0.3030 30 15.0061 - - - - -
0.4040 40 13.4559 - - - - -
0.5051 50 12.4422 - - - - -
0.6061 60 10.7792 - - - - -
0.7071 70 9.9317 - - - - -
0.8081 80 9.3394 - - - - -
0.9091 90 9.9798 - - - - -
1.0 99 - 0.3153 0.3147 0.3057 0.2943 0.2670
1.0101 100 8.5715 - - - - -
1.1111 110 6.6493 - - - - -
1.2121 120 7.3416 - - - - -
1.3131 130 6.5684 - - - - -
1.4141 140 5.8657 - - - - -
1.5152 150 7.3906 - - - - -
1.6162 160 6.374 - - - - -
1.7172 170 5.0515 - - - - -
1.8182 180 6.4225 - - - - -
1.9192 190 6.0135 - - - - -
2.0 198 - 0.3917 0.4038 0.3987 0.3942 0.3424
2.0202 200 4.7693 - - - - -
2.1212 210 4.2566 - - - - -
2.2222 220 4.9353 - - - - -
2.3232 230 4.7827 - - - - -
2.4242 240 3.4413 - - - - -
2.5253 250 3.6339 - - - - -
2.6263 260 3.365 - - - - -
2.7273 270 4.1892 - - - - -
2.8283 280 4.2324 - - - - -
2.9293 290 3.6566 - - - - -
3.0 297 - 0.4127 0.3998 0.3888 0.3787 0.3629
3.0303 300 4.0928 - - - - -
3.1313 310 2.9843 - - - - -
3.2323 320 2.6601 - - - - -
3.3333 330 3.6857 - - - - -
3.4343 340 2.3096 - - - - -
3.5354 350 3.32 - - - - -
3.6364 360 1.5586 - - - - -
3.7374 370 2.8804 - - - - -
3.8384 380 2.0954 - - - - -
3.9394 390 3.6743 - - - - -
4.0 396 - 0.4292 0.4472 0.4404 0.4301 0.3844
4.0404 400 2.739 - - - - -
4.1414 410 2.3734 - - - - -
4.2424 420 2.0419 - - - - -
4.3434 430 1.5186 - - - - -
4.4444 440 2.0666 - - - - -
4.5455 450 2.2424 - - - - -
4.6465 460 1.928 - - - - -
4.7475 470 2.6343 - - - - -
4.8485 480 1.7317 - - - - -
4.9495 490 2.191 - - - - -
5.0 495 - 0.4393 0.4538 0.4483 0.4328 0.3806
5.0505 500 2.0808 - - - - -
5.1515 510 1.575 - - - - -
5.2525 520 1.1846 - - - - -
5.3535 530 2.0336 - - - - -
5.4545 540 1.3168 - - - - -
5.5556 550 1.2292 - - - - -
5.6566 560 1.9738 - - - - -
5.7576 570 1.6888 - - - - -
5.8586 580 1.2368 - - - - -
5.9596 590 2.1541 - - - - -
6.0 594 - 0.4443 0.4415 0.4471 0.4425 0.3936
6.0606 600 0.9948 - - - - -
6.1616 610 1.24 - - - - -
6.2626 620 1.1465 - - - - -
6.3636 630 1.2529 - - - - -
6.4646 640 0.5008 - - - - -
6.5657 650 1.8585 - - - - -
6.6667 660 1.8499 - - - - -
6.7677 670 1.6133 - - - - -
6.8687 680 1.0404 - - - - -
6.9697 690 1.0182 - - - - -
7.0 693 - 0.4672 0.4602 0.469 0.4565 0.4198
7.0707 700 0.7695 - - - - -
7.1717 710 0.6383 - - - - -
7.2727 720 0.7409 - - - - -
7.3737 730 0.8789 - - - - -
7.4747 740 1.313 - - - - -
7.5758 750 0.7437 - - - - -
7.6768 760 1.3037 - - - - -
7.7778 770 1.1243 - - - - -
7.8788 780 1.1979 - - - - -
7.9798 790 1.3566 - - - - -
8.0 792 - 0.4631 0.4621 0.4605 0.4390 0.4201
8.0808 800 0.8016 - - - - -
8.1818 810 0.6852 - - - - -
8.2828 820 0.4035 - - - - -
8.3838 830 0.7956 - - - - -
8.4848 840 0.8912 - - - - -
8.5859 850 0.8675 - - - - -
8.6869 860 0.9076 - - - - -
8.7879 870 0.6158 - - - - -
8.8889 880 1.1375 - - - - -
8.9899 890 1.2539 - - - - -
9.0 891 - 0.4547 0.4600 0.4554 0.4239 0.4011
-1 -1 - 0.4672 0.4602 0.4690 0.4565 0.4198
  • The saved checkpoint is the epoch 7.0 row (step 693); the final row (epoch -1, step -1) repeats its scores.

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.8.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}