BGE-M3 Legal Matryoshka

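This is a sentence-transformers model finetuned from BAAI/bge-m3 on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss at 768, 512, 256, 128, and 64 dimensions, and retrieval quality is reported below for each of those truncated sizes.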

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
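
The Pooling and Normalize modules above amount to taking the first (CLS) token vector and scaling it to unit length. A minimal sketch of that step, using made-up 2-dimensional token vectors instead of real 1024-dimensional model outputs (the function name is hypothetical):

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Keep the first token's vector (CLS pooling) and L2-normalize it."""
    cls_vector = token_embeddings[0]
    norm = math.sqrt(sum(x * x for x in cls_vector))
    return [x / norm for x in cls_vector]

# Toy per-token vectors; row 0 plays the role of the CLS token.
tokens = [[3.0, 4.0], [10.0, -2.0], [0.5, 0.5]]
sentence_embedding = cls_pool_and_normalize(tokens)
print(sentence_embedding)
```

Because the output is unit-length, cosine similarity between two embeddings reduces to a dot product.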

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IoannisKat1/bge-m3-legal-matryoshka")
# Run inference
sentences = [
    'What are the consequences for unlawful interference with sensitive data?',
    "Failure to notify the Authority of file establishment or permit changes is punished by up to three years’ imprisonment and a fine of one to five million Drachmas.\nMaintaining a file without a permit or violating permit terms is punished by at least one year’s imprisonment and a fine of one to five million Drachmas.\nUnauthorized file interconnection or without permit is punished by up to three years’ imprisonment and a fine of one to five million Drachmas.\nUnlawful interference with personal data is punished by imprisonment and a fine; for sensitive data, at least one year’s imprisonment and a fine of one to ten million Drachmas.\nControllers who fail to comply with Authority decisions or violate data transfer rules face at least two years’ imprisonment and a fine of one to five million Drachmas.\nIf acts were committed for unlawful benefit or to cause harm, punishment is up to ten years’ imprisonment and a fine of two to ten million Drachmas.\nIf acts jeopardize democratic governance or national security, punishment is confinement in a penitentiary and a fine of five to ten million Drachmas.\nActs committed due to negligence result in at least three months’ imprisonment and a fine.\nIf the Controller is not a natural person, the responsible party is the representative or head of the organization with administrative or managerial duties.\nAuthorized members of the Authority may carry out preliminary investigations even without Prosecutor’s order for certain offenses.\nThe Authority's President must notify the Public Prosecutor of any offenses under investigation, forwarding all relevant evidence.\nPreliminary investigations must conclude within two months of charges, and trial must begin within three months of completion.\nContinuation of proceedings is allowed only once and for extremely important reasons, with adjournment not exceeding two months.\nFelonies under this law fall under the jurisdiction of the Court of Appeal.\n",
    'Any person who, in contravention of the provisions of this law or of the provisions of lawfully ratified multilateral international conventions on the protection of copyright, unlawfully makes a fixation of a work or of copies, reproduces them directly or indirectly, temporarily or permanently in any form, in whole or in part, translates, adapts, alters or transforms them, or distributes them to the public by sale or other means, or possesses with the intent of distributing them, rents, performs in public, broadcasts by radio or television or any other means, communicates to the public works or copies by any means, imports copies of a work illegally produced abroad without the consent of the author and, in general, exploits works, reproductions or copies being the object of copyright or acts against the moral right of the author to decide freely on the publication and the presentation of his work to the public without additions or deletions, shall be liable to imprisonment of no less than a year and to a fine from 2.900-15.000 Euro.\nWithout the permission of the performers: fixes their performance; directly or indirectly, temporarily or permanently reproduces by any means and form, in whole or in part, the fixation of their performance; distributes to the public the fixation of their performance or possesses them with the purpose of distribution; rents the fixation of their performance; broadcasts by radio and television by any means, the live performance, unless such broadcasting is rebroadcasting of a legitimate broadcasting; communicates to the public the live performance made by any means, except radio and television broadcasting; makes available to the public, by wire or wireless means, in such a way that members of the public may access them from a place and at a time individually chosen by them, the fixation of their performance.\nWithout the permission of phonogram producers (producers of sound recordings): directly or indirectly, temporarily or 
permanently reproduces by any means and form, in whole or in part, their phonograms; distributes to the public the above recordings, or possesses them with the purpose of distribution; rents the said recordings; makes available to the public, by wire or wireless means, in such a way that members of the public may access them from a place and at a time individually chosen by them, their phonograms; imports the said recordings produced abroad without their consent.\nWithout the permission of producers of audiovisual works (producers of visual or sound and visual recordings): directly or indirectly, temporarily or permanently reproduces by any means and form, in whole or in part, the original and the copies of their films; distributes to the public the above recordings, including the copies thereof, or possesses them with the purpose of distribution; rents the said recordings; makes available to the public, by wire or wireless means, in such a way that members of the public may access them from a place and at a time individually chosen by them, the original and the copies of their films; imports the said recordings produced abroad without their consent; broadcasts by radio or television by any means including satellite transmission and cable retransmission, as well as the communication to the public.\nWithout the permission of radio and television organizations: rebroadcasts their broadcasts by any means; presents their broadcasts to the public in places accessible to the public against payment of an entrance fee; fixes their broadcasts on sound or sound and visual recordings, regardless of whether the broadcasts are transmitted by wire or by the air, including by cable or satellite; directly or indirectly, temporarily or permanently reproduces by any means and form, in whole or in part, the fixation of their broadcasts; distributes to the public the recordings containing the fixation or their broadcasts; rents the recordings containing the fixation of their 
broadcasts; makes available to the public, by wire or wireless means, in such a way that members of the public may access them from a place and at a time individually chosen by them, the fixation of their broadcasts.\nIf the financial gain sought or the damage caused by the perpetration of an act listed in paragraphs (1) and (2), above, is particularly great, the sanction shall be not less than two years imprisonment and a fine of from 2 to 10 million drachmas. If the guilty party has perpetrated any of the aforementioned acts by profession or at a commercial scale or if the circumstances in connection with the perpetration of the act indicate that the guilty party poses a serious threat to the protection of copyright or related rights, the sanction shall be imprisonment of up to ten (10) years and a fine of from 5 to 10 million drachmas, together with the withdrawal of the trading license of the undertaking which has served as the vehicle for the act. The act shall be likewise deemed to have been perpetrated by way of standard practice if the guilty party has on a previous occasion been convicted of a contravention pursuant to the provisions of the Article or for a violation of the preceding copyright legislation and sentenced to a non-redeemable period of imprisonment. Any infringement of copyright and related rights in the form of felony is tried by the competent Three-member Court of Appeal for Felonies.\nAny person who did not pay the remuneration provided for by Article 18, paragraph (3) hereof to a collecting society is punished with the sanction of paragraph (1), (2) and (3). The same sentence is imposed on the debtor who, after the issuance of the decision of the one-member first instance court, does not submit the declaration under the provisions of article 18, par. 
6, of this law.\nThe sanctions specified in paragraph (1), above, shall be applicable likewise to any person who: uses or distributes, or possesses with the intent to distribute, any system or means whose sole purpose is to facilitate the unpermitted removal or neutralization of a technical system used to protect a computer program; manufactures or imports or distributes, or possesses with intent to distribute, equipment and other materials utilizable for the reproduction of a work which do not conform to the specifications determined pursuant to Article 59 of this Law; manufactures or imports or distributes, or possesses with intent to distribute, objects which can thwart the efficacy of the above-mentioned specifications, or engages in an act which can have that result; reproduces or uses a work without utilizing the equipment or without applying the systems specified pursuant to Article 60 of this Law; distributes, or possesses with intent to distribute, a phonogram or film without the special mark or control label specified pursuant to Article 61 of this Law.\nWhere a sentence of imprisonment is imposed with the option of redeemability, the sum payable for the redemption shall be 10 times the sum specified as per the case in the Penal Code.\nWhere mitigating circumstances exist, the fine imposed shall not be less than half of the minimum fine imposable as per the case under this Law.\nAny person who proceeds to authorized temporary or permanent reproduction of the database, translation, adaptation, arrangement and any other alteration of the database, distribution to the public of the database or of copies thereof, communication, display or performance of the database to the public, is punished by imprisonment of at least one (1) year and a fine of one (1) to five (5) million drachmas.\nAny person who proceeds to extraction and/or re-utilization of the whole or of a substantial part of the contents of the database without the authorization of the author 
thereof, is punished by imprisonment of at least one (1) year and a fine of one (1) to five (5) million drachmas (article 12 of Directive 96/9).\nWhen the object of the infringement refers to computer software, the culpable character of the action, as described in paragraph 1 of article 65A and under the prerequisites provided there, is raised under the condition that the infringer proceeds in the unreserved payment of the administrative fee and the infringement concerns a quantity of up to 50 programs.\nWhen the object of infringement concerns recordings of sound in which a work protected by copyright law has been recorded, the unreserved payment of an administrative fee according to the stipulation of par.2 of article 65A and under the prerequisites provided there, the culpable character of the action is raised under the condition that the infringement concerns a quantity of up to five hundred (500) illegal sound recording carriers.\nThe payment of the administrative fee and the raising of the culpable character of the action, do not relieve the infringers from the duty of buying off the copyright and related rights or from the duty of compensating and paying the rest expenses to the holders of these rights, according to the provisions of the relevant laws.\nIn case of recidivism during the same financial year the administrative fee provided for by article 65A doubles.\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
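
Since the model was trained with MatryoshkaLoss, a shorter embedding can be obtained by keeping the leading components and re-normalizing. The sketch below illustrates this on made-up 8-dimensional toy vectors (not real model outputs), with 8, 4, and 2 standing in for sizes such as 1024, 256, and 64; the helper names are hypothetical. In Sentence Transformers the same effect is normally reached by passing truncate_dim when constructing SentenceTransformer.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors chosen for illustration only.
query = [0.9, 0.1, 0.3, -0.2, 0.05, 0.0, 0.1, -0.1]
doc_a = [0.8, 0.2, 0.2, -0.1, -0.3, 0.4, 0.0, 0.2]
doc_b = [-0.1, 0.9, -0.4, 0.3, 0.1, 0.1, 0.2, 0.0]

scores_by_dim = {}
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    a = truncate_and_normalize(doc_a, dim)
    b = truncate_and_normalize(doc_b, dim)
    scores_by_dim[dim] = (cosine(q, a), cosine(q, b))
    print(dim, round(scores_by_dim[dim][0], 3), round(scores_by_dim[dim][1], 3))
```

Smaller sizes reduce storage and search cost; the Evaluation section reports how much retrieval quality this model gives up at each size.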

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.4596
cosine_accuracy@3 0.5025
cosine_accuracy@5 0.5202
cosine_accuracy@10 0.5657
cosine_precision@1 0.4596
cosine_precision@3 0.4453
cosine_precision@5 0.4167
cosine_precision@10 0.3629
cosine_recall@1 0.0917
cosine_recall@3 0.2307
cosine_recall@5 0.3069
cosine_recall@10 0.4212
cosine_ndcg@10 0.5077
cosine_mrr@10 0.4843
cosine_map@100 0.5665

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.4596
cosine_accuracy@3 0.4975
cosine_accuracy@5 0.5152
cosine_accuracy@10 0.5732
cosine_precision@1 0.4596
cosine_precision@3 0.4436
cosine_precision@5 0.4146
cosine_precision@10 0.3662
cosine_recall@1 0.0912
cosine_recall@3 0.229
cosine_recall@5 0.304
cosine_recall@10 0.4239
cosine_ndcg@10 0.5097
cosine_mrr@10 0.4847
cosine_map@100 0.5637

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.4343
cosine_accuracy@3 0.4596
cosine_accuracy@5 0.4773
cosine_accuracy@10 0.5303
cosine_precision@1 0.4343
cosine_precision@3 0.4133
cosine_precision@5 0.3859
cosine_precision@10 0.3417
cosine_recall@1 0.0867
cosine_recall@3 0.21
cosine_recall@5 0.2768
cosine_recall@10 0.3911
cosine_ndcg@10 0.4754
cosine_mrr@10 0.4539
cosine_map@100 0.5312

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.4141
cosine_accuracy@3 0.447
cosine_accuracy@5 0.4672
cosine_accuracy@10 0.5051
cosine_precision@1 0.4141
cosine_precision@3 0.399
cosine_precision@5 0.3773
cosine_precision@10 0.3341
cosine_recall@1 0.0796
cosine_recall@3 0.1965
cosine_recall@5 0.2641
cosine_recall@10 0.3707
cosine_ndcg@10 0.4573
cosine_mrr@10 0.435
cosine_map@100 0.5079

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.3662
cosine_accuracy@3 0.3889
cosine_accuracy@5 0.4141
cosine_accuracy@10 0.4419
cosine_precision@1 0.3662
cosine_precision@3 0.351
cosine_precision@5 0.3293
cosine_precision@10 0.2907
cosine_recall@1 0.0705
cosine_recall@3 0.1787
cosine_recall@5 0.2375
cosine_recall@10 0.3326
cosine_ndcg@10 0.4028
cosine_mrr@10 0.3831
cosine_map@100 0.456
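
For readers unfamiliar with the metric names, the sketch below computes accuracy@k, precision@k, recall@k, MRR@k, and nDCG@k for a single query with binary relevance. The document ids and the function name are invented for illustration; the reported numbers are averages of such per-query values over the evaluation set, and MAP@100 is not covered here.

```python
import math

def ir_metrics_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance retrieval metrics for one query, cut off at rank k."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    first_hit_rank = next(
        (rank for rank, hit in enumerate(hits, start=1) if hit), None
    )
    # DCG discounts a hit at rank r by log2(r + 1).
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return {
        "accuracy": 1.0 if any(hits) else 0.0,
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
        "ndcg": dcg / idcg,
    }

# One query with three relevant passages; only one appears in the top 3.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
scores = ir_metrics_at_k(ranked, relevant, k=3)
print(scores)
```

When a query has several relevant passages, recall@1 can be at most one divided by that count, which is one plausible reason recall@1 sits far below accuracy@1 in the tables above.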

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 1,580 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (type: string)
      • min: 7 tokens
      • mean: 17.16 tokens
      • max: 39 tokens
    • positive (type: string)
      • min: 27 tokens
      • mean: 714.22 tokens
      • max: 2824 tokens
  • Samples:
    • anchor: At what time must the controller provide the data subject with the required information?
      positive: 1.Where personal data relating to a data subject are collected from the data subject, the controller shall, at the time when personal data are obtained, provide the data subject with all of the following information: (a) the identity and the contact details of the controller and, where applicable, of the controller's representative; (b) the contact details of the data protection officer, where applicable; (c) the purposes of the processing for which the personal data are intended as well as the legal basis for the processing; 4.5.2016 L 119/40 (d) where the processing is based on point (f) of Article 6(1), the legitimate interests pursued by the controller or by a third party; (e) the recipients or categories of recipients of the personal data, if any; (f) where applicable, the fact that the controller intends to transfer personal data to a third country or international organisation and the existence or absence of an adequacy decision by the Commission, or in the case of trans...
    • anchor: What information does the duty of professional secrecy cover during the term of office?
      positive: 1.Each Member State shall provide by law for all of the following: (a) the establishment of each supervisory authority; 4.5.2016 L 119/66 (b) the qualifications and eligibility conditions required to be appointed as member of each supervisory authority; (c) the rules and procedures for the appointment of the member or members of each supervisory authority; (d) the duration of the term of the member or members of each supervisory authority of no less than four years, except for the first appointment after 24 May 2016, part of which may take place for a shorter period where that is necessary to protect the independence of the supervisory authority by means of a staggered appointment procedure; (e) whether and, if so, for how many terms the member or members of each supervisory authority is eligible for reappointment; (f) the conditions governing the obligations of the member or members and staff of each supervisory authority, prohibitions on actions, occupations and benefits inco...
    • anchor: Under what circumstances should the controller be allowed to further process personal data?
      positive: The processing of personal data for purposes other than those for which the personal data were initially collected should be allowed only where the processing is compatible with the purposes for which the personal data were initially collected. In such a case, no legal basis separate from that which allowed the collection of the personal data is required. If the processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller, Union or Member State law may determine and specify the tasks and purposes for which the further processing should be regarded as compatible and lawful. Further processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes should be considered to be compatible lawful processing operations. The legal basis provided by Union or Member State law for the processing of personal data may also provide a legal basis for ...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
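
To make the configuration above concrete, here is a rough sketch of the idea behind it: an in-batch-negatives ranking loss is computed on each truncated prefix of the embeddings and the weighted results are summed. This is a plain-Python illustration on made-up 8-dimensional vectors, with 8, 4, and 2 standing in for the configured sizes; it is not the library's implementation. The function names are hypothetical, and the similarity scale of 20 is assumed to mirror the default of MultipleNegativesRankingLoss.

```python
import math

def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy where anchor i should score highest with positive i."""
    a_norm = [normalize(v) for v in anchors]
    p_norm = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a_norm):
        logits = [scale * sum(x * y for x, y in zip(anchor, p)) for p in p_norm]
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a_norm)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    return sum(
        w * mnr_loss([v[:d] for v in anchors], [v[:d] for v in positives])
        for d, w in zip(dims, weights)
    )

# Toy batch of three (anchor, positive) pairs.
anchors = [
    [1.0, 0.0, 0.2, 0.0, 0.1, 0.0, 0.0, 0.3],
    [0.0, 1.0, 0.0, 0.2, 0.0, 0.1, 0.3, 0.0],
    [0.1, 0.1, 1.0, 0.0, 0.0, 0.0, 0.0, 0.2],
]
positives = [
    [0.9, 0.1, 0.1, 0.0, 0.2, 0.0, 0.0, 0.2],
    [0.1, 0.8, 0.0, 0.3, 0.0, 0.0, 0.2, 0.0],
    [0.2, 0.1, 0.9, 0.0, 0.1, 0.0, 0.0, 0.1],
]
toy_dims, toy_weights = (8, 4, 2), (1, 1, 1)
loss = matryoshka_loss(anchors, positives, toy_dims, toy_weights)
print(round(loss, 4))
```

Equal weights, as in the configuration above, mean every truncated size contributes equally to the gradient, so the leading components are pushed to carry most of the ranking signal.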
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 15
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
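
The hyperparameters above imply an effective batch size and a learning-rate curve that can be worked out by hand. The sketch below does the arithmetic, assuming a single device and a schedule planned over the full 15 epochs, and uses a standard linear-warmup-then-cosine-decay formula as an approximation of what the trainer does. Variable and function names are illustrative.

```python
import math

train_samples = 1580
per_device_train_batch_size = 4
gradient_accumulation_steps = 2
num_train_epochs = 15
learning_rate = 2e-5
warmup_ratio = 0.1

# Samples consumed per optimizer update (single device assumed).
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Linear warmup to the peak rate, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

The computed 198 updates per epoch agrees with the training log below, which records step 198 at epoch 1.0.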

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.0506 10 4.3827 - - - - -
0.1013 20 5.2903 - - - - -
0.1519 30 5.2656 - - - - -
0.2025 40 4.4887 - - - - -
0.2532 50 4.7901 - - - - -
0.3038 60 4.6387 - - - - -
0.3544 70 5.4387 - - - - -
0.4051 80 6.2548 - - - - -
0.4557 90 4.9473 - - - - -
0.5063 100 4.9107 - - - - -
0.5570 110 5.0153 - - - - -
0.6076 120 4.1909 - - - - -
0.6582 130 4.0142 - - - - -
0.7089 140 4.5879 - - - - -
0.7595 150 2.8352 - - - - -
0.8101 160 3.7297 - - - - -
0.8608 170 3.1522 - - - - -
0.9114 180 3.4329 - - - - -
0.9620 190 2.7274 - - - - -
1.0 198 - 0.4858 0.4823 0.4370 0.4147 0.3524
1.0101 200 2.9423 - - - - -
1.0608 210 3.0553 - - - - -
1.1114 220 1.98 - - - - -
1.1620 230 2.2939 - - - - -
1.2127 240 1.4246 - - - - -
1.2633 250 3.4192 - - - - -
1.3139 260 2.1508 - - - - -
1.3646 270 3.2516 - - - - -
1.4152 280 3.213 - - - - -
1.4658 290 2.6625 - - - - -
1.5165 300 1.8449 - - - - -
1.5671 310 3.1901 - - - - -
1.6177 320 1.6783 - - - - -
1.6684 330 3.9872 - - - - -
1.7190 340 3.4367 - - - - -
1.7696 350 3.2959 - - - - -
1.8203 360 2.8447 - - - - -
1.8709 370 3.0028 - - - - -
1.9215 380 2.7218 - - - - -
1.9722 390 2.4789 - - - - -
2.0 396 - 0.5077 0.5097 0.4754 0.4573 0.4028
2.0203 400 2.7682 - - - - -
2.0709 410 2.1641 - - - - -
2.1215 420 1.9279 - - - - -
2.1722 430 0.8415 - - - - -
2.2228 440 2.4362 - - - - -
2.2734 450 1.3782 - - - - -
2.3241 460 1.0464 - - - - -
2.3747 470 3.3324 - - - - -
2.4253 480 1.7592 - - - - -
2.4759 490 1.3254 - - - - -
2.5266 500 0.9768 - - - - -
2.5772 510 3.603 - - - - -
2.6278 520 1.0011 - - - - -
2.6785 530 1.1125 - - - - -
2.7291 540 1.3817 - - - - -
2.7797 550 1.2609 - - - - -
2.8304 560 0.5562 - - - - -
2.8810 570 2.1196 - - - - -
2.9316 580 2.9498 - - - - -
2.9823 590 1.8963 - - - - -
3.0 594 - 0.4875 0.4863 0.4686 0.4493 0.4139
3.0304 600 1.1573 - - - - -
3.0810 610 1.2194 - - - - -
3.1316 620 0.5912 - - - - -
3.1823 630 1.0336 - - - - -
3.2329 640 1.4426 - - - - -
3.2835 650 1.5099 - - - - -
3.3342 660 1.0995 - - - - -
3.3848 670 1.0333 - - - - -
3.4354 680 1.8372 - - - - -
3.4861 690 1.0771 - - - - -
3.5367 700 0.2972 - - - - -
3.5873 710 0.7128 - - - - -
3.6380 720 0.9087 - - - - -
3.6886 730 2.1019 - - - - -
3.7392 740 1.1736 - - - - -
3.7899 750 1.8183 - - - - -
3.8405 760 1.5602 - - - - -
3.8911 770 1.5488 - - - - -
3.9418 780 2.9643 - - - - -
3.9924 790 0.965 - - - - -
4.0 792 - 0.4893 0.4956 0.4825 0.4549 0.4159
-1 -1 - 0.5077 0.5097 0.4754 0.4573 0.4028
  • The saved checkpoint is the epoch 2.0 row (step 396); the final row (epoch -1, step -1) repeats its scores.
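
The per-epoch evaluation rows in the log can be compared programmatically. The sketch below copies the four evaluation rows from the table and ranks epochs by their mean nDCG@10 across the five sizes. Averaging is an assumption made here for illustration; the card does not state which metric the trainer used to pick the best checkpoint.

```python
# nDCG@10 per embedding size, copied from the evaluation rows of the log.
eval_rows = {
    1: {768: 0.4858, 512: 0.4823, 256: 0.4370, 128: 0.4147, 64: 0.3524},
    2: {768: 0.5077, 512: 0.5097, 256: 0.4754, 128: 0.4573, 64: 0.4028},
    3: {768: 0.4875, 512: 0.4863, 256: 0.4686, 128: 0.4493, 64: 0.4139},
    4: {768: 0.4893, 512: 0.4956, 256: 0.4825, 128: 0.4549, 64: 0.4159},
}

mean_ndcg = {
    epoch: sum(scores.values()) / len(scores) for epoch, scores in eval_rows.items()
}
best_epoch = max(mean_ndcg, key=mean_ndcg.get)

# Share of the 768-dim score that survives truncation to 64 dims.
retention_64 = eval_rows[best_epoch][64] / eval_rows[best_epoch][768]

print(best_epoch, round(mean_ndcg[best_epoch], 4), round(retention_64, 3))
```

Under this criterion epoch 2 comes out on top, which matches the scores repeated in the final row of the log.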

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.8.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}