base_model: BAAI/bge-m3
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:5520
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Queda exclosa de la prohibició, dintre de les àrees recreatives i
      d'acampada i en parcel·les de les urbanitzacions, la utilització dels
      fogons de gas i de barbacoes d'obra amb mataguspires.
    sentences:
      - Què està prohibit fer en àrees d'acampada?
      - Quin és el benefici de la reserva d'un equipament municipal?
      - >-
        Quin és el benefici de la targeta d'aparcament individual per a
        l'autonomia personal?
  - source_sentence: >-
      Aquest tràmit permet participar en processos oberts de selecció i provisió
      de personal de l'Ajuntament, i fer el pagament de la taxa per drets
      d'examen establerta en la convocatòria.
    sentences:
      - >-
        Quin és el requisit per participar en un procés de selecció de personal
        de l'Ajuntament?
      - >-
        On es pot trobar la relació de requeriments de documentació per a l'ajut
        de menjador escolar?
      - >-
        Quin és el tipus d'activitats que es poden practicar amb les armes de 4a
        categoria?
  - source_sentence: Sol·licitar la cessió temporal d’un compostador domèstic.
    sentences:
      - Quin és el requisit per a la tala d'arbres aïllats en sòl urbà?
      - Quin és el paper de la persona interessada en aquest tràmit?
      - >-
        Quin és el paper del compostador domèstic en la reducció de les
        emissions de gasos d'efecte hivernacle?
  - source_sentence: Matriculació a l'Escola Bressol Municipal El Patufet.
    sentences:
      - >-
        Quin és el termini màxim per a deutes de 1.500,01 fins a 6.000,00 euros
        en el criteri excepcional?
      - Quin és el lloc on es realitza el tràmit de matrícula?
      - Quin és el lloc on es realitza el taller 'Informàtica nivell bàsic'?
  - source_sentence: >-
      Aquest tipus de transmissió entre cedent i cessionari només podrà ser de
      caràcter gratuït i no condicionada.
    sentences:
      - >-
        Quin és el caràcter de la transmissió de drets funeraris entre cedent i
        cessionari?
      - >-
        Quin és el propòsit de la comunicació prèvia en relació amb la
        intervenció definitiva?
      - Quin és el propòsit de la Deixalleria municipal?
model-index:
  - name: SentenceTransformer based on BAAI/bge-m3
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 1024
          type: dim_1024
        metrics:
          - type: cosine_accuracy@1
            value: 0.04782608695652174
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.20869565217391303
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.30869565217391304
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5565217391304348
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.04782608695652174
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.06956521739130433
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.061739130434782616
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.055652173913043466
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.04782608695652174
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.20869565217391303
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.30869565217391304
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5565217391304348
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.25888429095047366
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.16955314009661854
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.18763324173665294
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.06086956521739131
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.21304347826086956
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.30434782608695654
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5565217391304348
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.06086956521739131
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.07101449275362319
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06086956521739131
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.055652173913043466
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.06086956521739131
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.21304347826086956
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.30434782608695654
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5565217391304348
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2637812435357463
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.17599723947550047
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.19341889075062485
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.0782608695652174
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.21739130434782608
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.34347826086956523
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5695652173913044
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.0782608695652174
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.07246376811594202
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06869565217391305
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.05695652173913043
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.0782608695652174
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.21739130434782608
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.34347826086956523
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5695652173913044
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.28117776588045035
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.1947342995169084
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.21224466664057137
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.05217391304347826
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.20869565217391303
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3173913043478261
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5130434782608696
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.05217391304347826
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.06956521739130433
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06347826086956522
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.05130434782608694
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.05217391304347826
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.20869565217391303
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.3173913043478261
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5130434782608696
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.24833360148474737
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.16793305728088342
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.1892957688791951
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.05652173913043478
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.22608695652173913
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.32608695652173914
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5434782608695652
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.05652173913043478
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.0753623188405797
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06521739130434782
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.05434782608695651
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.05652173913043478
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.22608695652173913
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.32608695652173914
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5434782608695652
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2660596038952714
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.18197895100069028
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.20038255187663148
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.05652173913043478
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.21739130434782608
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3173913043478261
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5434782608695652
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.05652173913043478
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.07246376811594202
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06347826086956522
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.054347826086956506
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.05652173913043478
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.21739130434782608
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.3173913043478261
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5434782608695652
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2641081743881476
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.17965838509316792
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.19707496290303578
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model fine-tuned from BAAI/bge-m3 on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
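
The printout above shows that the sentence embedding is the [CLS] token vector (pooling_mode_cls_token is True) followed by L2 normalization. A minimal pure-Python sketch of those two steps is given here, using toy 4-dimensional token vectors instead of the real 1024-dimensional ones; the helper names `cls_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the embedding of the first ([CLS]) token of one sequence."""
    return token_embeddings[0]

def l2_normalize(vector):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy "transformer output": 3 tokens with 4 dimensions each (the real model uses 1024).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity of two such embeddings reduces to their dot product.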

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/sqv-v5-10ep")
# Run inference
sentences = [
    'Aquest tipus de transmissió entre cedent i cessionari només podrà ser de caràcter gratuït i no condicionada.',
    'Quin és el caràcter de la transmissió de drets funeraris entre cedent i cessionari?',
    'Quin és el propòsit de la Deixalleria municipal?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
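
Since the model was trained with MatryoshkaLoss at dimensions 1024, 768, 512, 256, 128, and 64, embeddings can in principle be shortened by keeping only the leading components and re-normalizing. The sketch below illustrates that idea in pure Python with toy 8-dimensional vectors standing in for the output of `model.encode`; the helper names and the numbers are hypothetical. Recent Sentence Transformers releases also accept a `truncate_dim` argument when loading a model, which performs the truncation for you.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 8-dimensional stand-ins for real embeddings (illustrative values only).
passage = [0.9, 0.1, 0.3, 0.2, 0.05, 0.1, 0.0, 0.1]
question_related = [0.8, 0.2, 0.4, 0.1, 0.3, 0.0, 0.1, 0.0]
question_unrelated = [0.1, 0.9, 0.0, 0.3, 0.1, 0.2, 0.4, 0.0]

for dim in (8, 4, 2):
    p = truncate_and_normalize(passage, dim)
    related = cosine(p, truncate_and_normalize(question_related, dim))
    unrelated = cosine(p, truncate_and_normalize(question_unrelated, dim))
    print(dim, round(related, 3), round(unrelated, 3))
```

Smaller dimensions reduce storage and search cost; the evaluation tables in this card report how retrieval quality changes at each size.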

Evaluation

Metrics

Information Retrieval

  • Dataset: dim_1024

Metric Value
cosine_accuracy@1 0.0478
cosine_accuracy@3 0.2087
cosine_accuracy@5 0.3087
cosine_accuracy@10 0.5565
cosine_precision@1 0.0478
cosine_precision@3 0.0696
cosine_precision@5 0.0617
cosine_precision@10 0.0557
cosine_recall@1 0.0478
cosine_recall@3 0.2087
cosine_recall@5 0.3087
cosine_recall@10 0.5565
cosine_ndcg@10 0.2589
cosine_mrr@10 0.1696
cosine_map@100 0.1876

Information Retrieval

  • Dataset: dim_768

Metric Value
cosine_accuracy@1 0.0609
cosine_accuracy@3 0.213
cosine_accuracy@5 0.3043
cosine_accuracy@10 0.5565
cosine_precision@1 0.0609
cosine_precision@3 0.071
cosine_precision@5 0.0609
cosine_precision@10 0.0557
cosine_recall@1 0.0609
cosine_recall@3 0.213
cosine_recall@5 0.3043
cosine_recall@10 0.5565
cosine_ndcg@10 0.2638
cosine_mrr@10 0.176
cosine_map@100 0.1934

Information Retrieval

  • Dataset: dim_512

Metric Value
cosine_accuracy@1 0.0783
cosine_accuracy@3 0.2174
cosine_accuracy@5 0.3435
cosine_accuracy@10 0.5696
cosine_precision@1 0.0783
cosine_precision@3 0.0725
cosine_precision@5 0.0687
cosine_precision@10 0.057
cosine_recall@1 0.0783
cosine_recall@3 0.2174
cosine_recall@5 0.3435
cosine_recall@10 0.5696
cosine_ndcg@10 0.2812
cosine_mrr@10 0.1947
cosine_map@100 0.2122

Information Retrieval

  • Dataset: dim_256

Metric Value
cosine_accuracy@1 0.0522
cosine_accuracy@3 0.2087
cosine_accuracy@5 0.3174
cosine_accuracy@10 0.513
cosine_precision@1 0.0522
cosine_precision@3 0.0696
cosine_precision@5 0.0635
cosine_precision@10 0.0513
cosine_recall@1 0.0522
cosine_recall@3 0.2087
cosine_recall@5 0.3174
cosine_recall@10 0.513
cosine_ndcg@10 0.2483
cosine_mrr@10 0.1679
cosine_map@100 0.1893

Information Retrieval

  • Dataset: dim_128

Metric Value
cosine_accuracy@1 0.0565
cosine_accuracy@3 0.2261
cosine_accuracy@5 0.3261
cosine_accuracy@10 0.5435
cosine_precision@1 0.0565
cosine_precision@3 0.0754
cosine_precision@5 0.0652
cosine_precision@10 0.0543
cosine_recall@1 0.0565
cosine_recall@3 0.2261
cosine_recall@5 0.3261
cosine_recall@10 0.5435
cosine_ndcg@10 0.2661
cosine_mrr@10 0.182
cosine_map@100 0.2004

Information Retrieval

  • Dataset: dim_64

Metric Value
cosine_accuracy@1 0.0565
cosine_accuracy@3 0.2174
cosine_accuracy@5 0.3174
cosine_accuracy@10 0.5435
cosine_precision@1 0.0565
cosine_precision@3 0.0725
cosine_precision@5 0.0635
cosine_precision@10 0.0543
cosine_recall@1 0.0565
cosine_recall@3 0.2174
cosine_recall@5 0.3174
cosine_recall@10 0.5435
cosine_ndcg@10 0.2641
cosine_mrr@10 0.1797
cosine_map@100 0.1971
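
In every table above, accuracy@k equals recall@k and precision@k equals recall@k divided by k, which is what one would expect if each query has exactly one relevant passage. Under that assumption, the metrics can be sketched from the 1-based rank at which the relevant passage was retrieved. The function name and the toy ranks below are illustrative and do not come from the evaluation run.

```python
import math

def metrics_at_k(ranks, k):
    """Compute retrieval metrics at cutoff k.

    ranks: 1-based rank of the single relevant passage for each query,
           or None when it was not retrieved at all.
    """
    n = len(ranks)
    hit_ranks = [r for r in ranks if r is not None and r <= k]
    hits = len(hit_ranks)
    return {
        "accuracy": hits / n,         # share of queries with a hit in the top k
        "recall": hits / n,           # identical when there is one relevant passage
        "precision": hits / (n * k),  # at most one of the k results is relevant
        "mrr": sum(1.0 / r for r in hit_ranks) / n,
        # With one relevant passage the ideal DCG is 1, so NDCG is the mean gain.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in hit_ranks) / n,
    }

toy_ranks = [1, 4, 2, None, 10, 3, 7, 12]
for k in (1, 3, 5, 10):
    print(k, metrics_at_k(toy_ranks, k))
```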

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 5,520 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:
               positive              anchor
      type     string                string
      details  • min: 5 tokens       • min: 9 tokens
               • mean: 43.78 tokens  • mean: 20.5 tokens
               • max: 117 tokens     • max: 51 tokens
  • Samples:
    • positive: L’Ajuntament vol crear un banc de recursos on recollir tots els oferiments de la població i que servirà per atendre les necessitats de les famílies refugiades acollides al poble.
      anchor: Quin és el paper de l’Ajuntament en la integració de les persones refugiades acollides?
    • positive: Aquest tipus d'actuació requereix la intervenció d'una persona tècnica competent que subscrigui el projecte o la documentació tècnica corresponent i que assumeixi la direcció facultativa de l'execució de les obres.
      anchor: Quin és el requisit per a la intervenció d'una persona tècnica competent en les obres d'intervenció parcial interior en edificis amb elements catalogats?
    • positive: Aquest títol, adreçat a persones empadronades a Sant Quirze del Vallès, es concedirà segons el nivell d’ingressos, la condició d’edat o de discapacitat, en base als criteris específics que recull l’ordenança reguladora del sistema de tarifació social del transport públic municipal en autobús a Sant Quirze del Vallès.
      anchor: Quin és el benefici de la TBUS GRATUÏTA per a les persones majors?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
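
The parameters above describe a weighted sum of MultipleNegativesRankingLoss values, one per truncated embedding size. As a rough pure-Python sketch of that combination: each pair in a batch is scored against every other candidate in the batch with scaled cosine similarity, cross-entropy is taken with the matching pair as the target, and the result is summed over the dimensions. The scale of 20 follows the library default; the function names, the tiny batch, and the 4-dimensional vectors are illustrative assumptions, not the training code.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine scores; positives[i] is the target for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss on embeddings truncated to each size in dims."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        loss += weight * in_batch_ranking_loss(truncated_anchors, truncated_positives)
    return loss

# Toy batch of 3 pairs with 4-dimensional embeddings (the card uses 1024 down to 64).
anchors = [[1.0, 0.1, 0.0, 0.2], [0.0, 1.0, 0.1, 0.0], [0.1, 0.0, 1.0, 0.3]]
positives = [[0.9, 0.0, 0.1, 0.1], [0.1, 0.8, 0.0, 0.2], [0.0, 0.2, 0.9, 0.1]]
print(matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1]))
```

With equal weights, as in this card, every truncation size contributes equally to the gradient, so the leading components are pushed to be useful on their own.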
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.2
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
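
As a worked check of how these settings interact: a per-device batch of 16 with 16 gradient accumulation steps gives an effective batch of 256 pairs per optimizer update, assuming a single device (the card does not state the device count). With 5,520 training samples, that reproduces the fractional epoch values shown next to the step numbers in the Training Logs table.

```python
train_samples = 5520
per_device_batch = 16
grad_accum = 16
num_devices = 1  # assumption: the card does not state the device count

# Number of training pairs consumed per optimizer update.
effective_batch = per_device_batch * grad_accum * num_devices

def epoch_at_step(step):
    """Fraction of the dataset seen after `step` optimizer updates."""
    return step * effective_batch / train_samples

print(effective_batch)  # 256
for step in (10, 21, 210):
    print(step, round(epoch_at_step(step), 4))
# 10 0.4638
# 21 0.9739
# 210 9.7391
```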

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
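
The combination of lr_scheduler_type: cosine, warmup_ratio: 0.2, and learning_rate: 2e-05 implies a linear warmup followed by a half-cosine decay. The sketch below reproduces that shape in pure Python. It assumes 210 total optimizer steps, taken from the last step in the training logs, and follows the usual warmup-then-cosine formula; it is an illustration rather than the trainer's own code.

```python
import math

peak_lr = 2e-05
warmup_ratio = 0.2
total_steps = 210  # assumption: the last step shown in the training logs

warmup_steps = math.ceil(total_steps * warmup_ratio)

def learning_rate(step):
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

for step in (0, 21, 42, 126, 210):
    print(step, f"{learning_rate(step):.2e}")
```

Under these assumptions the learning rate peaks at step 42, roughly the end of the second epoch, and decays to zero by the final step.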

Training Logs

Epoch Step Training Loss dim_1024_cosine_map@100 dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.4638 10 4.0375 - - - - - -
0.9275 20 3.2095 - - - - - -
0.9739 21 - 0.1772 0.1818 0.1967 0.1911 0.1417 0.1750
1.3913 30 2.1843 - - - - - -
1.8551 40 1.6095 - - - - - -
1.9942 43 - 0.1889 0.1676 0.1961 0.1969 0.1834 0.1899
2.3188 50 1.2099 - - - - - -
2.7826 60 0.909 - - - - - -
2.9681 64 - 0.1998 0.1977 0.2164 0.2030 0.1972 0.2156
3.2464 70 0.7534 - - - - - -
3.7101 80 0.6339 - - - - - -
3.9884 86 - 0.2049 0.2024 0.1989 0.1935 0.2046 0.1949
4.1739 90 0.5423 - - - - - -
4.6377 100 0.5135 - - - - - -
4.9623 107 - 0.1967 0.2199 0.1892 0.2113 0.1957 0.2037
5.1014 110 0.4563 - - - - - -
5.5652 120 0.3837 - - - - - -
5.9826 129 - 0.2026 0.1898 0.1903 0.2035 0.2034 0.2187
6.0290 130 0.3991 - - - - - -
6.4928 140 0.3996 - - - - - -
6.9565 150 0.3225 0.2053 0.1866 0.2046 0.2083 0.1822 0.2086
7.4203 160 0.3407 - - - - - -
7.8841 170 0.2982 - - - - - -
7.9768 172 - 0.2092 0.2197 0.2005 0.2178 0.2063 0.2042
8.3478 180 0.3169 - - - - - -
8.8116 190 0.2799 - - - - - -
8.9971 194 - 0.2053 0.2215 0.1929 0.2191 0.2106 0.2170
9.2754 200 0.312 - - - - - -
9.7391 210 0.2684 0.1876 0.2004 0.1893 0.2122 0.1971 0.1934
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.35.0.dev0
  • Datasets: 3.0.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}