---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:18240762
  - loss:MSELoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
  - source_sentence: Yeah, fire in the park, let's go!
    sentences:
      - 午前2時頃に音楽が止まり、それから熟睡。
      - 彼はニンジンが好きではないので、食べなかった。
      - 公園のライトアップ、ぜひ行こうね!
  - source_sentence: Population is around 5.7 million people.
    sentences:
      - 人口は約570万人です。
      - カンドンベの音楽はcuerdaと呼ばれるドラマーのグループによって演奏される。
      - 'シノプシス: 2116年—日本政府はシビルシステムの無人ドローンロボットを問題のある国に輸出し始め、システムは世界中に広がっています。'
  - source_sentence: >-
      With EMUI 5.0, the Huawei Mate 9 becomes more intelligent and efficient
      over time by understanding consumers’ behaviour patterns and ensures the
      highest priority applications are given preference subject to system
      resources.
    sentences:
      - 私も今はクルマを持っていません。
      - ガジュマルの樹を見に行きたいです。
      - >-
        EMUI5.0では、『HUAWEI Mate
        9』が消費者の行動パターンを理解し、時間をかけて知能と効率を上げ、優先順位の最も高いアプリをシステム消費源の対象に優先される事を保証します。
  - source_sentence: >-
      What are the differences between the environments and geographical
      positions of the East and the West?
    sentences:
      - 環境と地理的位置に関して、東洋と西洋の相違点は何であろうか。
      - そのほかに,“心霊手術師”がおり,この人たちは“心霊手術”なるものを行ないます。
      - Numpy が import できない。
  - source_sentence: Jesus Christ did surrender his life for the “sheep. ”
    sentences:
      - フィリポは読んでいる事柄が分かりますかと尋ねた。
      - イエス・キリストはご自分の命を「羊」のために捨てました。
      - 彼はこの金を中央政府には渡そうとしない。
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: stsb multi mt en
          type: stsb_multi_mt-en
        metrics:
          - type: pearson_cosine
            value: 0.7901750255742193
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.793832704547488
            name: Spearman Cosine
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: JSTS
          type: JSTS
        metrics:
          - type: pearson_cosine
            value: 0.8562933057524594
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8081744506827298
            name: Spearman Cosine
---

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on English–Japanese parallel data. It maps sentences and paragraphs to a 384-dimensional dense vector space that is aligned across the two languages, and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
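
For readers who want to rebuild or modify this stack, an equivalent three-module model can be assembled by hand through the sentence_transformers.models API. This is a minimal sketch under the settings listed above (CLS pooling, 128-token limit), not the exact construction script used for this model:

from sentence_transformers import SentenceTransformer, models

# Transformer backbone, truncating inputs at 128 tokens as in module (0)
word_embedding = models.Transformer(
    "sentence-transformers/all-MiniLM-L6-v2", max_seq_length=128
)
# CLS-token pooling, matching pooling_mode_cls_token=True in module (1)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(), pooling_mode="cls"
)
# L2 normalization of the 384-dimensional embedding, matching module (2)
model = SentenceTransformer(modules=[word_embedding, pooling, models.Normalize()])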

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Jesus Christ did surrender his life for the “sheep. ”',
    'イエス \u200b ・ \u200b キリスト \u200b は \u200b ご自分 \u200b の \u200b 命 \u200b を「羊」の \u200b ため \u200b に \u200b 捨て \u200b まし \u200b た。',
    '彼はこの金を中央政府には渡そうとしない。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
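
Since English and Japanese are embedded into a shared space, the model also supports cross-lingual retrieval. Below is a minimal sketch using sentence_transformers.util.semantic_search; the corpus sentences are illustrative:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")

# Illustrative Japanese corpus and an English query
corpus = [
    "人口は約570万人です。",  # "Population is around 5.7 million people."
    "彼はニンジンが好きではないので、食べなかった。",  # "He doesn't like carrots, so he didn't eat them."
]
corpus_embeddings = model.encode(corpus)
query_embedding = model.encode("Population is around 5.7 million people.")

# Rank the corpus by cosine similarity to the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)[0]
print(corpus[hits[0]["corpus_id"]], hits[0]["score"])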

Evaluation

Metrics

Semantic Similarity

| Metric          | stsb_multi_mt-en | JSTS   |
|:----------------|:-----------------|:-------|
| pearson_cosine  | 0.7902           | 0.8563 |
| spearman_cosine | 0.7938           | 0.8082 |
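
Both columns are Pearson/Spearman correlations between gold similarity scores and embedding cosine similarities, as computed by Sentence Transformers' EmbeddingSimilarityEvaluator. A hedged sketch for the English STS-B column (the dataset ID, column names, and 0–5 score range are assumptions based on the public stsb_multi_mt dataset); the JSTS column would be computed the same way:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator, SimilarityFunction

model = SentenceTransformer("sentence_transformers_model_id")

# Assumed dataset/columns; similarity_score runs 0-5, so rescale to 0-1
stsb = load_dataset("stsb_multi_mt", "en", split="test")
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=[s / 5.0 for s in stsb["similarity_score"]],
    main_similarity=SimilarityFunction.COSINE,
    name="stsb_multi_mt-en",
)
print(evaluator(model))  # dict with pearson_cosine / spearman_cosine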

Training Details

Training Dataset

Unnamed Dataset

  • Size: 18,240,762 training samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:

    |         | english                                            | non_english                                        | label              |
    |:--------|:---------------------------------------------------|:---------------------------------------------------|:-------------------|
    | type    | string                                             | string                                             | list               |
    | details | min: 4 tokens, mean: 15.99 tokens, max: 128 tokens | min: 4 tokens, mean: 21.59 tokens, max: 128 tokens | size: 384 elements |
  • Samples:

    | english | non_english | label |
    |:--------|:------------|:------|
    | Slow to Mars? | 火星しばり? | [-0.1292940022648608, -0.1167307527589221, -0.008499974779641976, 0.04317784529767997, -0.06141806471633044, ...] |
    | Sunset is nearly there. | サンクスはすぐそこだし。 | [-0.1347740689698337, 0.053288680755846106, 0.014359346388162629, 0.0157641416547634, 0.0900218121125077, ...] |
    | Why were these Christians put to death? | ハンガリーの新聞「バシュ・ナーペ」は次のように説明しています。「 | [0.09746742956653999, -0.006846877375759926, -0.03973075126221857, 0.024986338940603363, -0.021140928354124164, ...] |
  • Loss: MSELoss
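
The label column holds precomputed 384-dimensional teacher embeddings, so this is the multilingual knowledge-distillation setup of Reimers & Gurevych (2020), cited below: MSELoss regresses the student's embedding of each sentence onto the teacher's embedding of the English source. A minimal sketch; treating the base model as the teacher is an assumption:

from sentence_transformers import SentenceTransformer, losses

# Assumption: the English teacher is the base model itself
teacher = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
student = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Teacher embeddings of the English side become the "label" targets
english = ["Population is around 5.7 million people."]
labels = teacher.encode(english)  # shape (1, 384)

# MSELoss pulls the student embeddings of the English text and its
# Japanese translation toward the same target vector
train_loss = losses.MSELoss(model=student)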

Evaluation Dataset

Unnamed Dataset

  • Size: 184,251 evaluation samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:

    |         | english                                            | non_english                                        | label              |
    |:--------|:---------------------------------------------------|:---------------------------------------------------|:-------------------|
    | type    | string                                             | string                                             | list               |
    | details | min: 4 tokens, mean: 16.16 tokens, max: 116 tokens | min: 4 tokens, mean: 21.65 tokens, max: 128 tokens | size: 384 elements |
  • Samples:

    | english | non_english | label |
    |:--------|:------------|:------|
    | Back from donating? | ドーナツ回? | [-0.14056862827741115, -0.09391276023432168, 0.011405737148041988, 0.012085375305688852, -0.056379213184557624, ...] |
    | 134)Textbooks were also in short supply. | 3)荷物の引き渡しも短時間にテキパキとされていました。 | [0.04401202896633807, 0.07403046630916377, 0.11568493170920714, 0.047522982370575784, 0.1009405093401555, ...] |
    | The COG investigators started the trial by providing dosages of crizotinib to their patients that were lower than those used in adults with NSCLC. | COG試験責任医師らは、NSCLCの成人患者で使用されている投与量より少ない量のcrizotinibを小児患者に提供することで試験を開始した。 | [0.21476626448171793, -0.04704800523318936, 0.061019190603563075, 0.027317017405848458, -0.03788587912458321, ...] |
  • Loss: MSELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • gradient_accumulation_steps: 2
  • learning_rate: 0.0003
  • num_train_epochs: 8
  • warmup_ratio: 0.15
  • bf16: True
  • dataloader_num_workers: 8
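
These settings map one-to-one onto SentenceTransformerTrainingArguments; a minimal sketch of just the non-default values (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",               # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    gradient_accumulation_steps=2,     # effective train batch of 1024 per device
    learning_rate=3e-4,
    num_train_epochs=8,
    warmup_ratio=0.15,                 # linear warmup over the first 15% of steps
    bf16=True,
    dataloader_num_workers=8,
)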

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0003
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 8
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.15
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss stsb_multi_mt-en_spearman_cosine JSTS_spearman_cosine
0.0281 500 0.0058 - - -
0.0561 1000 0.0051 - - -
0.0842 1500 0.0048 - - -
0.1123 2000 0.0046 0.0022 0.2515 0.2726
0.1403 2500 0.0044 - - -
0.1684 3000 0.0043 - - -
0.1965 3500 0.0041 - - -
0.2245 4000 0.004 0.0019 0.4462 0.4910
0.2526 4500 0.0039 - - -
0.2807 5000 0.0038 - - -
0.3088 5500 0.0037 - - -
0.3368 6000 0.0036 0.0017 0.5792 0.6327
0.3649 6500 0.0035 - - -
0.3930 7000 0.0034 - - -
0.4210 7500 0.0033 - - -
0.4491 8000 0.0032 0.0015 0.6501 0.7016
0.4772 8500 0.0032 - - -
0.5052 9000 0.0031 - - -
0.5333 9500 0.0031 - - -
0.5614 10000 0.0031 0.0015 0.6939 0.7373
0.5894 10500 0.003 - - -
0.6175 11000 0.003 - - -
0.6456 11500 0.003 - - -
0.6736 12000 0.0029 0.0014 0.7043 0.7573
0.7017 12500 0.0029 - - -
0.7298 13000 0.0029 - - -
0.7579 13500 0.0029 - - -
0.7859 14000 0.0029 0.0014 0.7221 0.7642
0.8140 14500 0.0028 - - -
0.8421 15000 0.0028 - - -
0.8701 15500 0.0028 - - -
0.8982 16000 0.0028 0.0013 0.7400 0.7763
0.9263 16500 0.0028 - - -
0.9543 17000 0.0028 - - -
0.9824 17500 0.0028 - - -
1.0104 18000 0.0027 0.0013 0.7459 0.7796
1.0385 18500 0.0027 - - -
1.0666 19000 0.0027 - - -
1.0946 19500 0.0027 - - -
1.1227 20000 0.0027 0.0013 0.7620 0.7853
1.1508 20500 0.0027 - - -
1.1789 21000 0.0027 - - -
1.2069 21500 0.0027 - - -
1.2350 22000 0.0027 0.0013 0.7669 0.7848
1.2631 22500 0.0027 - - -
1.2911 23000 0.0027 - - -
1.3192 23500 0.0026 - - -
1.3473 24000 0.0026 0.0013 0.7633 0.7866
1.3753 24500 0.0026 - - -
1.4034 25000 0.0026 - - -
1.4315 25500 0.0026 - - -
1.4595 26000 0.0026 0.0013 0.7751 0.7892
1.4876 26500 0.0026 - - -
1.5157 27000 0.0026 - - -
1.5437 27500 0.0026 - - -
1.5718 28000 0.0026 0.0012 0.7751 0.7951
1.5999 28500 0.0026 - - -
1.6280 29000 0.0026 - - -
1.6560 29500 0.0026 - - -
1.6841 30000 0.0026 0.0012 0.7765 0.7957
1.7122 30500 0.0026 - - -
1.7402 31000 0.0026 - - -
1.7683 31500 0.0026 - - -
1.7964 32000 0.0026 0.0012 0.7805 0.7957
1.8244 32500 0.0026 - - -
1.8525 33000 0.0026 - - -
1.8806 33500 0.0026 - - -
1.9086 34000 0.0026 0.0012 0.7797 0.7958
1.9367 34500 0.0026 - - -
1.9648 35000 0.0026 - - -
1.9928 35500 0.0026 - - -
2.0209 36000 0.0025 0.0012 0.7792 0.7943
2.0490 36500 0.0025 - - -
2.0770 37000 0.0025 - - -
2.1051 37500 0.0025 - - -
2.1332 38000 0.0025 0.0012 0.7831 0.7943
2.1612 38500 0.0025 - - -
2.1893 39000 0.0025 - - -
2.2174 39500 0.0025 - - -
2.2454 40000 0.0025 0.0012 0.7834 0.7973
2.2735 40500 0.0025 - - -
2.3016 41000 0.0025 - - -
2.3296 41500 0.0025 - - -
2.3577 42000 0.0025 0.0012 0.7860 0.7988
2.3858 42500 0.0025 - - -
2.4138 43000 0.0025 - - -
2.4419 43500 0.0025 - - -
2.4700 44000 0.0025 0.0012 0.7867 0.8006
2.4980 44500 0.0025 - - -
2.5261 45000 0.0025 - - -
2.5542 45500 0.0025 - - -
2.5823 46000 0.0025 0.0012 0.7870 0.8009
2.6103 46500 0.0025 - - -
2.6384 47000 0.0025 - - -
2.6665 47500 0.0025 - - -
2.6945 48000 0.0025 0.0012 0.7852 0.8019
2.7226 48500 0.0025 - - -
2.7507 49000 0.0025 - - -
2.7787 49500 0.0025 - - -
2.8068 50000 0.0025 0.0012 0.7863 0.8018
2.8349 50500 0.0025 - - -
2.8629 51000 0.0025 - - -
2.8910 51500 0.0025 - - -
2.9191 52000 0.0025 0.0012 0.7874 0.8000
2.9471 52500 0.0025 - - -
2.9752 53000 0.0025 - - -
3.0033 53500 0.0025 - - -
3.0313 54000 0.0025 0.0012 0.7875 0.8007
3.0594 54500 0.0025 - - -
3.0875 55000 0.0025 - - -
3.1155 55500 0.0025 - - -
3.1436 56000 0.0025 0.0012 0.7899 0.8021
3.1717 56500 0.0025 - - -
3.1997 57000 0.0025 - - -
3.2278 57500 0.0025 - - -
3.2559 58000 0.0025 0.0012 0.7914 0.8014
3.2839 58500 0.0025 - - -
3.3120 59000 0.0025 - - -
3.3401 59500 0.0025 - - -
3.3681 60000 0.0025 0.0012 0.7860 0.8029
3.3962 60500 0.0025 - - -
3.4243 61000 0.0025 - - -
3.4524 61500 0.0025 - - -
3.4804 62000 0.0025 0.0012 0.7886 0.8023
3.5085 62500 0.0025 - - -
3.5366 63000 0.0025 - - -
3.5646 63500 0.0025 - - -
3.5927 64000 0.0025 0.0012 0.7891 0.8045
3.6208 64500 0.0025 - - -
3.6488 65000 0.0025 - - -
3.6769 65500 0.0025 - - -
3.7050 66000 0.0025 0.0012 0.7892 0.8042
3.7330 66500 0.0025 - - -
3.7611 67000 0.0025 - - -
3.7892 67500 0.0025 - - -
3.8172 68000 0.0025 0.0012 0.7881 0.8042
3.8453 68500 0.0025 - - -
3.8734 69000 0.0025 - - -
3.9015 69500 0.0025 - - -
3.9295 70000 0.0025 0.0012 0.7905 0.8038
3.9576 70500 0.0025 - - -
3.9857 71000 0.0025 - - -
4.0137 71500 0.0025 - - -
4.0418 72000 0.0025 0.0012 0.7900 0.8052
4.0698 72500 0.0025 - - -
4.0979 73000 0.0025 - - -
4.1260 73500 0.0025 - - -
4.1540 74000 0.0025 0.0012 0.7904 0.8058
4.1821 74500 0.0025 - - -
4.2102 75000 0.0025 - - -
4.2382 75500 0.0025 - - -
4.2663 76000 0.0025 0.0012 0.7873 0.8049
4.2944 76500 0.0025 - - -
4.3225 77000 0.0025 - - -
4.3505 77500 0.0025 - - -
4.3786 78000 0.0025 0.0012 0.7908 0.8064
4.4067 78500 0.0025 - - -
4.4347 79000 0.0025 - - -
4.4628 79500 0.0025 - - -
4.4909 80000 0.0025 0.0012 0.7894 0.8050
4.5189 80500 0.0025 - - -
4.5470 81000 0.0025 - - -
4.5751 81500 0.0025 - - -
4.6031 82000 0.0025 0.0012 0.7917 0.8075
4.6312 82500 0.0025 - - -
4.6593 83000 0.0025 - - -
4.6873 83500 0.0025 - - -
4.7154 84000 0.0025 0.0012 0.7914 0.8059
4.7435 84500 0.0025 - - -
4.7715 85000 0.0025 - - -
4.7996 85500 0.0025 - - -
4.8277 86000 0.0025 0.0012 0.7895 0.8056
4.8558 86500 0.0025 - - -
4.8838 87000 0.0025 - - -
4.9119 87500 0.0025 - - -
4.9400 88000 0.0025 0.0012 0.7904 0.8059
4.9680 88500 0.0025 - - -
4.9961 89000 0.0025 - - -
5.0241 89500 0.0025 - - -
5.0522 90000 0.0025 0.0012 0.7907 0.8055
5.0803 90500 0.0025 - - -
5.1083 91000 0.0025 - - -
5.1364 91500 0.0025 - - -
5.1645 92000 0.0025 0.0012 0.7912 0.8056
5.1926 92500 0.0025 - - -
5.2206 93000 0.0025 - - -
5.2487 93500 0.0024 - - -
5.2768 94000 0.0025 0.0012 0.7913 0.8045
5.3048 94500 0.0025 - - -
5.3329 95000 0.0024 - - -
5.3610 95500 0.0024 - - -
5.3890 96000 0.0024 0.0012 0.7922 0.8056
5.4171 96500 0.0024 - - -
5.4452 97000 0.0024 - - -
5.4732 97500 0.0024 - - -
5.5013 98000 0.0024 0.0012 0.7909 0.8056
5.5294 98500 0.0024 - - -
5.5574 99000 0.0024 - - -
5.5855 99500 0.0024 - - -
5.6136 100000 0.0024 0.0012 0.7912 0.8075
5.6416 100500 0.0024 - - -
5.6697 101000 0.0024 - - -
5.6978 101500 0.0024 - - -
5.7259 102000 0.0024 0.0012 0.7921 0.8066
5.7539 102500 0.0024 - - -
5.7820 103000 0.0024 - - -
5.8101 103500 0.0024 - - -
5.8381 104000 0.0024 0.0012 0.7923 0.8068
5.8662 104500 0.0024 - - -
5.8943 105000 0.0024 - - -
5.9223 105500 0.0024 - - -
5.9504 106000 0.0024 0.0012 0.7941 0.8070
5.9785 106500 0.0024 - - -
6.0065 107000 0.0024 - - -
6.0346 107500 0.0024 - - -
6.0626 108000 0.0024 0.0012 0.7922 0.8078
6.0907 108500 0.0024 - - -
6.1188 109000 0.0024 - - -
6.1469 109500 0.0024 - - -
6.1749 110000 0.0024 0.0012 0.7922 0.8064
6.2030 110500 0.0024 - - -
6.2311 111000 0.0024 - - -
6.2591 111500 0.0024 - - -
6.2872 112000 0.0024 0.0012 0.7915 0.8069
6.3153 112500 0.0024 - - -
6.3433 113000 0.0024 - - -
6.3714 113500 0.0024 - - -
6.3995 114000 0.0024 0.0012 0.7921 0.8079
6.4275 114500 0.0024 - - -
6.4556 115000 0.0024 - - -
6.4837 115500 0.0024 - - -
6.5117 116000 0.0024 0.0012 0.7915 0.8071
6.5398 116500 0.0024 - - -
6.5679 117000 0.0024 - - -
6.5960 117500 0.0024 - - -
6.6240 118000 0.0024 0.0012 0.7943 0.8081
6.6521 118500 0.0024 - - -
6.6802 119000 0.0024 - - -
6.7082 119500 0.0024 - - -
6.7363 120000 0.0024 0.0012 0.7946 0.8079
6.7644 120500 0.0024 - - -
6.7924 121000 0.0024 - - -
6.8205 121500 0.0024 - - -
6.8486 122000 0.0024 0.0012 0.7919 0.8077
6.8766 122500 0.0024 - - -
6.9047 123000 0.0024 - - -
6.9328 123500 0.0024 - - -
6.9608 124000 0.0024 0.0012 0.7950 0.8087
6.9889 124500 0.0024 - - -
7.0170 125000 0.0024 - - -
7.0450 125500 0.0024 - - -
7.0731 126000 0.0024 0.0012 0.7927 0.8081
7.1012 126500 0.0024 - - -
7.1292 127000 0.0024 - - -
7.1573 127500 0.0024 - - -
7.1854 128000 0.0024 0.0012 0.7945 0.8082
7.2134 128500 0.0024 - - -
7.2415 129000 0.0024 - - -
7.2696 129500 0.0024 - - -
7.2976 130000 0.0024 0.0012 0.7927 0.8074
7.3257 130500 0.0024 - - -
7.3538 131000 0.0024 - - -
7.3818 131500 0.0024 - - -
7.4099 132000 0.0024 0.0012 0.7924 0.8077
7.4380 132500 0.0024 - - -
7.4661 133000 0.0024 - - -
7.4941 133500 0.0024 - - -
7.5222 134000 0.0024 0.0012 0.7929 0.8082
7.5503 134500 0.0024 - - -
7.5783 135000 0.0024 - - -
7.6064 135500 0.0024 - - -
7.6345 136000 0.0024 0.0012 0.7937 0.8080
7.6625 136500 0.0024 - - -
7.6906 137000 0.0024 - - -
7.7187 137500 0.0024 - - -
7.7467 138000 0.0024 0.0012 0.7941 0.8083
7.7748 138500 0.0024 - - -
7.8029 139000 0.0024 - - -
7.8309 139500 0.0024 - - -
7.8590 140000 0.0024 0.0012 0.7943 0.8082
7.8871 140500 0.0024 - - -
7.9151 141000 0.0024 - - -
7.9432 141500 0.0024 - - -
7.9713 142000 0.0024 0.0012 0.7938 0.8082
7.9994 142500 0.0024 - - -

Framework Versions

  • Python: 3.10.16
  • Sentence Transformers: 3.3.1
  • Transformers: 4.51.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.1
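
To approximately reproduce this environment, the listed versions can be pinned at install time; the exact CUDA build of PyTorch depends on your platform:

pip install "sentence-transformers==3.3.1" "transformers==4.51.3" \
    "torch==2.5.1" "accelerate==1.2.1" "datasets==3.2.0" "tokenizers==0.21.1"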

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MSELoss

@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}