---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:18240762
- loss:MSELoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: Yeah, fire in the park, let's go!
  sentences:
  - 午前2時頃に音楽が止まり、それから熟睡。
  - 彼はニンジンが好きではないので、食べなかった。
  - 公園のライトアップ、ぜひ行こうね!
- source_sentence: Population is around 5.7 million people.
  sentences:
  - 人口は約570万人です。
  - カンドンベの音楽はcuerdaと呼ばれるドラマーのグループによって演奏される。
  - 'シノプシス: 2116年—日本政府はシビルシステムの無人ドローンロボットを問題のある国に輸出し始め、システムは世界中に広がっています。'
- source_sentence: With EMUI 5.0, the Huawei Mate 9 becomes more intelligent and efficient over time by understanding consumers’ behaviour patterns and ensures the highest priority applications are given preference subject to system resources.
  sentences:
  - 私も今はクルマを持っていません。
  - ガジュマルの樹を見に行きたいです。
  - EMUI5.0では、『HUAWEI Mate 9』が消費者の行動パターンを理解し、時間をかけて知能と効率を上げ、優先順位の最も高いアプリをシステム消費源の対象に優先される事を保証します。
- source_sentence: What are the differences between the environments and geographical positions of the East and the West?
  sentences:
  - 環境と地理的位置に関して、東洋と西洋の相違点は何であろうか。
  - その ​ ほか ​ に , “心霊 ​ 手術 ​ 師 ” が ​ おり , この ​ 人 ​ たち ​ は“ 心霊 ​ 手術 ” なる ​ もの ​ を ​ 行ない ​ ます。
  - Numpy を import できない。
- source_sentence: Jesus Christ did surrender his life for the “sheep. ”
  sentences:
  - フィリポは読んでいる事柄が分かりますかと尋ねた。
  - イエス ​ ・ ​ キリスト ​ は ​ ご自分 ​ の ​ 命 ​ を「羊」の ​ ため ​ に ​ 捨て ​ まし ​ た。
  - 彼はこの金を中央政府には渡そうとしない。
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: stsb multi mt en
      type: stsb_multi_mt-en
    metrics:
    - type: pearson_cosine
      value: 0.7988037559289333
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8009711557760016
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: JSTS
      type: JSTS
    metrics:
    - type: pearson_cosine
      value: 0.8622404113206219
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8142666349859583
      name: Spearman Cosine
---

# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
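The `Pooling` module performs mean pooling over the token embeddings, and `Normalize` L2-normalizes the result so that dot products equal cosine similarities. As an illustration of what this pipeline computes, here is a minimal sketch that reproduces the embedding with plain `transformers`; the model ID is the same placeholder used in the usage example below.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Placeholder: point this at the directory or Hub ID where this model is stored.
model_id = "sentence_transformers_model_id"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer(
    ["Population is around 5.7 million people."],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = model(**inputs).last_hidden_state  # [batch, seq_len, 384]

# (1) Pooling: mean over tokens, with padding positions masked out.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2) Normalize: L2-normalize so that dot product equals cosine similarity.
embedding = F.normalize(embedding, p=2, dim=1)
print(embedding.shape)  # torch.Size([1, 384])
```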
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Jesus Christ did surrender his life for the “sheep. ”',
    'イエス \u200b ・ \u200b キリスト \u200b は \u200b ご自分 \u200b の \u200b 命 \u200b を「羊」の \u200b ため \u200b に \u200b 捨て \u200b まし \u200b た。',
    '彼はこの金を中央政府には渡そうとしない。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Semantic Similarity

* Datasets: `stsb_multi_mt-en` and `JSTS`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | stsb_multi_mt-en | JSTS       |
|:--------------------|:-----------------|:-----------|
| pearson_cosine      | 0.7988           | 0.8622     |
| **spearman_cosine** | **0.801**        | **0.8143** |
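The English STS numbers can be recomputed with the same evaluator class. The following is a minimal sketch, assuming the English test split of the `stsb_multi_mt` dataset (the exact split used for this card is not stated) and the placeholder model ID from the usage example:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder ID

# Assumption: stsb_multi_mt English test split; gold scores are 0-5, so rescale to [0, 1].
ds = load_dataset("stsb_multi_mt", name="en", split="test")
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=ds["sentence1"],
    sentences2=ds["sentence2"],
    scores=[s / 5.0 for s in ds["similarity_score"]],
    name="stsb_multi_mt-en",
)
results = evaluator(model)
print(results)  # includes pearson_cosine and spearman_cosine
```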
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 18,240,762 training samples
* Columns: `english`, `non_english`, and `label`
* Approximate statistics based on the first 1000 samples:

|      | english | non_english | label |
|:-----|:--------|:------------|:------|
| type | string  | string      | list  |

* Samples:

| english | non_english | label |
|:--------|:------------|:------|
| Slow to Mars? | 火星しばり? | [-0.1292940022648608, -0.1167307527589221, -0.008499974779641976, 0.04317784529767997, -0.06141806471633044, ...] |
| Sunset is nearly there. | サンクスはすぐそこだし。 | [-0.1347740689698337, 0.053288680755846106, 0.014359346388162629, 0.0157641416547634, 0.0900218121125077, ...] |
| Why were these Christians put to death? | ハンガリー ​ の ​ 新聞「バシュ ​ ・ ​ ナーペ」は ​ 次 ​ の ​ よう ​ に ​ 説明 ​ し ​ て ​ い ​ ます。「 | [0.09746742956653999, -0.006846877375759926, -0.03973075126221857, 0.024986338940603363, -0.021140928354124164, ...] |

* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)

### Evaluation Dataset

#### Unnamed Dataset

* Size: 184,251 evaluation samples
* Columns: `english`, `non_english`, and `label`
* Approximate statistics based on the first 1000 samples:

|      | english | non_english | label |
|:-----|:--------|:------------|:------|
| type | string  | string      | list  |

* Samples:

| english | non_english | label |
|:--------|:------------|:------|
| Back from donating? | ドーナツ回? | [-0.14056862827741115, -0.09391276023432168, 0.011405737148041988, 0.012085375305688852, -0.056379213184557624, ...] |
| 134)Textbooks were also in short supply. | 3)荷物の引き渡しも短時間にテキパキとされていました。 | [0.04401202896633807, 0.07403046630916377, 0.11568493170920714, 0.047522982370575784, 0.1009405093401555, ...] |
| The COG investigators started the trial by providing dosages of crizotinib to their patients that were lower than those used in adults with NSCLC. | COG試験責任医師らは、NSCLCの成人患者で使用されている投与量より少ない量のcrizotinibを小児患者に提供することで試験を開始した。 | [0.21476626448171793, -0.04704800523318936, 0.061019190603563075, 0.027317017405848458, -0.03788587912458321, ...] |

* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 512
- `per_device_eval_batch_size`: 512
- `gradient_accumulation_steps`: 2
- `learning_rate`: 0.0003
- `num_train_epochs`: 8
- `warmup_ratio`: 0.15
- `bf16`: True
- `dataloader_num_workers`: 8
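Each `label` above is a precomputed 384-dimensional target vector, and MSELoss trains the student so that the embeddings of both the `english` and the `non_english` text regress onto it. Below is a minimal sketch of such a distillation setup; the teacher model, the toy data, and the output path are assumptions (this card does not state how the label vectors were produced), while the hyperparameter values mirror the non-default settings above.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MSELoss

student = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# Assumption: labels are teacher embeddings of the English sentence; the actual
# teacher used for this model is not documented in this card.
teacher = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

pairs = {
    "english": ["Population is around 5.7 million people."],
    "non_english": ["人口は約570万人です。"],
}
labels = teacher.encode(pairs["english"])  # one 384-dim target vector per pair
train_dataset = Dataset.from_dict({**pairs, "label": labels.tolist()})
eval_dataset = train_dataset  # tiny stand-in; the real run used 184,251 held-out pairs

args = SentenceTransformerTrainingArguments(
    output_dir="minilm-en-ja-distilled",  # hypothetical path
    eval_strategy="steps",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    gradient_accumulation_steps=2,
    learning_rate=3e-4,
    num_train_epochs=8,
    warmup_ratio=0.15,
    bf16=True,
    dataloader_num_workers=8,
)

trainer = SentenceTransformerTrainer(
    model=student,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=MSELoss(student),  # MSE between student embeddings and the label vectors
)
trainer.train()
```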
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 512
- `per_device_eval_batch_size`: 512
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 2
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 0.0003
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 8
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.15
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 8
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

</details>
### Training Logs
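With 18,240,762 training pairs and an effective batch size of 512 × 2 = 1,024, one epoch corresponds to roughly 18,240,762 / 1,024 ≈ 17,813 optimizer steps, so 8 epochs come to about 142,500 steps. This matches the log below, where epoch 1.0 is crossed just before step 18,000 and training ends at step 142,500.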
<details><summary>Click to expand</summary>

| Epoch | Step | Training Loss | Validation Loss | stsb_multi_mt-en_spearman_cosine | JSTS_spearman_cosine |
|:------:|:------:|:-------------:|:---------------:|:--------------------------------:|:--------------------:|
| 0.0281 | 500 | 0.0057 | - | - | - |
| 0.0561 | 1000 | 0.005 | - | - | - |
| 0.0842 | 1500 | 0.0047 | - | - | - |
| 0.1123 | 2000 | 0.0045 | 0.0022 | 0.2757 | 0.2805 |
| 0.1403 | 2500 | 0.0043 | - | - | - |
| 0.1684 | 3000 | 0.0042 | - | - | - |
| 0.1965 | 3500 | 0.004 | - | - | - |
| 0.2245 | 4000 | 0.0039 | 0.0019 | 0.4951 | 0.5122 |
| 0.2526 | 4500 | 0.0037 | - | - | - |
| 0.2807 | 5000 | 0.0036 | - | - | - |
| 0.3088 | 5500 | 0.0035 | - | - | - |
| 0.3368 | 6000 | 0.0034 | 0.0016 | 0.6060 | 0.6544 |
| 0.3649 | 6500 | 0.0033 | - | - | - |
| 0.3930 | 7000 | 0.0032 | - | - | - |
| 0.4210 | 7500 | 0.0032 | - | - | - |
| 0.4491 | 8000 | 0.0031 | 0.0015 | 0.6802 | 0.7234 |
| 0.4772 | 8500 | 0.003 | - | - | - |
| 0.5052 | 9000 | 0.003 | - | - | - |
| 0.5333 | 9500 | 0.003 | - | - | - |
| 0.5614 | 10000 | 0.0029 | 0.0014 | 0.7144 | 0.7537 |
| 0.5894 | 10500 | 0.0029 | - | - | - |
| 0.6175 | 11000 | 0.0029 | - | - | - |
| 0.6456 | 11500 | 0.0028 | - | - | - |
| 0.6736 | 12000 | 0.0028 | 0.0014 | 0.7260 | 0.7691 |
| 0.7017 | 12500 | 0.0028 | - | - | - |
| 0.7298 | 13000 | 0.0028 | - | - | - |
| 0.7579 | 13500 | 0.0027 | - | - | - |
| 0.7859 | 14000 | 0.0027 | 0.0013 | 0.7396 | 0.7751 |
| 0.8140 | 14500 | 0.0027 | - | - | - |
| 0.8421 | 15000 | 0.0027 | - | - | - |
| 0.8701 | 15500 | 0.0027 | - | - | - |
| 0.8982 | 16000 | 0.0027 | 0.0013 | 0.7499 | 0.7793 |
| 0.9263 | 16500 | 0.0027 | - | - | - |
| 0.9543 | 17000 | 0.0027 | - | - | - |
| 0.9824 | 17500 | 0.0026 | - | - | - |
| 1.0104 | 18000 | 0.0026 | 0.0013 | 0.7542 | 0.7847 |
| 1.0385 | 18500 | 0.0026 | - | - | - |
| 1.0666 | 19000 | 0.0026 | - | - | - |
| 1.0946 | 19500 | 0.0026 | - | - | - |
| 1.1227 | 20000 | 0.0026 | 0.0013 | 0.7685 | 0.7883 |
| 1.1508 | 20500 | 0.0026 | - | - | - |
| 1.1789 | 21000 | 0.0026 | - | - | - |
| 1.2069 | 21500 | 0.0026 | - | - | - |
| 1.2350 | 22000 | 0.0026 | 0.0012 | 0.7695 | 0.7916 |
| 1.2631 | 22500 | 0.0026 | - | - | - |
| 1.2911 | 23000 | 0.0026 | - | - | - |
| 1.3192 | 23500 | 0.0025 | - | - | - |
| 1.3473 | 24000 | 0.0025 | 0.0012 | 0.7698 | 0.7937 |
| 1.3753 | 24500 | 0.0025 | - | - | - |
| 1.4034 | 25000 | 0.0025 | - | - | - |
| 1.4315 | 25500 | 0.0025 | - | - | - |
| 1.4595 | 26000 | 0.0025 | 0.0012 | 0.7785 | 0.7951 |
| 1.4876 | 26500 | 0.0025 | - | - | - |
| 1.5157 | 27000 | 0.0025 | - | - | - |
| 1.5437 | 27500 | 0.0025 | - | - | - |
| 1.5718 | 28000 | 0.0025 | 0.0012 | 0.7798 | 0.7995 |
| 1.5999 | 28500 | 0.0025 | - | - | - |
| 1.6280 | 29000 | 0.0025 | - | - | - |
| 1.6560 | 29500 | 0.0025 | - | - | - |
| 1.6841 | 30000 | 0.0025 | 0.0012 | 0.7821 | 0.7985 |
| 1.7122 | 30500 | 0.0025 | - | - | - |
| 1.7402 | 31000 | 0.0025 | - | - | - |
| 1.7683 | 31500 | 0.0025 | - | - | - |
| 1.7964 | 32000 | 0.0025 | 0.0012 | 0.7860 | 0.7999 |
| 1.8244 | 32500 | 0.0025 | - | - | - |
| 1.8525 | 33000 | 0.0025 | - | - | - |
| 1.8806 | 33500 | 0.0025 | - | - | - |
| 1.9086 | 34000 | 0.0025 | 0.0012 | 0.7859 | 0.8009 |
| 1.9367 | 34500 | 0.0025 | - | - | - |
| 1.9648 | 35000 | 0.0025 | - | - | - |
| 1.9928 | 35500 | 0.0025 | - | - | - |
| 2.0209 | 36000 | 0.0025 | 0.0012 | 0.7840 | 0.8000 |
| 2.0490 | 36500 | 0.0025 | - | - | - |
| 2.0770 | 37000 | 0.0025 | - | - | - |
| 2.1051 | 37500 | 0.0025 | - | - | - |
| 2.1332 | 38000 | 0.0025 | 0.0012 | 0.7882 | 0.8029 |
| 2.1612 | 38500 | 0.0025 | - | - | - |
| 2.1893 | 39000 | 0.0025 | - | - | - |
| 2.2174 | 39500 | 0.0025 | - | - | - |
| 2.2454 | 40000 | 0.0025 | 0.0012 | 0.7867 | 0.8030 |
| 2.2735 | 40500 | 0.0025 | - | - | - |
| 2.3016 | 41000 | 0.0025 | - | - | - |
| 2.3296 | 41500 | 0.0025 | - | - | - |
| 2.3577 | 42000 | 0.0025 | 0.0012 | 0.7909 | 0.8044 |
| 2.3858 | 42500 | 0.0025 | - | - | - |
| 2.4138 | 43000 | 0.0025 | - | - | - |
| 2.4419 | 43500 | 0.0024 | - | - | - |
| 2.4700 | 44000 | 0.0024 | 0.0012 | 0.7925 | 0.8047 |
| 2.4980 | 44500 | 0.0024 | - | - | - |
| 2.5261 | 45000 | 0.0024 | - | - | - |
| 2.5542 | 45500 | 0.0024 | - | - | - |
| 2.5823 | 46000 | 0.0024 | 0.0012 | 0.7945 | 0.8081 |
| 2.6103 | 46500 | 0.0024 | - | - | - |
| 2.6384 | 47000 | 0.0024 | - | - | - |
| 2.6665 | 47500 | 0.0024 | - | - | - |
| 2.6945 | 48000 | 0.0024 | 0.0012 | 0.7918 | 0.8071 |
| 2.7226 | 48500 | 0.0024 | - | - | - |
| 2.7507 | 49000 | 0.0024 | - | - | - |
| 2.7787 | 49500 | 0.0024 | - | - | - |
| 2.8068 | 50000 | 0.0024 | 0.0012 | 0.7945 | 0.8063 |
| 2.8349 | 50500 | 0.0024 | - | - | - |
| 2.8629 | 51000 | 0.0024 | - | - | - |
| 2.8910 | 51500 | 0.0024 | - | - | - |
| 2.9191 | 52000 | 0.0024 | 0.0012 | 0.7930 | 0.8078 |
| 2.9471 | 52500 | 0.0024 | - | - | - |
| 2.9752 | 53000 | 0.0024 | - | - | - |
| 3.0033 | 53500 | 0.0024 | - | - | - |
| 3.0313 | 54000 | 0.0024 | 0.0012 | 0.7947 | 0.8071 |
| 3.0594 | 54500 | 0.0024 | - | - | - |
| 3.0875 | 55000 | 0.0024 | - | - | - |
| 3.1155 | 55500 | 0.0024 | - | - | - |
| 3.1436 | 56000 | 0.0024 | 0.0012 | 0.7955 | 0.8077 |
| 3.1717 | 56500 | 0.0024 | - | - | - |
| 3.1997 | 57000 | 0.0024 | - | - | - |
| 3.2278 | 57500 | 0.0024 | - | - | - |
| 3.2559 | 58000 | 0.0024 | 0.0012 | 0.7969 | 0.8083 |
| 3.2839 | 58500 | 0.0024 | - | - | - |
| 3.3120 | 59000 | 0.0024 | - | - | - |
| 3.3401 | 59500 | 0.0024 | - | - | - |
| 3.3681 | 60000 | 0.0024 | 0.0012 | 0.7916 | 0.8089 |
| 3.3962 | 60500 | 0.0024 | - | - | - |
| 3.4243 | 61000 | 0.0024 | - | - | - |
| 3.4524 | 61500 | 0.0024 | - | - | - |
| 3.4804 | 62000 | 0.0024 | 0.0012 | 0.7941 | 0.8092 |
| 3.5085 | 62500 | 0.0024 | - | - | - |
| 3.5366 | 63000 | 0.0024 | - | - | - |
| 3.5646 | 63500 | 0.0024 | - | - | - |
| 3.5927 | 64000 | 0.0024 | 0.0012 | 0.7966 | 0.8112 |
| 3.6208 | 64500 | 0.0024 | - | - | - |
| 3.6488 | 65000 | 0.0024 | - | - | - |
| 3.6769 | 65500 | 0.0024 | - | - | - |
| 3.7050 | 66000 | 0.0024 | 0.0012 | 0.7957 | 0.8088 |
| 3.7330 | 66500 | 0.0024 | - | - | - |
| 3.7611 | 67000 | 0.0024 | - | - | - |
| 3.7892 | 67500 | 0.0024 | - | - | - |
| 3.8172 | 68000 | 0.0024 | 0.0012 | 0.7965 | 0.8104 |
| 3.8453 | 68500 | 0.0024 | - | - | - |
| 3.8734 | 69000 | 0.0024 | - | - | - |
| 3.9015 | 69500 | 0.0024 | - | - | - |
| 3.9295 | 70000 | 0.0024 | 0.0012 | 0.7948 | 0.8101 |
| 3.9576 | 70500 | 0.0024 | - | - | - |
| 3.9857 | 71000 | 0.0024 | - | - | - |
| 4.0137 | 71500 | 0.0024 | - | - | - |
| 4.0418 | 72000 | 0.0024 | 0.0012 | 0.7985 | 0.8129 |
| 4.0698 | 72500 | 0.0024 | - | - | - |
| 4.0979 | 73000 | 0.0024 | - | - | - |
| 4.1260 | 73500 | 0.0024 | - | - | - |
| 4.1540 | 74000 | 0.0024 | 0.0012 | 0.7964 | 0.8114 |
| 4.1821 | 74500 | 0.0024 | - | - | - |
| 4.2102 | 75000 | 0.0024 | - | - | - |
| 4.2382 | 75500 | 0.0024 | - | - | - |
| 4.2663 | 76000 | 0.0024 | 0.0012 | 0.7964 | 0.8105 |
| 4.2944 | 76500 | 0.0024 | - | - | - |
| 4.3225 | 77000 | 0.0024 | - | - | - |
| 4.3505 | 77500 | 0.0024 | - | - | - |
| 4.3786 | 78000 | 0.0024 | 0.0012 | 0.7975 | 0.8110 |
| 4.4067 | 78500 | 0.0024 | - | - | - |
| 4.4347 | 79000 | 0.0024 | - | - | - |
| 4.4628 | 79500 | 0.0024 | - | - | - |
| 4.4909 | 80000 | 0.0024 | 0.0012 | 0.7959 | 0.8113 |
| 4.5189 | 80500 | 0.0024 | - | - | - |
| 4.5470 | 81000 | 0.0024 | - | - | - |
| 4.5751 | 81500 | 0.0024 | - | - | - |
| 4.6031 | 82000 | 0.0024 | 0.0012 | 0.7979 | 0.8119 |
| 4.6312 | 82500 | 0.0024 | - | - | - |
| 4.6593 | 83000 | 0.0024 | - | - | - |
| 4.6873 | 83500 | 0.0024 | - | - | - |
| 4.7154 | 84000 | 0.0024 | 0.0012 | 0.7980 | 0.8123 |
| 4.7435 | 84500 | 0.0024 | - | - | - |
| 4.7715 | 85000 | 0.0024 | - | - | - |
| 4.7996 | 85500 | 0.0024 | - | - | - |
| 4.8277 | 86000 | 0.0024 | 0.0012 | 0.7963 | 0.8118 |
| 4.8558 | 86500 | 0.0024 | - | - | - |
| 4.8838 | 87000 | 0.0024 | - | - | - |
| 4.9119 | 87500 | 0.0024 | - | - | - |
| 4.9400 | 88000 | 0.0024 | 0.0012 | 0.7986 | 0.8126 |
| 4.9680 | 88500 | 0.0024 | - | - | - |
| 4.9961 | 89000 | 0.0024 | - | - | - |
| 5.0241 | 89500 | 0.0024 | - | - | - |
| 5.0522 | 90000 | 0.0024 | 0.0012 | 0.7994 | 0.8121 |
| 5.0803 | 90500 | 0.0024 | - | - | - |
| 5.1083 | 91000 | 0.0024 | - | - | - |
| 5.1364 | 91500 | 0.0024 | - | - | - |
| 5.1645 | 92000 | 0.0024 | 0.0012 | 0.7973 | 0.8120 |
| 5.1926 | 92500 | 0.0024 | - | - | - |
| 5.2206 | 93000 | 0.0024 | - | - | - |
| 5.2487 | 93500 | 0.0024 | - | - | - |
| 5.2768 | 94000 | 0.0024 | 0.0012 | 0.7970 | 0.8123 |
| 5.3048 | 94500 | 0.0024 | - | - | - |
| 5.3329 | 95000 | 0.0024 | - | - | - |
| 5.3610 | 95500 | 0.0024 | - | - | - |
| 5.3890 | 96000 | 0.0024 | 0.0012 | 0.7997 | 0.8126 |
| 5.4171 | 96500 | 0.0024 | - | - | - |
| 5.4452 | 97000 | 0.0024 | - | - | - |
| 5.4732 | 97500 | 0.0024 | - | - | - |
| 5.5013 | 98000 | 0.0024 | 0.0012 | 0.7957 | 0.8114 |
| 5.5294 | 98500 | 0.0024 | - | - | - |
| 5.5574 | 99000 | 0.0024 | - | - | - |
| 5.5855 | 99500 | 0.0024 | - | - | - |
| 5.6136 | 100000 | 0.0024 | 0.0012 | 0.7980 | 0.8132 |
| 5.6416 | 100500 | 0.0024 | - | - | - |
| 5.6697 | 101000 | 0.0024 | - | - | - |
| 5.6978 | 101500 | 0.0024 | - | - | - |
| 5.7259 | 102000 | 0.0024 | 0.0012 | 0.7984 | 0.8138 |
| 5.7539 | 102500 | 0.0024 | - | - | - |
| 5.7820 | 103000 | 0.0024 | - | - | - |
| 5.8101 | 103500 | 0.0024 | - | - | - |
| 5.8381 | 104000 | 0.0024 | 0.0012 | 0.7998 | 0.8134 |
| 5.8662 | 104500 | 0.0024 | - | - | - |
| 5.8943 | 105000 | 0.0024 | - | - | - |
| 5.9223 | 105500 | 0.0024 | - | - | - |
| 5.9504 | 106000 | 0.0024 | 0.0012 | 0.8013 | 0.8124 |
| 5.9785 | 106500 | 0.0024 | - | - | - |
| 6.0065 | 107000 | 0.0024 | - | - | - |
| 6.0346 | 107500 | 0.0024 | - | - | - |
| 6.0626 | 108000 | 0.0024 | 0.0012 | 0.7987 | 0.8134 |
| 6.0907 | 108500 | 0.0024 | - | - | - |
| 6.1188 | 109000 | 0.0024 | - | - | - |
| 6.1469 | 109500 | 0.0024 | - | - | - |
| 6.1749 | 110000 | 0.0024 | 0.0012 | 0.7986 | 0.8127 |
| 6.2030 | 110500 | 0.0024 | - | - | - |
| 6.2311 | 111000 | 0.0024 | - | - | - |
| 6.2591 | 111500 | 0.0024 | - | - | - |
| 6.2872 | 112000 | 0.0024 | 0.0012 | 0.7980 | 0.8128 |
| 6.3153 | 112500 | 0.0024 | - | - | - |
| 6.3433 | 113000 | 0.0024 | - | - | - |
| 6.3714 | 113500 | 0.0024 | - | - | - |
| 6.3995 | 114000 | 0.0024 | 0.0012 | 0.7980 | 0.8137 |
| 6.4275 | 114500 | 0.0024 | - | - | - |
| 6.4556 | 115000 | 0.0024 | - | - | - |
| 6.4837 | 115500 | 0.0024 | - | - | - |
| 6.5117 | 116000 | 0.0024 | 0.0012 | 0.7988 | 0.8129 |
| 6.5398 | 116500 | 0.0024 | - | - | - |
| 6.5679 | 117000 | 0.0024 | - | - | - |
| 6.5960 | 117500 | 0.0024 | - | - | - |
| 6.6240 | 118000 | 0.0024 | 0.0012 | 0.8007 | 0.8138 |
| 6.6521 | 118500 | 0.0024 | - | - | - |
| 6.6802 | 119000 | 0.0024 | - | - | - |
| 6.7082 | 119500 | 0.0024 | - | - | - |
| 6.7363 | 120000 | 0.0024 | 0.0012 | 0.8019 | 0.8143 |
| 6.7644 | 120500 | 0.0024 | - | - | - |
| 6.7924 | 121000 | 0.0024 | - | - | - |
| 6.8205 | 121500 | 0.0024 | - | - | - |
| 6.8486 | 122000 | 0.0024 | 0.0012 | 0.7980 | 0.8137 |
| 6.8766 | 122500 | 0.0024 | - | - | - |
| 6.9047 | 123000 | 0.0024 | - | - | - |
| 6.9328 | 123500 | 0.0024 | - | - | - |
| 6.9608 | 124000 | 0.0024 | 0.0012 | 0.8028 | 0.8142 |
| 6.9889 | 124500 | 0.0024 | - | - | - |
| 7.0170 | 125000 | 0.0024 | - | - | - |
| 7.0450 | 125500 | 0.0024 | - | - | - |
| 7.0731 | 126000 | 0.0024 | 0.0012 | 0.8002 | 0.8132 |
| 7.1012 | 126500 | 0.0024 | - | - | - |
| 7.1292 | 127000 | 0.0024 | - | - | - |
| 7.1573 | 127500 | 0.0024 | - | - | - |
| 7.1854 | 128000 | 0.0024 | 0.0012 | 0.8008 | 0.8137 |
| 7.2134 | 128500 | 0.0024 | - | - | - |
| 7.2415 | 129000 | 0.0024 | - | - | - |
| 7.2696 | 129500 | 0.0024 | - | - | - |
| 7.2976 | 130000 | 0.0024 | 0.0012 | 0.8005 | 0.8138 |
| 7.3257 | 130500 | 0.0024 | - | - | - |
| 7.3538 | 131000 | 0.0024 | - | - | - |
| 7.3818 | 131500 | 0.0024 | - | - | - |
| 7.4099 | 132000 | 0.0024 | 0.0012 | 0.7995 | 0.8140 |
| 7.4380 | 132500 | 0.0024 | - | - | - |
| 7.4661 | 133000 | 0.0024 | - | - | - |
| 7.4941 | 133500 | 0.0024 | - | - | - |
| 7.5222 | 134000 | 0.0024 | 0.0012 | 0.7999 | 0.8142 |
| 7.5503 | 134500 | 0.0024 | - | - | - |
| 7.5783 | 135000 | 0.0024 | - | - | - |
| 7.6064 | 135500 | 0.0024 | - | - | - |
| 7.6345 | 136000 | 0.0024 | 0.0012 | 0.8011 | 0.8138 |
| 7.6625 | 136500 | 0.0024 | - | - | - |
| 7.6906 | 137000 | 0.0024 | - | - | - |
| 7.7187 | 137500 | 0.0024 | - | - | - |
| 7.7467 | 138000 | 0.0024 | 0.0012 | 0.8015 | 0.8142 |
| 7.7748 | 138500 | 0.0024 | - | - | - |
| 7.8029 | 139000 | 0.0024 | - | - | - |
| 7.8309 | 139500 | 0.0024 | - | - | - |
| 7.8590 | 140000 | 0.0024 | 0.0012 | 0.8007 | 0.8141 |
| 7.8871 | 140500 | 0.0024 | - | - | - |
| 7.9151 | 141000 | 0.0024 | - | - | - |
| 7.9432 | 141500 | 0.0024 | - | - | - |
| 7.9713 | 142000 | 0.0024 | 0.0012 | 0.8010 | 0.8143 |
| 7.9994 | 142500 | 0.0024 | - | - | - |

</details>
### Framework Versions
- Python: 3.10.16
- Sentence Transformers: 3.3.1
- Transformers: 4.51.3
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MSELoss
```bibtex
@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}
```