---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1760
- loss:MultipleNegativesRankingLoss
base_model: WhereIsAI/UAE-Large-V1
widget:
- source_sentence: What is the relationship between the x- and y-coordinates in a linear relationship, and how can this relationship be represented visually on a graph?
  sentences:
  - '"A linear relationship is a relationship between variables such that when plotted on a coordinate plane, the points lie on a line." Additionally, "You can think of a line, then, as a collection of an infinite number of individual points that share the same mathematical relationship."'
  - '"A ''model'' is a situation-specific description of a phenomenon based on a theory, that allows us to make a specific prediction." and "In physics, it is particularly important to distinguish between these two terms. A model provides an immediate understanding of something based on a theory."'
  - '"Use capital letters to denote sets, $A,B, C, X, Y$ etc. [...] if you stick with these conventions people reading your work (including the person marking your exams) will know — ''Oh $A$ is that set they are talking about'' and ''$a$ is an element of that set.''"'
- source_sentence: What factors influence whether thin-film interference results in constructive or destructive interference?
  sentences:
  - '"For nonrelativistic velocities, an observer moving along at the same velocity as an Ohmic conductor measures the usual Ohm''s law in his reference frame, $\textbf{J}_{f}'' = \sigma \textbf{E}''$... the current density in all inertial frames is the same so that (3) in (4) gives us the generalized Ohm''s law as $\textbf{J}_{f}'' = \textbf{J}_{f} = \sigma (\textbf{E} + \textbf{v} \times \textbf{B})$ where v is the velocity of the conductor."'
  - '"Thin-film interference thus depends on film thickness, the wavelength of light, and the refractive indices."'
  - '"A summary of the properties of concave mirrors is shown below: • converging • real image • inverted • image in front of mirror. A summary of the properties of convex mirrors is shown below: • diverging • virtual image • upright • image behind mirror."'
- source_sentence: How do non-conservative forces affect the total energy change in a system undergoing an irreversible process?
  sentences:
  - '"Energy is conserved but some mechanical energy has been transferred into nonrecoverable energy $W_{\mathrm{nc}}$. We shall refer to processes in which there is non-zero nonrecoverable energy as irreversible processes."'
  - '"Hamilton’s equations give $2s$ first-order differential equations for $p_{k},q_{k}$ for each of the $s=n-m$ degrees of freedom. Lagrange’s equations give $s$ second-order differential equations for the $s$ independent generalized coordinates $q_{k},\dot{q}_{k}."'
  - '"Determine what happens as $\Delta x$ approaches 0."'
- source_sentence: What are the conditions under which a mutant virus is likely to replace a wildtype virus in a population, according to the SIR model of disease dynamics?
  sentences:
  - '"In the limit of high Reynolds number, viscosity disappears from the problem and the drag force should not depend on viscosity. This reasoning contains several subtle untruths, yet its conclusion is mostly correct. ... To make \( F \) independent of viscosity, \( F \) must be independent of Reynolds number!"'
  - '"A more mathematically rigorous name would be the renormalization monoid."'
  - '"I^{\prime}$ increases exponentially if $\frac{\beta^{\prime}(d+c+\gamma)}{\beta}-\left(d+c^{\prime}+\gamma^{\prime}\right)>0$ or after some elementary algebra, $\frac{\beta^{\prime}}{d+c^{\prime}+\gamma^{\prime}}>\frac{\beta}{d+c+\gamma}$." Additionally, "our result (4.6.8) suggests that endemic viruses (or other microorganisms) will tend to evolve (i) to be more easily transmitted between people $\left(\beta^{\prime}>\beta\right) ;$ (ii) to make people sick longer $\left(\gamma^{\prime}<\gamma\right)$, and; (iii) to be less deadly $c^{\prime}

- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("cyberbabooshka/uae_large_ft1")
# Run inference
sentences = [
    'What is the relationship between the smallest perturbation of a matrix and its rank, as established in theorems regarding matrix perturbations?',
    # The passage below is cut off in the source; it is kept as-is and only closed.
    '"Suppose $A \\in C^{m \\times n}$ has full column rank (= n). Then $\\min _{\\Delta \\in \\mathbb{C}^{m \\times n}}\\left\\{\\|\\Delta\\|_{2} \\mid A+\\Delta \\text { has rank }',
]
# Encode the sentences into 1024-dimensional embeddings
embeddings = model.encode(sentences)
print(embeddings.shape)

# Pairwise similarity scores between the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
```

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `eval`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.6143     |
| cosine_accuracy@3   | 0.7357     |
| cosine_accuracy@5   | 0.7833     |
| cosine_accuracy@10  | 0.8381     |
| cosine_precision@1  | 0.6143     |
| cosine_precision@3  | 0.2452     |
| cosine_precision@5  | 0.1567     |
| cosine_precision@10 | 0.0838     |
| cosine_recall@1     | 0.6143     |
| cosine_recall@3     | 0.7357     |
| cosine_recall@5     | 0.7833     |
| cosine_recall@10    | 0.8381     |
| **cosine_ndcg@10**  | **0.7235** |
| cosine_mrr@10       | 0.6871     |
| cosine_map@100      | 0.6925     |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 1,760 training samples
* Columns: anchor and positive
* Approximate statistics based on the first 1000 samples:

|         | anchor | positive |
|:--------|:-------|:---------|
| type    | string | string   |
| details |        |          |

* Samples:

| anchor | positive |
|:-------|:---------|
| How is a proper coloring of a graph defined in the context of vertices and edges? | "A coloring is called proper if for each edge joining two distinct vertices, the two vertices it joins have different colors." |
| What is the relationship between the first excited state of the box model and the p orbitals in a hydrogen atom? | "The p orbitals are similar to the first excited state of the box, i.e. $(n_{x},n_{y},n_{z})=(2,1,1)$ is similar to a $p_{x}$ orbital, $(n_{x},n_{y},n_{z})=(1,2,1)$ is similar to a $p_{y}$ orbital and $(n_{x},n_{y},n_{z})=(1,1,2)$ is similar to a $p_{z}$ orbital." |
| How can the behavior of the derivative \( f'(x) \) indicate the presence of a local maximum or minimum at a critical point \( x=a \)? | "If there is a local maximum when \( x=a \), the function must be lower near \( x=a \) than it is right at \( x=a \). If the derivative exists near \( x=a \), this means \( f'(x)>0 \) when \( x \) is near \( a \) and \( x < a \), because the function must 'slope up' just to the left of \( a \). Similarly, \( f'(x) < 0 \) when \( x \) is near \( a \) and \( x>a \), because \( f \) slopes down from the local maximum as we move to the right. Using the same reasoning, if there is a local minimum at \( x=a \), the derivative of \( f \) must be negative just to the left of \( a \) and positive just to the right." |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```

### Evaluation Dataset

#### Unnamed Dataset

* Size: 420 evaluation samples
* Columns: anchor and positive
* Approximate statistics based on the first 420 samples:

|         | anchor | positive |
|:--------|:-------|:---------|
| type    | string | string   |
| details |        |          |

* Samples:

| anchor | positive |
|:-------|:---------|
| What are the two central classes mentioned in the FileSystem framework and what do they represent? | "The class `FileReference` is the most important entry point to the framework." and "FileSystem is a powerful and elegant library to manipulate files." |
| What is the significance of Turing's work in the context of PDE-based models for self-organization of complex systems? | "Turing’s monumental work on the chemical basis of morphogenesis played an important role in igniting researchers’ attention to the PDE-based continuous field models as a mathematical framework to study self-organization of complex systems." |
| What are the two options for reducing accelerations as discussed in the passage? | "From the above definitions we see that there are really two options for reducing accelerations. We can reduce the amount that velocity changes, or we can increase the time over which the velocity changes (or both)." |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `weight_decay`: 0.05
- `num_train_epochs`: 10
- `warmup_ratio`: 0.1
- `fp16`: True
- `eval_on_start`: True

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.05
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 10
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: True
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

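
The dataset size, batch size, epoch count, and warmup ratio listed above imply a concrete step schedule. The sketch below works through that arithmetic, assuming a single device and the listed `gradient_accumulation_steps` of 1. The helper name `linear_schedule_lr` is hypothetical, and rounding the warmup step count to the nearest integer is an assumption, not necessarily the trainer's exact rule.

```python
import math

# Values taken from this card.
num_train_samples = 1760
per_device_train_batch_size = 16
num_train_epochs = 10
warmup_ratio = 0.1
peak_learning_rate = 2e-05

# Assumption: one device and gradient_accumulation_steps = 1.
steps_per_epoch = math.ceil(num_train_samples / per_device_train_batch_size)
total_steps = steps_per_epoch * num_train_epochs
# Assumption: warmup steps are rounded to the nearest integer.
warmup_steps = round(total_steps * warmup_ratio)


def linear_schedule_lr(step: int) -> float:
    """Hypothetical helper: linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return peak_learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_learning_rate * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
print(linear_schedule_lr(warmup_steps // 2), linear_schedule_lr(total_steps))
```

Under these assumptions one epoch spans 110 optimizer steps, which agrees with the training log reaching epoch 1.0 at step 110.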
### Training Logs

| Epoch | Step | Training Loss | Validation Loss | eval_cosine_ndcg@10 |
|:------:|:----:|:-------------:|:---------------:|:-------------------:|
| 0 | 0 | - | 0.0971 | 0.6824 |
| 0.0091 | 1 | 0.1198 | - | - |
| 0.0182 | 2 | 0.0787 | - | - |
| 0.0273 | 3 | 0.0614 | - | - |
| 0.0364 | 4 | 0.138 | - | - |
| 0.0455 | 5 | 0.1204 | - | - |
| 0.0545 | 6 | 0.1885 | - | - |
| 0.0636 | 7 | 0.0475 | - | - |
| 0.0727 | 8 | 0.1358 | - | - |
| 0.0818 | 9 | 0.1666 | - | - |
| 0.0909 | 10 | 0.0737 | - | - |
| 0.1 | 11 | 0.0997 | - | - |
| 0.1091 | 12 | 0.0795 | - | - |
| 0.1182 | 13 | 0.1071 | - | - |
| 0.1273 | 14 | 0.1224 | - | - |
| 0.1364 | 15 | 0.0499 | - | - |
| 0.1455 | 16 | 0.0806 | - | - |
| 0.1545 | 17 | 0.0353 | - | - |
| 0.1636 | 18 | 0.0542 | - | - |
| 0.1727 | 19 | 0.0412 | - | - |
| 0.1818 | 20 | 0.1375 | - | - |
| 0.1909 | 21 | 0.1124 | - | - |
| 0.2 | 22 | 0.0992 | - | - |
| 0.2091 | 23 | 0.0285 | - | - |
| 0.2182 | 24 | 0.0337 | - | - |
| 0.2273 | 25 | 0.0737 | - | - |
| 0.2364 | 26 | 0.2011 | - | - |
| 0.2455 | 27 | 0.0241 | - | - |
| 0.2545 | 28 | 0.1319 | - | - |
| 0.2636 | 29 | 0.0104 | - | - |
| 0.2727 | 30 | 0.0162 | - | - |
| 0.2818 | 31 | 0.3061 | - | - |
| 0.2909 | 32 | 0.0422 | - | - |
| 0.3 | 33 | 0.1893 | - | - |
| 0.3091 | 34 | 0.0207 | - | - |
| 0.3182 | 35 | 0.0744 | - | - |
| 0.3273 | 36 | 0.0246 | - | - |
| 0.3364 | 37 | 0.0079 | - | - |
| 0.3455 | 38 | 0.0256 | - | - |
| 0.3545 | 39 | 0.0224 | - | - |
| 0.3636 | 40 | 0.0151 | - | - |
| 0.3727 | 41 | 0.0738 | - | - |
| 0.3818 | 42 | 0.0239 | - | - |
| 0.3909 | 43 | 0.0169 | - | - |
| 0.4 | 44 | 0.0152 | - | - |
| 0.4091 | 45 | 0.0244 | - | - |
| 0.4182 | 46 | 0.1708 | - | - |
| 0.4273 | 47 | 0.0146 | - | - |
| 0.4364 | 48 | 0.1367 | - | - |
| 0.4455 | 49 | 0.049 | - | - |
| 0.4545 | 50 | 0.0211 | - | - |
| 0.4636 | 51 | 0.0135 | - | - |
| 0.4727 | 52 | 0.0668 | - | - |
| 0.4818 | 53 | 0.087 | - | - |
| 0.4909 | 54 | 0.0046 | - | - |
| 0.5 | 55 | 0.0032 | - | - |
| 0.5091 | 56 | 0.0133 | - | - |
| 0.5182 | 57 | 0.0109 | - | - |
| 0.5273 | 58 | 0.0396 | - | - |
| 0.5364 | 59 | 0.0291 | - | - |
| 0.5455 | 60 | 0.0299 | - | - |
| 0.5545 | 61 | 0.0134 | - | - |
| 0.5636 | 62 | 0.0135 | - | - |
| 0.5727 | 63 | 0.0049 | - | - |
| 0.5818 | 64 | 0.0199 | - | - |
| 0.5909 | 65 | 0.1533 | - | - |
| 0.6 | 66 | 0.3639 | - | - |
| 0.6091 | 67 | 0.0652 | - | - |
| 0.6182 | 68 | 0.0315 | - | - |
| 0.6273 | 69 | 0.0403 | - | - |
| 0.6364 | 70 | 0.011 | - | - |
| 0.6455 | 71 | 0.0265 | - | - |
| 0.6545 | 72 | 0.1146 | - | - |
| 0.6636 | 73 | 0.0932 | - | - |
| 0.6727 | 74 | 0.0234 | - | - |
| 0.6818 | 75 | 0.0581 | - | - |
| 0.6909 | 76 | 0.0132 | - | - |
| 0.7 | 77 | 0.1183 | - | - |
| 0.7091 | 78 | 0.0913 | - | - |
| 0.7182 | 79 | 0.0262 | - | - |
| 0.7273 | 80 | 0.0262 | - | - |
| 0.7364 | 81 | 0.0159 | - | - |
| 0.7455 | 82 | 0.0407 | - | - |
| 0.7545 | 83 | 0.0294 | - | - |
| 0.7636 | 84 | 0.0567 | - | - |
| 0.7727 | 85 | 0.0959 | - | - |
| 0.7818 | 86 | 0.033 | - | - |
| 0.7909 | 87 | 0.0234 | - | - |
| 0.8 | 88 | 0.0088 | - | - |
| 0.8091 | 89 | 0.0249 | - | - |
| 0.8182 | 90 | 0.0276 | - | - |
| 0.8273 | 91 | 0.0936 | - | - |
| 0.8364 | 92 | 0.0067 | - | - |
| 0.8455 | 93 | 0.0064 | - | - |
| 0.8545 | 94 | 0.0654 | - | - |
| 0.8636 | 95 | 0.0048 | - | - |
| 0.8727 | 96 | 0.0087 | - | - |
| 0.8818 | 97 | 0.0115 | - | - |
| 0.8909 | 98 | 0.0092 | - | - |
| 0.9 | 99 | 0.0514 | - | - |
| 0.9091 | 100 | 0.1856 | - | - |
| 0.9182 | 101 | 0.0364 | - | - |
| 0.9273 | 102 | 0.0455 | - | - |
| 0.9364 | 103 | 0.0057 | - | - |
| 0.9455 | 104 | 0.0038 | - | - |
| 0.9545 | 105 | 0.0209 | - | - |
| 0.9636 | 106 | 0.0247 | - | - |
| 0.9727 | 107 | 0.0735 | - | - |
| 0.9818 | 108 | 0.004 | - | - |
| 0.9909 | 109 | 0.0174 | - | - |
| 1.0 | 110 | 0.018 | 0.0282 | 0.7093 |
| 1.0091 | 111 | 0.0187 | - | - |
| 1.0182 | 112 | 0.0116 | - | - |
| 1.0273 | 113 | 0.0043 | - | - |
| 1.0364 | 114 | 0.0059 | - | - |
| 1.0455 | 115 | 0.0067 | - | - |
| 1.0545 | 116 | 0.0093 | - | - |
| 1.0636 | 117 | 0.0821 | - | - |
| 1.0727 | 118 | 0.0097 | - | - |
| 1.0818 | 119 | 0.0141 | - | - |
| 1.0909 | 120 | 0.0202 | - | - |
| 1.1 | 121 | 0.0034 | - | - |
| 1.1091 | 122 | 0.0025 | - | - |
| 1.1182 | 123 | 0.006 | - | - |
| 1.1273 | 124 | 0.004 | - | - |
| 1.1364 | 125 | 0.003 | - | - |
| 1.1455 | 126 | 0.0399 | - | - |
| 1.1545 | 127 | 0.0026 | - | - |
| 1.1636 | 128 | 0.0043 | - | - |
| 1.1727 | 129 | 0.1317 | - | - |
| 1.1818 | 130 | 0.0024 | - | - |
| 1.1909 | 131 | 0.0027 | - | - |
| 1.2 | 132 | 0.076 | - | - |
| 1.2091 | 133 | 0.0302 | - | - |
| 1.2182 | 134 | 0.0026 | - | - |
| 1.2273 | 135 | 0.1611 | - | - |
| 1.2364 | 136 | 0.0413 | - | - |
| 1.2455 | 137 | 0.0118 | - | - |
| 1.2545 | 138 | 0.0042 | - | - |
| 1.2636 | 139 | 0.0401 | - | - |
| 1.2727 | 140 | 0.0036 | - | - |
| 1.2818 | 141 | 0.0034 | - | - |
| 1.2909 | 142 | 0.0026 | - | - |
| 1.3 | 143 | 0.0044 | - | - |
| 1.3091 | 144 | 0.0024 | - | - |
| 1.3182 | 145 | 0.0036 | - | - |
| 1.3273 | 146 | 0.0242 | - | - |
| 1.3364 | 147 | 0.0015 | - | - |
| 1.3455 | 148 | 0.1008 | - | - |
| 1.3545 | 149 | 0.0057 | - | - |
| 1.3636 | 150 | 0.0062 | - | - |
| 1.3727 | 151 | 0.0048 | - | - |
| 1.3818 | 152 | 0.0026 | - | - |
| 1.3909 | 153 | 0.0045 | - | - |
| 1.4 | 154 | 0.0139 | - | - |
| 1.4091 | 155 | 0.0017 | - | - |
| 1.4182 | 156 | 0.0012 | - | - |
| 1.4273 | 157 | 0.0009 | - | - |
| 1.4364 | 158 | 0.006 | - | - |
| 1.4455 | 159 | 0.0618 | - | - |
| 1.4545 | 160 | 0.0889 | - | - |
| 1.4636 | 161 | 0.0034 | - | - |
| 1.4727 | 162 | 0.0184 | - | - |
| 1.4818 | 163 | 0.0035 | - | - |
| 1.4909 | 164 | 0.002 | - | - |
| 1.5 | 165 | 0.0115 | - | - |
| 1.5091 | 166 | 0.0008 | - | - |
| 1.5182 | 167 | 0.0113 | - | - |
| 1.5273 | 168 | 0.01 | - | - |
| 1.5364 | 169 | 0.0177 | - | - |
| 1.5455 | 170 | 0.0059 | - | - |
| 1.5545 | 171 | 0.0123 | - | - |
| 1.5636 | 172 | 0.0103 | - | - |
| 1.5727 | 173 | 0.008 | - | - |
| 1.5818 | 174 | 0.002 | - | - |
| 1.5909 | 175 | 0.0039 | - | - |
| 1.6 | 176 | 0.0174 | - | - |
| 1.6091 | 177 | 0.0191 | - | - |
| 1.6182 | 178 | 0.002 | - | - |
| 1.6273 | 179 | 0.0009 | - | - |
| 1.6364 | 180 | 0.0021 | - | - |
| 1.6455 | 181 | 0.0011 | - | - |
| 1.6545 | 182 | 0.0027 | - | - |
| 1.6636 | 183 | 0.0005 | - | - |
| 1.6727 | 184 | 0.0026 | - | - |
| 1.6818 | 185 | 0.0047 | - | - |
| 1.6909 | 186 | 0.0033 | - | - |
| 1.7 | 187 | 0.0402 | - | - |
| 1.7091 | 188 | 0.0128 | - | - |
| 1.7182 | 189 | 0.01 | - | - |
| 1.7273 | 190 | 0.0057 | - | - |
| 1.7364 | 191 | 0.0133 | - | - |
| 1.7455 | 192 | 0.0099 | - | - |
| 1.7545 | 193 | 0.1022 | - | - |
| 1.7636 | 194 | 0.0223 | - | - |
| 1.7727 | 195 | 0.0037 | - | - |
| 1.7818 | 196 | 0.0073 | - | - |
| 1.7909 | 197 | 0.0212 | - | - |
| 1.8 | 198 | 0.0231 | - | - |
| 1.8091 | 199 | 0.0016 | - | - |
| 1.8182 | 200 | 0.0017 | - | - |
| 1.8273 | 201 | 0.0035 | - | - |
| 1.8364 | 202 | 0.0165 | - | - |
| 1.8455 | 203 | 0.0131 | - | - |
| 1.8545 | 204 | 0.0032 | - | - |
| 1.8636 | 205 | 0.0075 | - | - |
| 1.8727 | 206 | 0.0438 | - | - |
| 1.8818 | 207 | 0.0022 | - | - |
| 1.8909 | 208 | 0.0501 | - | - |
| 1.9 | 209 | 0.0121 | - | - |
| 1.9091 | 210 | 0.0036 | - | - |
| 1.9182 | 211 | 0.0041 | - | - |
| 1.9273 | 212 | 0.0048 | - | - |
| 1.9364 | 213 | 0.0159 | - | - |
| 1.9455 | 214 | 0.0036 | - | - |
| 1.9545 | 215 | 0.0035 | - | - |
| 1.9636 | 216 | 0.004 | - | - |
| 1.9727 | 217 | 0.0039 | - | - |
| 1.9818 | 218 | 0.0177 | - | - |
| 1.9909 | 219 | 0.0042 | - | - |
| 2.0 | 220 | 0.0044 | 0.0230 | 0.7225 |
| 2.0091 | 221 | 0.0339 | - | - |
| 2.0182 | 222 | 0.0032 | - | - |
| 2.0273 | 223 | 0.0133 | - | - |
| 2.0364 | 224 | 0.0031 | - | - |
| 2.0455 | 225 | 0.0025 | - | - |
| 2.0545 | 226 | 0.0039 | - | - |
| 2.0636 | 227 | 0.0011 | - | - |
| 2.0727 | 228 | 0.0021 | - | - |
| 2.0818 | 229 | 0.0591 | - | - |
| 2.0909 | 230 | 0.0011 | - | - |
| 2.1 | 231 | 0.0008 | - | - |
| 2.1091 | 232 | 0.0014 | - | - |
| 2.1182 | 233 | 0.0057 | - | - |
| 2.1273 | 234 | 0.0044 | - | - |
| 2.1364 | 235 | 0.001 | - | - |
| 2.1455 | 236 | 0.0009 | - | - |
| 2.1545 | 237 | 0.0028 | - | - |
| 2.1636 | 238 | 0.0076 | - | - |
| 2.1727 | 239 | 0.0018 | - | - |
| 2.1818 | 240 | 0.0022 | - | - |
| 2.1909 | 241 | 0.0029 | - | - |
| 2.2 | 242 | 0.0004 | - | - |
| 2.2091 | 243 | 0.0025 | - | - |
| 2.2182 | 244 | 0.0013 | - | - |
| 2.2273 | 245 | 0.0487 | - | - |
| 2.2364 | 246 | 0.0016 | - | - |
| 2.2455 | 247 | 0.0023 | - | - |
| 2.2545 | 248 | 0.0038 | - | - |
| 2.2636 | 249 | 0.003 | - | - |
| 2.2727 | 250 | 0.0017 | - | - |
| 2.2818 | 251 | 0.0056 | - | - |
| 2.2909 | 252 | 0.0036 | - | - |
| 2.3 | 253 | 0.0016 | - | - |
| 2.3091 | 254 | 0.0021 | - | - |
| 2.3182 | 255 | 0.0019 | - | - |
| 2.3273 | 256 | 0.001 | - | - |
| 2.3364 | 257 | 0.0017 | - | - |
| 2.3455 | 258 | 0.0027 | - | - |
| 2.3545 | 259 | 0.0039 | - | - |
| 2.3636 | 260 | 0.0011 | - | - |
| 2.3727 | 261 | 0.0248 | - | - |
| 2.3818 | 262 | 0.0219 | - | - |
| 2.3909 | 263 | 0.0015 | - | - |
| 2.4 | 264 | 0.0009 | - | - |
| 2.4091 | 265 | 0.0013 | - | - |
| 2.4182 | 266 | 0.0049 | - | - |
| 2.4273 | 267 | 0.0073 | - | - |
| 2.4364 | 268 | 0.007 | - | - |
| 2.4455 | 269 | 0.0024 | - | - |
| 2.4545 | 270 | 0.0008 | - | - |
| 2.4636 | 271 | 0.001 | - | - |
| 2.4727 | 272 | 0.0016 | - | - |
| 2.4818 | 273 | 0.0007 | - | - |
| 2.4909 | 274 | 0.0091 | - | - |
| 2.5 | 275 | 0.0127 | - | - |
| 2.5091 | 276 | 0.0013 | - | - |
| 2.5182 | 277 | 0.001 | - | - |
| 2.5273 | 278 | 0.0006 | - | - |
| 2.5364 | 279 | 0.005 | - | - |
| 2.5455 | 280 | 0.0154 | - | - |
| 2.5545 | 281 | 0.0015 | - | - |
| 2.5636 | 282 | 0.0229 | - | - |
| 2.5727 | 283 | 0.0026 | - | - |
| 2.5818 | 284 | 0.0008 | - | - |
| 2.5909 | 285 | 0.0024 | - | - |
| 2.6 | 286 | 0.0012 | - | - |
| 2.6091 | 287 | 0.0748 | - | - |
| 2.6182 | 288 | 0.0086 | - | - |
| 2.6273 | 289 | 0.0013 | - | - |
| 2.6364 | 290 | 0.0089 | - | - |
| 2.6455 | 291 | 0.0011 | - | - |
| 2.6545 | 292 | 0.0096 | - | - |
| 2.6636 | 293 | 0.1416 | - | - |
| 2.6727 | 294 | 0.0005 | - | - |
| 2.6818 | 295 | 0.0021 | - | - |
| 2.6909 | 296 | 0.0014 | - | - |
| 2.7 | 297 | 0.0097 | - | - |
| 2.7091 | 298 | 0.0014 | - | - |
| 2.7182 | 299 | 0.0009 | - | - |
| 2.7273 | 300 | 0.0016 | - | - |
| 2.7364 | 301 | 0.0166 | - | - |
| 2.7455 | 302 | 0.0028 | - | - |
| 2.7545 | 303 | 0.0014 | - | - |
| 2.7636 | 304 | 0.0018 | - | - |
| 2.7727 | 305 | 0.0059 | - | - |
| 2.7818 | 306 | 0.0012 | - | - |
| 2.7909 | 307 | 0.0008 | - | - |
| 2.8 | 308 | 0.0007 | - | - |
| 2.8091 | 309 | 0.0038 | - | - |
| 2.8182 | 310 | 0.0012 | - | - |
| 2.8273 | 311 | 0.0091 | - | - |
| 2.8364 | 312 | 0.0111 | - | - |
| 2.8455 | 313 | 0.0016 | - | - |
| 2.8545 | 314 | 0.0089 | - | - |
| 2.8636 | 315 | 0.0071 | - | - |
| 2.8727 | 316 | 0.0012 | - | - |
| 2.8818 | 317 | 0.0251 | - | - |
| 2.8909 | 318 | 0.0017 | - | - |
| 2.9 | 319 | 0.0006 | - | - |
| 2.9091 | 320 | 0.0014 | - | - |
| 2.9182 | 321 | 0.0011 | - | - |
| 2.9273 | 322 | 0.0084 | - | - |
| 2.9364 | 323 | 0.0055 | - | - |
| 2.9455 | 324 | 0.0011 | - | - |
| 2.9545 | 325 | 0.0017 | - | - |
| 2.9636 | 326 | 0.0008 | - | - |
| 2.9727 | 327 | 0.0082 | - | - |
| 2.9818 | 328 | 0.0006 | - | - |
| 2.9909 | 329 | 0.0008 | - | - |
| 3.0 | 330 | 0.0022 | 0.0275 | 0.6950 |
| 3.0091 | 331 | 0.0007 | - | - |
| 3.0182 | 332 | 0.0012 | - | - |
| 3.0273 | 333 | 0.0007 | - | - |
| 3.0364 | 334 | 0.0038 | - | - |
| 3.0455 | 335 | 0.0006 | - | - |
| 3.0545 | 336 | 0.0012 | - | - |
| 3.0636 | 337 | 0.0873 | - | - |
| 3.0727 | 338 | 0.0022 | - | - |
| 3.0818 | 339 | 0.0004 | - | - |
| 3.0909 | 340 | 0.001 | - | - |
| 3.1 | 341 | 0.0002 | - | - |
| 3.1091 | 342 | 0.0069 | - | - |
| 3.1182 | 343 | 0.0009 | - | - |
| 3.1273 | 344 | 0.0101 | - | - |
| 3.1364 | 345 | 0.0022 | - | - |
| 3.1455 | 346 | 0.009 | - | - |
| 3.1545 | 347 | 0.0018 | - | - |
| 3.1636 | 348 | 0.0018 | - | - |
| 3.1727 | 349 | 0.0045 | - | - |
| 3.1818 | 350 | 0.029 | - | - |
| 3.1909 | 351 | 0.0036 | - | - |
| 3.2 | 352 | 0.0015 | - | - |
| 3.2091 | 353 | 0.0021 | - | - |
| 3.2182 | 354 | 0.0103 | - | - |
| 3.2273 | 355 | 0.0005 | - | - |
| 3.2364 | 356 | 0.0133 | - | - |
| 3.2455 | 357 | 0.0015 | - | - |
| 3.2545 | 358 | 0.001 | - | - |
| 3.2636 | 359 | 0.0024 | - | - |
| 3.2727 | 360 | 0.0052 | - | - |
| 3.2818 | 361 | 0.0032 | - | - |
| 3.2909 | 362 | 0.0024 | - | - |
| 3.3 | 363 | 0.0008 | - | - |
| 3.3091 | 364 | 0.0035 | - | - |
| 3.3182 | 365 | 0.0012 | - | - |
| 3.3273 | 366 | 0.0049 | - | - |
| 3.3364 | 367 | 0.0452 | - | - |
| 3.3455 | 368 | 0.0017 | - | - |
| 3.3545 | 369 | 0.0112 | - | - |
| 3.3636 | 370 | 0.0011 | - | - |
| 3.3727 | 371 | 0.0016 | - | - |
| 3.3818 | 372 | 0.0015 | - | - |
| 3.3909 | 373 | 0.004 | - | - |
| 3.4 | 374 | 0.0074 | - | - |
| 3.4091 | 375 | 0.0005 | - | - |
| 3.4182 | 376 | 0.0007 | - | - |
| 3.4273 | 377 | 0.0014 | - | - |
| 3.4364 | 378 | 0.0097 | - | - |
| 3.4455 | 379 | 0.0026 | - | - |
| 3.4545 | 380 | 0.0022 | - | - |
| 3.4636 | 381 | 0.001 | - | - |
| 3.4727 | 382 | 0.0004 | - | - |
| 3.4818 | 383 | 0.004 | - | - |
| 3.4909 | 384 | 0.0017 | - | - |
| 3.5 | 385 | 0.0014 | - | - |
| 3.5091 | 386 | 0.001 | - | - |
| 3.5182 | 387 | 0.0047 | - | - |
| 3.5273 | 388 | 0.0061 | - | - |
| 3.5364 | 389 | 0.0017 | - | - |
| 3.5455 | 390 | 0.0024 | - | - |
| 3.5545 | 391 | 0.0021 | - | - |
| 3.5636 | 392 | 0.0007 | - | - |
| 3.5727 | 393 | 0.0009 | - | - |
| 3.5818 | 394 | 0.0006 | - | - |
| 3.5909 | 395 | 0.0038 | - | - |
| 3.6 | 396 | 0.0006 | - | - |
| 3.6091 | 397 | 0.0011 | - | - |
| 3.6182 | 398 | 0.001 | - | - |
| 3.6273 | 399 | 0.0014 | - | - |
| 3.6364 | 400 | 0.0007 | - | - |
| 3.6455 | 401 | 0.0052 | - | - |
| 3.6545 | 402 | 0.0008 | - | - |
| 3.6636 | 403 | 0.0009 | - | - |
| 3.6727 | 404 | 0.0017 | - | - |
| 3.6818 | 405 | 0.0028 | - | - |
| 3.6909 | 406 | 0.0044 | - | - |
| 3.7 | 407 | 0.0009 | - | - |
| 3.7091 | 408 | 0.0134 | - | - |
| 3.7182 | 409 | 0.001 | - | - |
| 3.7273 | 410 | 0.0044 | - | - |
| 3.7364 | 411 | 0.0138 | - | - |
| 3.7455 | 412 | 0.0032 | - | - |
| 3.7545 | 413 | 0.0004 | - | - |
| 3.7636 | 414 | 0.0065 | - | - |
| 3.7727 | 415 | 0.0007 | - | - |
| 3.7818 | 416 | 0.0008 | - | - |
| 3.7909 | 417 | 0.0007 | - | - |
| 3.8 | 418 | 0.0018 | - | - |
| 3.8091 | 419 | 0.001 | - | - |
| 3.8182 | 420 | 0.0305 | - | - |
| 3.8273 | 421 | 0.001 | - | - |
| 3.8364 | 422 | 0.0011 | - | - |
| 3.8455 | 423 | 0.0004 | - | - |
| 3.8545 | 424 | 0.003 | - | - |
| 3.8636 | 425 | 0.002 | - | - |
| 3.8727 | 426 | 0.0018 | - | - |
| 3.8818 | 427 | 0.0968 | - | - |
| 3.8909 | 428 | 0.002 | - | - |
| 3.9 | 429 | 0.002 | - | - |
| 3.9091 | 430 | 0.0156 | - | - |
| 3.9182 | 431 | 0.0059 | - | - |
| 3.9273 | 432 | 0.001 | - | - |
| 3.9364 | 433 | 0.0153 | - | - |
| 3.9455 | 434 | 0.0013 | - | - |
| 3.9545 | 435 | 0.0003 | - | - |
| 3.9636 | 436 | 0.001 | - | - |
| 3.9727 | 437 | 0.0005 | - | - |
| 3.9818 | 438 | 0.0012 | - | - |
| 3.9909 | 439 | 0.0109 | - | - |
| 4.0 | 440 | 0.1597 | 0.0211 | 0.7235 |
| 4.0091 | 441 | 0.0027 | - | - |
| 4.0182 | 442 | 0.0007 | - | - |
| 4.0273 | 443 | 0.0089 | - | - |
| 4.0364 | 444 | 0.0007 | - | - |
| 4.0455 | 445 | 0.005 | - | - |
| 4.0545 | 446 | 0.0019 | - | - |
| 4.0636 | 447 | 0.0007 | - | - |
| 4.0727 | 448 | 0.0008 | - | - |
| 4.0818 | 449 | 0.002 | - | - |
| 4.0909 | 450 | 0.043 | - | - |
| 4.1 | 451 | 0.0273 | - | - |
| 4.1091 | 452 | 0.0009 | - | - |
| 4.1182 | 453 | 0.0011 | - | - |
| 4.1273 | 454 | 0.0007 | - | - |
| 4.1364 | 455 | 0.0062 | - | - |
| 4.1455 | 456 | 0.0004 | - | - |
| 4.1545 | 457 | 0.0008 | - | - |
| 4.1636 | 458 | 0.0128 | - | - |
| 4.1727 | 459 | 0.0012 | - | - |
| 4.1818 | 460 | 0.0013 | - | - |
| 4.1909 | 461 | 0.0009 | - | - |
| 4.2 | 462 | 0.0011 | - | - |
| 4.2091 | 463 | 0.0336 | - | - |
| 4.2182 | 464 | 0.0018 | - | - |
| 4.2273 | 465 | 0.0009 | - | - |
| 4.2364 | 466 | 0.0049 | - | - |
| 4.2455 | 467 | 0.0012 | - | - |
| 4.2545 | 468 | 0.001 | - | - |
| 4.2636 | 469 | 0.0024 | - | - |
| 4.2727 | 470 | 0.0063 | - | - |
| 4.2818 | 471 | 0.0008 | - | - |
| 4.2909 | 472 | 0.0793 | - | - |
| 4.3 | 473 | 0.0016 | - | - |
| 4.3091 | 474 | 0.0016 | - | - |
| 4.3182 | 475 | 0.0043 | - | - |
| 4.3273 | 476 | 0.036 | - | - |
| 4.3364 | 477 | 0.002 | - | - |
| 4.3455 | 478 | 0.0019 | - | - |
| 4.3545 | 479 | 0.0012 | - | - |
| 4.3636 | 480 | 0.0059 | - | - |
| 4.3727 | 481 | 0.0017 | - | - |
| 4.3818 | 482 | 0.0004 | - | - |
| 4.3909 | 483 | 0.0014 | - | - |
| 4.4 | 484 | 0.0143 | - | - |
| 4.4091 | 485 | 0.0014 | - | - |
| 4.4182 | 486 | 0.0009 | - | - |
| 4.4273 | 487 | 0.0027 | - | - |
| 4.4364 | 488 | 0.0017 | - | - |
| 4.4455 | 489 | 0.0007 | - | - |
| 4.4545 | 490 | 0.0008 | - | - |
| 4.4636 | 491 | 0.0008 | - | - |
| 4.4727 | 492 | 0.0014 | - | - |
| 4.4818 | 493 | 0.0011 | - | - |
| 4.4909 | 494 | 0.0013 | - | - |
| 4.5 | 495 | 0.0016 | - | - |
| 4.5091 | 496 | 0.001 | - | - |
| 4.5182 | 497 | 0.0008 | - | - |
| 4.5273 | 498 | 0.001 | - | - |
| 4.5364 | 499 | 0.0019 | - | - |
| 4.5455 | 500 | 0.0008 | - | - |

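
The Training Loss column above comes from MultipleNegativesRankingLoss with `scale` 20.0 and `cos_sim`, as listed in the dataset sections. As a rough illustration of what that objective measures, the pure-Python sketch below follows the usual in-batch-negatives formulation: each anchor is scored against every positive in the batch by scaled cosine similarity, and cross-entropy is taken with the matching positive as the target. The function names `cos_sim` and `mnr_loss` and the toy 2-D vectors are hypothetical; this is not the library's implementation.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the row of logits.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): two anchors and their candidate positives.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive points toward its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive points toward the other anchor

print(mnr_loss(anchors, aligned))
print(mnr_loss(anchors, swapped))
```

In this sketch the loss is near zero when each anchor is closest to its own positive and grows when another passage in the batch is closer, which is why larger batches supply more negatives per step.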
### Framework Versions

- Python: 3.12.9
- Sentence Transformers: 4.1.0
- Transformers: 4.52.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```