---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1760
- loss:MultipleNegativesRankingLoss
base_model: WhereIsAI/UAE-Large-V1
widget:
- source_sentence: >-
What is the relationship between the x- and y-coordinates in a linear
relationship, and how can this relationship be represented visually on a
graph?
sentences:
- >-
"A linear relationship is a relationship between variables such that
when plotted on a coordinate plane, the points lie on a line."
Additionally, "You can think of a line, then, as a collection of an
infinite number of individual points that share the same mathematical
relationship."
- >-
"A 'model' is a situation-specific description of a phenomenon based on
a theory, that allows us to make a specific prediction." and "In
physics, it is particularly important to distinguish between these two
terms. A model provides an immediate understanding of something based on
a theory."
- >-
"Use capital letters to denote sets, $A,B, C, X, Y$ etc. [...] if you
stick with these conventions people reading your work (including the
person marking your exams) will know — 'Oh $A$ is that set they are
talking about' and '$a$ is an element of that set.'"
- source_sentence: >-
What factors influence whether thin-film interference results in
constructive or destructive interference?
sentences:
- >-
"For nonrelativistic velocities, an observer moving along at the same
velocity as an Ohmic conductor measures the usual Ohm's law in his
reference frame, $\textbf{J}_{f}' = \sigma \textbf{E}'$... the current
density in all inertial frames is the same so that (3) in (4) gives us
the generalized Ohm's law as $\textbf{J}_{f}' = \textbf{J}_{f} = \sigma
(\textbf{E} + \textbf{v} \times \textbf{B})$ where v is the velocity of
the conductor."
- >-
"Thin-film interference thus depends on film thickness, the wavelength
of light, and the refractive indices."
- >-
"A summary of the properties of concave mirrors is shown below: •
converging • real image • inverted • image in front of mirror. A summary
of the properties of convex mirrors is shown below: • diverging •
virtual image • upright • image behind mirror."
- source_sentence: >-
How do non-conservative forces affect the total energy change in a system
undergoing an irreversible process?
sentences:
- >-
"Energy is conserved but some mechanical energy has been transferred
into nonrecoverable energy $W_{\mathrm{nc}}$. We shall refer to
processes in which there is non-zero nonrecoverable energy as
irreversible processes."
- >-
"Hamilton’s equations give $2s$ first-order differential equations for
$p_{k},q_{k}$ for each of the $s=n-m$ degrees of freedom. Lagrange’s
equations give $s$ second-order differential equations for the $s$
independent generalized coordinates $q_{k},\dot{q}_{k}$."
- '"Determine what happens as $\Delta x$ approaches 0."'
- source_sentence: >-
What are the conditions under which a mutant virus is likely to replace a
wildtype virus in a population, according to the SIR model of disease
dynamics?
sentences:
- >-
"In the limit of high Reynolds number, viscosity disappears from the
problem and the drag force should not depend on viscosity. This
reasoning contains several subtle untruths, yet its conclusion is mostly
correct. ... To make \( F \) independent of viscosity, \( F \) must be
independent of Reynolds number!"
- >-
"A more mathematically rigorous name would be the renormalization
monoid."
- >-
"I^{\prime}$ increases exponentially if
$\frac{\beta^{\prime}(d+c+\gamma)}{\beta}-\left(d+c^{\prime}+\gamma^{\prime}\right)>0$
or after some elementary algebra,
$\frac{\beta^{\prime}}{d+c^{\prime}+\gamma^{\prime}}>\frac{\beta}{d+c+\gamma}$."
Additionally, "our result (4.6.8) suggests that endemic viruses (or
other microorganisms) will tend to evolve (i) to be more easily
transmitted between people $\left(\beta^{\prime}>\beta\right) ;$ (ii) to
make people sick longer $\left(\gamma^{\prime}<\gamma\right)$, and;
(iii) to be less deadly $c^{\prime}<c$."
- source_sentence: >-
What is the relationship between the smallest perturbation of a matrix and
its rank, as established in theorems regarding matrix perturbations?
sentences:
- >-
"Suppose $A \in C^{m \times n}$ has full column rank (= n). Then $\min
_{\Delta \in \mathbb{C}^{m \times n}}\left\{\|\Delta\|_{2} \mid A+\Delta
\text { has rank }<n\right\}=\sigma_{n}(A)$."
- '"Complementary angles have measures that add up to 90 degrees."'
- >-
"If a beam of light enters and then exits the elevator, the observer on
Earth and the one accelerating in empty space must observe the same
thing, since they cannot distinguish between being on Earth or
accelerating in space. The observer in space, who is accelerating, will
observe that the beam of light bends as it crosses the elevator... that
means that if the path of a beam of light is curved near Earth, it must
be because space itself is curved in the presence of a gravitational
field!"
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on WhereIsAI/UAE-Large-V1
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: eval
type: eval
metrics:
- type: cosine_accuracy@1
value: 0.6142857142857143
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7357142857142858
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.7833333333333333
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8380952380952381
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6142857142857143
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.24523809523809523
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.15666666666666665
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.08380952380952378
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6142857142857143
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7357142857142858
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.7833333333333333
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.8380952380952381
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7234956246301203
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.6871305744520029
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6925322242948972
name: Cosine Map@100
---
SentenceTransformer based on WhereIsAI/UAE-Large-V1
This is a sentence-transformers model finetuned from WhereIsAI/UAE-Large-V1. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: WhereIsAI/UAE-Large-V1
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
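These two properties can be checked directly on the loaded model; a quick sketch using the standard Sentence Transformers accessors:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cyberbabooshka/uae_large_ft1")
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 1024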
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("cyberbabooshka/uae_large_ft1")
# Run inference
sentences = [
'What is the relationship between the smallest perturbation of a matrix and its rank, as established in theorems regarding matrix perturbations?',
'"Suppose $A \\in C^{m \\times n}$ has full column rank (= n). Then $\\min _{\\Delta \\in \\mathbb{C}^{m \\times n}}\\left\\{\\|\\Delta\\|_{2} \\mid A+\\Delta \\text { has rank }<n\\right\\}=\\sigma_{n}(A)$."',
'"If a beam of light enters and then exits the elevator, the observer on Earth and the one accelerating in empty space must observe the same thing, since they cannot distinguish between being on Earth or accelerating in space. The observer in space, who is accelerating, will observe that the beam of light bends as it crosses the elevator... that means that if the path of a beam of light is curved near Earth, it must be because space itself is curved in the presence of a gravitational field!"',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
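For retrieval-style use (the task this model is evaluated on below), a common pattern is to encode the query and the candidate passages separately and rank candidates by cosine similarity. A minimal sketch; the passages are illustrative snippets from the widget examples above, not a real corpus:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cyberbabooshka/uae_large_ft1")

query = "What factors influence whether thin-film interference is constructive or destructive?"
corpus = [
    "Thin-film interference thus depends on film thickness, the wavelength of light, and the refractive indices.",
    "Complementary angles have measures that add up to 90 degrees.",
]

query_embedding = model.encode([query])
corpus_embeddings = model.encode(corpus)

# Rank passages by cosine similarity to the query; scores has shape [1, len(corpus)]
scores = model.similarity(query_embedding, corpus_embeddings)
best = int(scores[0].argmax())
print(best, corpus[best])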
Evaluation
Metrics
Information Retrieval
- Dataset: eval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.6143 |
cosine_accuracy@3 | 0.7357 |
cosine_accuracy@5 | 0.7833 |
cosine_accuracy@10 | 0.8381 |
cosine_precision@1 | 0.6143 |
cosine_precision@3 | 0.2452 |
cosine_precision@5 | 0.1567 |
cosine_precision@10 | 0.0838 |
cosine_recall@1 | 0.6143 |
cosine_recall@3 | 0.7357 |
cosine_recall@5 | 0.7833 |
cosine_recall@10 | 0.8381 |
cosine_ndcg@10 | 0.7235 |
cosine_mrr@10 | 0.6871 |
cosine_map@100 | 0.6925 |
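These figures come from Sentence Transformers' InformationRetrievalEvaluator. A minimal sketch of how a comparable evaluation can be run; the queries, corpus, and relevance judgments below are toy placeholders, since the actual eval split is not published in this card:
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("cyberbabooshka/uae_large_ft1")

# Placeholder data: queries play the role of anchors, the corpus holds the quoted positives
queries = {"q1": "How is a proper coloring of a graph defined?"}
corpus = {"d1": "A coloring is called proper if for each edge joining two distinct vertices, the two vertices it joins have different colors."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="eval")
results = evaluator(model)
# Metric keys follow the pattern {name}_cosine_ndcg@10, etc.
print(results["eval_cosine_ndcg@10"])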
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,760 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
 | anchor | positive |
---|---|---|
type | string | string |
details | min: 9 tokens, mean: 24.87 tokens, max: 70 tokens | min: 11 tokens, mean: 68.37 tokens, max: 500 tokens |
- Samples:
anchor | positive |
---|---|
How is a proper coloring of a graph defined in the context of vertices and edges? | "A coloring is called proper if for each edge joining two distinct vertices, the two vertices it joins have different colors." |
What is the relationship between the first excited state of the box model and the p orbitals in a hydrogen atom? | "The p orbitals are similar to the first excited state of the box, i.e. $(n_{x},n_{y},n_{z})=(2,1,1)$ is similar to a $p_{x}$ orbital, $(n_{x},n_{y},n_{z})=(1,2,1)$ is similar to a $p_{y}$ orbital and $(n_{x},n_{y},n_{z})=(1,1,2)$ is similar to a $p_{z}$ orbital." |
How can the behavior of the derivative $f'(x)$ indicate the presence of a local maximum or minimum at a critical point $x=a$? | "If there is a local maximum when $x=a$, the function must be lower near $x=a$ than it is right at $x=a$. If the derivative exists near $x=a$, this means $f'(x)>0$ when $x$ is near $a$ and $x<a$, because the function must 'slope up' just to the left of $a$. Similarly, $f'(x)<0$ when $x$ is near $a$ and $x>a$, because $f$ slopes down from the local maximum as we move to the right. Using the same reasoning, if there is a local minimum at $x=a$, the derivative of $f$ must be negative just to the left of $a$ and positive just to the right." |
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim" }
Evaluation Dataset
Unnamed Dataset
- Size: 420 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 420 samples:
 | anchor | positive |
---|---|---|
type | string | string |
details | min: 12 tokens, mean: 24.97 tokens, max: 66 tokens | min: 7 tokens, mean: 68.52 tokens, max: 452 tokens |
- Samples:
anchor | positive |
---|---|
What are the two central classes mentioned in the FileSystem framework and what do they represent? | "The class FileReference is the most important entry point to the framework." and "FileSystem is a powerful and elegant library to manipulate files." |
What is the significance of Turing's work in the context of PDE-based models for self-organization of complex systems? | "Turing’s monumental work on the chemical basis of morphogenesis played an important role in igniting researchers’ attention to the PDE-based continuous field models as a mathematical framework to study self-organization of complex systems." |
What are the two options for reducing accelerations as discussed in the passage? | "From the above definitions we see that there are really two options for reducing accelerations. We can reduce the amount that velocity changes, or we can increase the time over which the velocity changes (or both)." |
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- weight_decay: 0.05
- num_train_epochs: 10
- warmup_ratio: 0.1
- fp16: True
- eval_on_start: True
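A fine-tuning run of this shape can be reproduced with SentenceTransformerTrainer using the non-default settings listed above; a hedged sketch, with placeholder (anchor, positive) rows standing in for the unpublished 1,760-pair dataset:
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Start from the base model named in this card
model = SentenceTransformer("WhereIsAI/UAE-Large-V1")

# Placeholder rows in the card's two-column (anchor, positive) format
train_dataset = Dataset.from_dict({
    "anchor": ["How is a proper coloring of a graph defined?"],
    "positive": ["A coloring is called proper if for each edge joining two distinct vertices, the two vertices it joins have different colors."],
})
eval_dataset = Dataset.from_dict({
    "anchor": ["What are the two options for reducing accelerations?"],
    "positive": ["We can reduce the amount that velocity changes, or we can increase the time over which the velocity changes (or both)."],
})

# MultipleNegativesRankingLoss treats the other positives in each batch as negatives
loss = MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="uae_large_ft1",
    num_train_epochs=10,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.05,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="epoch",
    eval_on_start=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()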
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.05
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: True
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Click to expand
Epoch | Step | Training Loss | Validation Loss | eval_cosine_ndcg@10 |
---|---|---|---|---|
0 | 0 | - | 0.0971 | 0.6824 |
0.0091 | 1 | 0.1198 | - | - |
0.0182 | 2 | 0.0787 | - | - |
0.0273 | 3 | 0.0614 | - | - |
0.0364 | 4 | 0.138 | - | - |
0.0455 | 5 | 0.1204 | - | - |
0.0545 | 6 | 0.1885 | - | - |
0.0636 | 7 | 0.0475 | - | - |
0.0727 | 8 | 0.1358 | - | - |
0.0818 | 9 | 0.1666 | - | - |
0.0909 | 10 | 0.0737 | - | - |
0.1 | 11 | 0.0997 | - | - |
0.1091 | 12 | 0.0795 | - | - |
0.1182 | 13 | 0.1071 | - | - |
0.1273 | 14 | 0.1224 | - | - |
0.1364 | 15 | 0.0499 | - | - |
0.1455 | 16 | 0.0806 | - | - |
0.1545 | 17 | 0.0353 | - | - |
0.1636 | 18 | 0.0542 | - | - |
0.1727 | 19 | 0.0412 | - | - |
0.1818 | 20 | 0.1375 | - | - |
0.1909 | 21 | 0.1124 | - | - |
0.2 | 22 | 0.0992 | - | - |
0.2091 | 23 | 0.0285 | - | - |
0.2182 | 24 | 0.0337 | - | - |
0.2273 | 25 | 0.0737 | - | - |
0.2364 | 26 | 0.2011 | - | - |
0.2455 | 27 | 0.0241 | - | - |
0.2545 | 28 | 0.1319 | - | - |
0.2636 | 29 | 0.0104 | - | - |
0.2727 | 30 | 0.0162 | - | - |
0.2818 | 31 | 0.3061 | - | - |
0.2909 | 32 | 0.0422 | - | - |
0.3 | 33 | 0.1893 | - | - |
0.3091 | 34 | 0.0207 | - | - |
0.3182 | 35 | 0.0744 | - | - |
0.3273 | 36 | 0.0246 | - | - |
0.3364 | 37 | 0.0079 | - | - |
0.3455 | 38 | 0.0256 | - | - |
0.3545 | 39 | 0.0224 | - | - |
0.3636 | 40 | 0.0151 | - | - |
0.3727 | 41 | 0.0738 | - | - |
0.3818 | 42 | 0.0239 | - | - |
0.3909 | 43 | 0.0169 | - | - |
0.4 | 44 | 0.0152 | - | - |
0.4091 | 45 | 0.0244 | - | - |
0.4182 | 46 | 0.1708 | - | - |
0.4273 | 47 | 0.0146 | - | - |
0.4364 | 48 | 0.1367 | - | - |
0.4455 | 49 | 0.049 | - | - |
0.4545 | 50 | 0.0211 | - | - |
0.4636 | 51 | 0.0135 | - | - |
0.4727 | 52 | 0.0668 | - | - |
0.4818 | 53 | 0.087 | - | - |
0.4909 | 54 | 0.0046 | - | - |
0.5 | 55 | 0.0032 | - | - |
0.5091 | 56 | 0.0133 | - | - |
0.5182 | 57 | 0.0109 | - | - |
0.5273 | 58 | 0.0396 | - | - |
0.5364 | 59 | 0.0291 | - | - |
0.5455 | 60 | 0.0299 | - | - |
0.5545 | 61 | 0.0134 | - | - |
0.5636 | 62 | 0.0135 | - | - |
0.5727 | 63 | 0.0049 | - | - |
0.5818 | 64 | 0.0199 | - | - |
0.5909 | 65 | 0.1533 | - | - |
0.6 | 66 | 0.3639 | - | - |
0.6091 | 67 | 0.0652 | - | - |
0.6182 | 68 | 0.0315 | - | - |
0.6273 | 69 | 0.0403 | - | - |
0.6364 | 70 | 0.011 | - | - |
0.6455 | 71 | 0.0265 | - | - |
0.6545 | 72 | 0.1146 | - | - |
0.6636 | 73 | 0.0932 | - | - |
0.6727 | 74 | 0.0234 | - | - |
0.6818 | 75 | 0.0581 | - | - |
0.6909 | 76 | 0.0132 | - | - |
0.7 | 77 | 0.1183 | - | - |
0.7091 | 78 | 0.0913 | - | - |
0.7182 | 79 | 0.0262 | - | - |
0.7273 | 80 | 0.0262 | - | - |
0.7364 | 81 | 0.0159 | - | - |
0.7455 | 82 | 0.0407 | - | - |
0.7545 | 83 | 0.0294 | - | - |
0.7636 | 84 | 0.0567 | - | - |
0.7727 | 85 | 0.0959 | - | - |
0.7818 | 86 | 0.033 | - | - |
0.7909 | 87 | 0.0234 | - | - |
0.8 | 88 | 0.0088 | - | - |
0.8091 | 89 | 0.0249 | - | - |
0.8182 | 90 | 0.0276 | - | - |
0.8273 | 91 | 0.0936 | - | - |
0.8364 | 92 | 0.0067 | - | - |
0.8455 | 93 | 0.0064 | - | - |
0.8545 | 94 | 0.0654 | - | - |
0.8636 | 95 | 0.0048 | - | - |
0.8727 | 96 | 0.0087 | - | - |
0.8818 | 97 | 0.0115 | - | - |
0.8909 | 98 | 0.0092 | - | - |
0.9 | 99 | 0.0514 | - | - |
0.9091 | 100 | 0.1856 | - | - |
0.9182 | 101 | 0.0364 | - | - |
0.9273 | 102 | 0.0455 | - | - |
0.9364 | 103 | 0.0057 | - | - |
0.9455 | 104 | 0.0038 | - | - |
0.9545 | 105 | 0.0209 | - | - |
0.9636 | 106 | 0.0247 | - | - |
0.9727 | 107 | 0.0735 | - | - |
0.9818 | 108 | 0.004 | - | - |
0.9909 | 109 | 0.0174 | - | - |
1.0 | 110 | 0.018 | 0.0282 | 0.7093 |
1.0091 | 111 | 0.0187 | - | - |
1.0182 | 112 | 0.0116 | - | - |
1.0273 | 113 | 0.0043 | - | - |
1.0364 | 114 | 0.0059 | - | - |
1.0455 | 115 | 0.0067 | - | - |
1.0545 | 116 | 0.0093 | - | - |
1.0636 | 117 | 0.0821 | - | - |
1.0727 | 118 | 0.0097 | - | - |
1.0818 | 119 | 0.0141 | - | - |
1.0909 | 120 | 0.0202 | - | - |
1.1 | 121 | 0.0034 | - | - |
1.1091 | 122 | 0.0025 | - | - |
1.1182 | 123 | 0.006 | - | - |
1.1273 | 124 | 0.004 | - | - |
1.1364 | 125 | 0.003 | - | - |
1.1455 | 126 | 0.0399 | - | - |
1.1545 | 127 | 0.0026 | - | - |
1.1636 | 128 | 0.0043 | - | - |
1.1727 | 129 | 0.1317 | - | - |
1.1818 | 130 | 0.0024 | - | - |
1.1909 | 131 | 0.0027 | - | - |
1.2 | 132 | 0.076 | - | - |
1.2091 | 133 | 0.0302 | - | - |
1.2182 | 134 | 0.0026 | - | - |
1.2273 | 135 | 0.1611 | - | - |
1.2364 | 136 | 0.0413 | - | - |
1.2455 | 137 | 0.0118 | - | - |
1.2545 | 138 | 0.0042 | - | - |
1.2636 | 139 | 0.0401 | - | - |
1.2727 | 140 | 0.0036 | - | - |
1.2818 | 141 | 0.0034 | - | - |
1.2909 | 142 | 0.0026 | - | - |
1.3 | 143 | 0.0044 | - | - |
1.3091 | 144 | 0.0024 | - | - |
1.3182 | 145 | 0.0036 | - | - |
1.3273 | 146 | 0.0242 | - | - |
1.3364 | 147 | 0.0015 | - | - |
1.3455 | 148 | 0.1008 | - | - |
1.3545 | 149 | 0.0057 | - | - |
1.3636 | 150 | 0.0062 | - | - |
1.3727 | 151 | 0.0048 | - | - |
1.3818 | 152 | 0.0026 | - | - |
1.3909 | 153 | 0.0045 | - | - |
1.4 | 154 | 0.0139 | - | - |
1.4091 | 155 | 0.0017 | - | - |
1.4182 | 156 | 0.0012 | - | - |
1.4273 | 157 | 0.0009 | - | - |
1.4364 | 158 | 0.006 | - | - |
1.4455 | 159 | 0.0618 | - | - |
1.4545 | 160 | 0.0889 | - | - |
1.4636 | 161 | 0.0034 | - | - |
1.4727 | 162 | 0.0184 | - | - |
1.4818 | 163 | 0.0035 | - | - |
1.4909 | 164 | 0.002 | - | - |
1.5 | 165 | 0.0115 | - | - |
1.5091 | 166 | 0.0008 | - | - |
1.5182 | 167 | 0.0113 | - | - |
1.5273 | 168 | 0.01 | - | - |
1.5364 | 169 | 0.0177 | - | - |
1.5455 | 170 | 0.0059 | - | - |
1.5545 | 171 | 0.0123 | - | - |
1.5636 | 172 | 0.0103 | - | - |
1.5727 | 173 | 0.008 | - | - |
1.5818 | 174 | 0.002 | - | - |
1.5909 | 175 | 0.0039 | - | - |
1.6 | 176 | 0.0174 | - | - |
1.6091 | 177 | 0.0191 | - | - |
1.6182 | 178 | 0.002 | - | - |
1.6273 | 179 | 0.0009 | - | - |
1.6364 | 180 | 0.0021 | - | - |
1.6455 | 181 | 0.0011 | - | - |
1.6545 | 182 | 0.0027 | - | - |
1.6636 | 183 | 0.0005 | - | - |
1.6727 | 184 | 0.0026 | - | - |
1.6818 | 185 | 0.0047 | - | - |
1.6909 | 186 | 0.0033 | - | - |
1.7 | 187 | 0.0402 | - | - |
1.7091 | 188 | 0.0128 | - | - |
1.7182 | 189 | 0.01 | - | - |
1.7273 | 190 | 0.0057 | - | - |
1.7364 | 191 | 0.0133 | - | - |
1.7455 | 192 | 0.0099 | - | - |
1.7545 | 193 | 0.1022 | - | - |
1.7636 | 194 | 0.0223 | - | - |
1.7727 | 195 | 0.0037 | - | - |
1.7818 | 196 | 0.0073 | - | - |
1.7909 | 197 | 0.0212 | - | - |
1.8 | 198 | 0.0231 | - | - |
1.8091 | 199 | 0.0016 | - | - |
1.8182 | 200 | 0.0017 | - | - |
1.8273 | 201 | 0.0035 | - | - |
1.8364 | 202 | 0.0165 | - | - |
1.8455 | 203 | 0.0131 | - | - |
1.8545 | 204 | 0.0032 | - | - |
1.8636 | 205 | 0.0075 | - | - |
1.8727 | 206 | 0.0438 | - | - |
1.8818 | 207 | 0.0022 | - | - |
1.8909 | 208 | 0.0501 | - | - |
1.9 | 209 | 0.0121 | - | - |
1.9091 | 210 | 0.0036 | - | - |
1.9182 | 211 | 0.0041 | - | - |
1.9273 | 212 | 0.0048 | - | - |
1.9364 | 213 | 0.0159 | - | - |
1.9455 | 214 | 0.0036 | - | - |
1.9545 | 215 | 0.0035 | - | - |
1.9636 | 216 | 0.004 | - | - |
1.9727 | 217 | 0.0039 | - | - |
1.9818 | 218 | 0.0177 | - | - |
1.9909 | 219 | 0.0042 | - | - |
2.0 | 220 | 0.0044 | 0.0230 | 0.7225 |
2.0091 | 221 | 0.0339 | - | - |
2.0182 | 222 | 0.0032 | - | - |
2.0273 | 223 | 0.0133 | - | - |
2.0364 | 224 | 0.0031 | - | - |
2.0455 | 225 | 0.0025 | - | - |
2.0545 | 226 | 0.0039 | - | - |
2.0636 | 227 | 0.0011 | - | - |
2.0727 | 228 | 0.0021 | - | - |
2.0818 | 229 | 0.0591 | - | - |
2.0909 | 230 | 0.0011 | - | - |
2.1 | 231 | 0.0008 | - | - |
2.1091 | 232 | 0.0014 | - | - |
2.1182 | 233 | 0.0057 | - | - |
2.1273 | 234 | 0.0044 | - | - |
2.1364 | 235 | 0.001 | - | - |
2.1455 | 236 | 0.0009 | - | - |
2.1545 | 237 | 0.0028 | - | - |
2.1636 | 238 | 0.0076 | - | - |
2.1727 | 239 | 0.0018 | - | - |
2.1818 | 240 | 0.0022 | - | - |
2.1909 | 241 | 0.0029 | - | - |
2.2 | 242 | 0.0004 | - | - |
2.2091 | 243 | 0.0025 | - | - |
2.2182 | 244 | 0.0013 | - | - |
2.2273 | 245 | 0.0487 | - | - |
2.2364 | 246 | 0.0016 | - | - |
2.2455 | 247 | 0.0023 | - | - |
2.2545 | 248 | 0.0038 | - | - |
2.2636 | 249 | 0.003 | - | - |
2.2727 | 250 | 0.0017 | - | - |
2.2818 | 251 | 0.0056 | - | - |
2.2909 | 252 | 0.0036 | - | - |
2.3 | 253 | 0.0016 | - | - |
2.3091 | 254 | 0.0021 | - | - |
2.3182 | 255 | 0.0019 | - | - |
2.3273 | 256 | 0.001 | - | - |
2.3364 | 257 | 0.0017 | - | - |
2.3455 | 258 | 0.0027 | - | - |
2.3545 | 259 | 0.0039 | - | - |
2.3636 | 260 | 0.0011 | - | - |
2.3727 | 261 | 0.0248 | - | - |
2.3818 | 262 | 0.0219 | - | - |
2.3909 | 263 | 0.0015 | - | - |
2.4 | 264 | 0.0009 | - | - |
2.4091 | 265 | 0.0013 | - | - |
2.4182 | 266 | 0.0049 | - | - |
2.4273 | 267 | 0.0073 | - | - |
2.4364 | 268 | 0.007 | - | - |
2.4455 | 269 | 0.0024 | - | - |
2.4545 | 270 | 0.0008 | - | - |
2.4636 | 271 | 0.001 | - | - |
2.4727 | 272 | 0.0016 | - | - |
2.4818 | 273 | 0.0007 | - | - |
2.4909 | 274 | 0.0091 | - | - |
2.5 | 275 | 0.0127 | - | - |
2.5091 | 276 | 0.0013 | - | - |
2.5182 | 277 | 0.001 | - | - |
2.5273 | 278 | 0.0006 | - | - |
2.5364 | 279 | 0.005 | - | - |
2.5455 | 280 | 0.0154 | - | - |
2.5545 | 281 | 0.0015 | - | - |
2.5636 | 282 | 0.0229 | - | - |
2.5727 | 283 | 0.0026 | - | - |
2.5818 | 284 | 0.0008 | - | - |
2.5909 | 285 | 0.0024 | - | - |
2.6 | 286 | 0.0012 | - | - |
2.6091 | 287 | 0.0748 | - | - |
2.6182 | 288 | 0.0086 | - | - |
2.6273 | 289 | 0.0013 | - | - |
2.6364 | 290 | 0.0089 | - | - |
2.6455 | 291 | 0.0011 | - | - |
2.6545 | 292 | 0.0096 | - | - |
2.6636 | 293 | 0.1416 | - | - |
2.6727 | 294 | 0.0005 | - | - |
2.6818 | 295 | 0.0021 | - | - |
2.6909 | 296 | 0.0014 | - | - |
2.7 | 297 | 0.0097 | - | - |
2.7091 | 298 | 0.0014 | - | - |
2.7182 | 299 | 0.0009 | - | - |
2.7273 | 300 | 0.0016 | - | - |
2.7364 | 301 | 0.0166 | - | - |
2.7455 | 302 | 0.0028 | - | - |
2.7545 | 303 | 0.0014 | - | - |
2.7636 | 304 | 0.0018 | - | - |
2.7727 | 305 | 0.0059 | - | - |
2.7818 | 306 | 0.0012 | - | - |
2.7909 | 307 | 0.0008 | - | - |
2.8 | 308 | 0.0007 | - | - |
2.8091 | 309 | 0.0038 | - | - |
2.8182 | 310 | 0.0012 | - | - |
2.8273 | 311 | 0.0091 | - | - |
2.8364 | 312 | 0.0111 | - | - |
2.8455 | 313 | 0.0016 | - | - |
2.8545 | 314 | 0.0089 | - | - |
2.8636 | 315 | 0.0071 | - | - |
2.8727 | 316 | 0.0012 | - | - |
2.8818 | 317 | 0.0251 | - | - |
2.8909 | 318 | 0.0017 | - | - |
2.9 | 319 | 0.0006 | - | - |
2.9091 | 320 | 0.0014 | - | - |
2.9182 | 321 | 0.0011 | - | - |
2.9273 | 322 | 0.0084 | - | - |
2.9364 | 323 | 0.0055 | - | - |
2.9455 | 324 | 0.0011 | - | - |
2.9545 | 325 | 0.0017 | - | - |
2.9636 | 326 | 0.0008 | - | - |
2.9727 | 327 | 0.0082 | - | - |
2.9818 | 328 | 0.0006 | - | - |
2.9909 | 329 | 0.0008 | - | - |
3.0 | 330 | 0.0022 | 0.0275 | 0.6950 |
3.0091 | 331 | 0.0007 | - | - |
3.0182 | 332 | 0.0012 | - | - |
3.0273 | 333 | 0.0007 | - | - |
3.0364 | 334 | 0.0038 | - | - |
3.0455 | 335 | 0.0006 | - | - |
3.0545 | 336 | 0.0012 | - | - |
3.0636 | 337 | 0.0873 | - | - |
3.0727 | 338 | 0.0022 | - | - |
3.0818 | 339 | 0.0004 | - | - |
3.0909 | 340 | 0.001 | - | - |
3.1 | 341 | 0.0002 | - | - |
3.1091 | 342 | 0.0069 | - | - |
3.1182 | 343 | 0.0009 | - | - |
3.1273 | 344 | 0.0101 | - | - |
3.1364 | 345 | 0.0022 | - | - |
3.1455 | 346 | 0.009 | - | - |
3.1545 | 347 | 0.0018 | - | - |
3.1636 | 348 | 0.0018 | - | - |
3.1727 | 349 | 0.0045 | - | - |
3.1818 | 350 | 0.029 | - | - |
3.1909 | 351 | 0.0036 | - | - |
3.2 | 352 | 0.0015 | - | - |
3.2091 | 353 | 0.0021 | - | - |
3.2182 | 354 | 0.0103 | - | - |
3.2273 | 355 | 0.0005 | - | - |
3.2364 | 356 | 0.0133 | - | - |
3.2455 | 357 | 0.0015 | - | - |
3.2545 | 358 | 0.001 | - | - |
3.2636 | 359 | 0.0024 | - | - |
3.2727 | 360 | 0.0052 | - | - |
3.2818 | 361 | 0.0032 | - | - |
3.2909 | 362 | 0.0024 | - | - |
3.3 | 363 | 0.0008 | - | - |
3.3091 | 364 | 0.0035 | - | - |
3.3182 | 365 | 0.0012 | - | - |
3.3273 | 366 | 0.0049 | - | - |
3.3364 | 367 | 0.0452 | - | - |
3.3455 | 368 | 0.0017 | - | - |
3.3545 | 369 | 0.0112 | - | - |
3.3636 | 370 | 0.0011 | - | - |
3.3727 | 371 | 0.0016 | - | - |
3.3818 | 372 | 0.0015 | - | - |
3.3909 | 373 | 0.004 | - | - |
3.4 | 374 | 0.0074 | - | - |
3.4091 | 375 | 0.0005 | - | - |
3.4182 | 376 | 0.0007 | - | - |
3.4273 | 377 | 0.0014 | - | - |
3.4364 | 378 | 0.0097 | - | - |
3.4455 | 379 | 0.0026 | - | - |
3.4545 | 380 | 0.0022 | - | - |
3.4636 | 381 | 0.001 | - | - |
3.4727 | 382 | 0.0004 | - | - |
3.4818 | 383 | 0.004 | - | - |
3.4909 | 384 | 0.0017 | - | - |
3.5 | 385 | 0.0014 | - | - |
3.5091 | 386 | 0.001 | - | - |
3.5182 | 387 | 0.0047 | - | - |
3.5273 | 388 | 0.0061 | - | - |
3.5364 | 389 | 0.0017 | - | - |
3.5455 | 390 | 0.0024 | - | - |
3.5545 | 391 | 0.0021 | - | - |
3.5636 | 392 | 0.0007 | - | - |
3.5727 | 393 | 0.0009 | - | - |
3.5818 | 394 | 0.0006 | - | - |
3.5909 | 395 | 0.0038 | - | - |
3.6 | 396 | 0.0006 | - | - |
3.6091 | 397 | 0.0011 | - | - |
3.6182 | 398 | 0.001 | - | - |
3.6273 | 399 | 0.0014 | - | - |
3.6364 | 400 | 0.0007 | - | - |
3.6455 | 401 | 0.0052 | - | - |
3.6545 | 402 | 0.0008 | - | - |
3.6636 | 403 | 0.0009 | - | - |
3.6727 | 404 | 0.0017 | - | - |
3.6818 | 405 | 0.0028 | - | - |
3.6909 | 406 | 0.0044 | - | - |
3.7 | 407 | 0.0009 | - | - |
3.7091 | 408 | 0.0134 | - | - |
3.7182 | 409 | 0.001 | - | - |
3.7273 | 410 | 0.0044 | - | - |
3.7364 | 411 | 0.0138 | - | - |
3.7455 | 412 | 0.0032 | - | - |
3.7545 | 413 | 0.0004 | - | - |
3.7636 | 414 | 0.0065 | - | - |
3.7727 | 415 | 0.0007 | - | - |
3.7818 | 416 | 0.0008 | - | - |
3.7909 | 417 | 0.0007 | - | - |
3.8 | 418 | 0.0018 | - | - |
3.8091 | 419 | 0.001 | - | - |
3.8182 | 420 | 0.0305 | - | - |
3.8273 | 421 | 0.001 | - | - |
3.8364 | 422 | 0.0011 | - | - |
3.8455 | 423 | 0.0004 | - | - |
3.8545 | 424 | 0.003 | - | - |
3.8636 | 425 | 0.002 | - | - |
3.8727 | 426 | 0.0018 | - | - |
3.8818 | 427 | 0.0968 | - | - |
3.8909 | 428 | 0.002 | - | - |
3.9 | 429 | 0.002 | - | - |
3.9091 | 430 | 0.0156 | - | - |
3.9182 | 431 | 0.0059 | - | - |
3.9273 | 432 | 0.001 | - | - |
3.9364 | 433 | 0.0153 | - | - |
3.9455 | 434 | 0.0013 | - | - |
3.9545 | 435 | 0.0003 | - | - |
3.9636 | 436 | 0.001 | - | - |
3.9727 | 437 | 0.0005 | - | - |
3.9818 | 438 | 0.0012 | - | - |
3.9909 | 439 | 0.0109 | - | - |
4.0 | 440 | 0.1597 | 0.0211 | 0.7235 |
4.0091 | 441 | 0.0027 | - | - |
4.0182 | 442 | 0.0007 | - | - |
4.0273 | 443 | 0.0089 | - | - |
4.0364 | 444 | 0.0007 | - | - |
4.0455 | 445 | 0.005 | - | - |
4.0545 | 446 | 0.0019 | - | - |
4.0636 | 447 | 0.0007 | - | - |
4.0727 | 448 | 0.0008 | - | - |
4.0818 | 449 | 0.002 | - | - |
4.0909 | 450 | 0.043 | - | - |
4.1 | 451 | 0.0273 | - | - |
4.1091 | 452 | 0.0009 | - | - |
4.1182 | 453 | 0.0011 | - | - |
4.1273 | 454 | 0.0007 | - | - |
4.1364 | 455 | 0.0062 | - | - |
4.1455 | 456 | 0.0004 | - | - |
4.1545 | 457 | 0.0008 | - | - |
4.1636 | 458 | 0.0128 | - | - |
4.1727 | 459 | 0.0012 | - | - |
4.1818 | 460 | 0.0013 | - | - |
4.1909 | 461 | 0.0009 | - | - |
4.2 | 462 | 0.0011 | - | - |
4.2091 | 463 | 0.0336 | - | - |
4.2182 | 464 | 0.0018 | - | - |
4.2273 | 465 | 0.0009 | - | - |
4.2364 | 466 | 0.0049 | - | - |
4.2455 | 467 | 0.0012 | - | - |
4.2545 | 468 | 0.001 | - | - |
4.2636 | 469 | 0.0024 | - | - |
4.2727 | 470 | 0.0063 | - | - |
4.2818 | 471 | 0.0008 | - | - |
4.2909 | 472 | 0.0793 | - | - |
4.3 | 473 | 0.0016 | - | - |
4.3091 | 474 | 0.0016 | - | - |
4.3182 | 475 | 0.0043 | - | - |
4.3273 | 476 | 0.036 | - | - |
4.3364 | 477 | 0.002 | - | - |
4.3455 | 478 | 0.0019 | - | - |
4.3545 | 479 | 0.0012 | - | - |
4.3636 | 480 | 0.0059 | - | - |
4.3727 | 481 | 0.0017 | - | - |
4.3818 | 482 | 0.0004 | - | - |
4.3909 | 483 | 0.0014 | - | - |
4.4 | 484 | 0.0143 | - | - |
4.4091 | 485 | 0.0014 | - | - |
4.4182 | 486 | 0.0009 | - | - |
4.4273 | 487 | 0.0027 | - | - |
4.4364 | 488 | 0.0017 | - | - |
4.4455 | 489 | 0.0007 | - | - |
4.4545 | 490 | 0.0008 | - | - |
4.4636 | 491 | 0.0008 | - | - |
4.4727 | 492 | 0.0014 | - | - |
4.4818 | 493 | 0.0011 | - | - |
4.4909 | 494 | 0.0013 | - | - |
4.5 | 495 | 0.0016 | - | - |
4.5091 | 496 | 0.001 | - | - |
4.5182 | 497 | 0.0008 | - | - |
4.5273 | 498 | 0.001 | - | - |
4.5364 | 499 | 0.0019 | - | - |
4.5455 | 500 | 0.0008 | - | - |
Framework Versions
- Python: 3.12.9
- Sentence Transformers: 4.1.0
- Transformers: 4.52.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}