metadata
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4012
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: Do cephalopods use RNA editing less frequently than other species?
sentences:
- >-
Extensive messenger RNA editing generates transcript and protein
diversity in genes involved in neural excitability, as previously
described, as well as in genes participating in a broad range of other
cellular functions.
- >-
GV1001 is a 16-amino-acid vaccine peptide derived from the human
telomerase reverse transcriptase sequence. It has been developed as a
vaccine against various cancers.
- >-
Using acetyl-specific K516 antibodies, we show that acetylation of
endogenous S6K1 at this site is potently induced upon growth factor
stimulation. We propose that K516 acetylation may serve to modulate
important kinase-independent functions of S6K1 in response to growth
factor signalling. Following mitogen stimulation, S6Ks interact with the
p300 and p300/CBP-associated factor (PCAF) acetyltransferases. S6Ks can
be acetylated by p300 and PCAF in vitro and S6K acetylation is detected
in cells expressing p300
- source_sentence: Can pets affect infant microbiomed?
sentences:
- >-
Yes, exposure to household furry pets influences the gut microbiota of
infants.
- >-
Thiazovivin is a selective small molecule that directly targets
Rho-associated kinase (ROCK) and increases expression of pluripotency
factors.
- ' Here, we present evidence that the calcium/calmodulin-dependent protein kinase IV (CaMK4) is increased and required during Th17 cell differentiation. Inhibition of CaMK4 reduced Il17 transcription through decreased activation of the cAMP response element modulator a (CREM-a) and reduced activation of the AKT/mTOR pathway, which is known to enhance Th17 differentiation. CAMK4 knockdown and kinase-dead mutant inhibited crocin-mediated HO-1 expression, Nrf2 activation, and phosphorylation of Akt, indicating that HO-1 expression is mediated by CAMK4 and that Akt is a downstream mediator of CAMK4 in crocin signaling'
- source_sentence: >-
In what proportion of children with heart failure has Enalapril been shown
to be safe and effective?
sentences:
- >-
5-HT2A (5-hydroxytryptamine type 2a) receptor can be evaluated with the
[18F]altanserin.
- >-
In children with heart failure evidence of the effect of enalapril is
empirical. Enalapril was clinically safe and effective in 50% to 80% of
for children with cardiac failure secondary to congenital heart
malformations before and after cardiac surgery, impaired ventricular
function , valvar regurgitation, congestive cardiomyopathy, , arterial
hypertension, life-threatening arrhythmias coexisting with circulatory
insufficiency.
ACE inhibitors have shown a transient beneficial effect on heart failure
due to anticancer drugs and possibly a beneficial effect in muscular
dystrophy-associated cardiomyopathy, which deserves further studies.
- |-
necroptosis
apoptosis
pro-survival/inflammation NF-κB activation
- source_sentence: How are SAHFS created?
sentences:
- >-
In particular, up to 17% of neutrophil nuclei of healthy women exhibit a
drumstick-shaped appendage that contains the inactive X chromosome.
- >-
miR-1, miR-133, miR-208a, miR-206, miR-494, miR-146a, miR-222, miR-21,
miR-221, miR-20a, miR-133a, miR-133b, miR-23, miR-107 and miR-181 are
involved in exercise adaptation
- >-
Cellular senescence-associated heterochromatic foci (SAHFS) are a novel
type of chromatin condensation involving alterations of linker histone
H1 and linker DNA-binding proteins. SAHFS can be formed by a variety of
cell types, but their mechanism of action remains unclear.
- source_sentence: >-
What are the effects of the deletion of all three Pcdh clusters
(tricluster deletion) in mice?
sentences:
- >-
Multicluster Pcdh diversity is required for mouse olfactory neural
circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell
surface proteins are encoded by three closely linked gene clusters
(Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters
had subtle phenotypic consequences, the loss of all three clusters
(tricluster deletion) led to a severe axonal arborization defect and
loss of self-avoidance.
- >-
The myocyte enhancer factor-2 (MEF2) proteins are MADS-box transcription
factors that are essential for differentiation of all muscle lineages
but their mechanisms of action remain largely undefined. MEF2C
expression initiates cardiomyogenesis, resulting in the up-regulation of
Brachyury T, bone morphogenetic protein-4, Nkx2-5, GATA-4, cardiac
alpha-actin, and myosin heavy chain expression. Inactivation of the
MEF2C gene causes cardiac developmental arrest and severe downregulation
of a number of cardiac markers including atrial natriuretic factor
(ANF). BMP-2, a regulator of cardiac development during embryogenesis,
was shown to increase PI 3-kinase activity in cardiac precursor cells,
resulting in increased expression of sarcomeric myosin heavy chain (MHC)
and MEF-2A. Furthermore, expression of MEF-2A increased MHC expression
in a PI 3-kinase-dependent manner. Other studies showed that Gli2 and
MEF2C proteins form a complex, capable of synergizing on
cardiomyogenesis-related promoters. Dominant interference of
calcineurin/mAKAP binding blunts the increase in MEF2 transcriptional
activity seen during myoblast differentiation, as well as the expression
of endogenous MEF2-target genes. These findings show that MEF-2 can
direct early stages of cell differentiation into a cardiomyogenic
pathway.
- >-
Investigators proposed that there have been three extended periods in
the evolution of gene regulatory elements. Early vertebrate evolution
was characterized by regulatory gains near transcription factors and
developmental genes, but this trend was replaced by innovations near
extracellular signaling genes, and then innovations near
posttranslational protein modifiers.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: Biomedical MRL
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.8500707213578501
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9377652050919377
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9504950495049505
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9674681753889675
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.8500707213578501
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.3125884016973126
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19009900990099007
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09674681753889673
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.8500707213578501
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9377652050919377
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9504950495049505
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9674681753889675
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9123173189785756
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8941778361509621
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8951587766172264
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.8486562942008486
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9349363507779349
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9519094766619519
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9674681753889675
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.8486562942008486
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.3116454502593116
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19038189533239033
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09674681753889672
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.8486562942008486
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9349363507779349
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9519094766619519
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9674681753889675
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9119495367876664
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8937164634830831
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8948057981361003
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.8373408769448374
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9278642149929278
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9434229137199435
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9547383309759547
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.8373408769448374
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.3092880716643093
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.18868458274398867
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09547383309759547
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.8373408769448374
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9278642149929278
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9434229137199435
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9547383309759547
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9017656707014216
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8841539255966414
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8857155093016021
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.8189533239038189
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9108910891089109
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9278642149929278
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9405940594059405
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.8189533239038189
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.30363036303630364
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.18557284299858556
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09405940594059405
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.8189533239038189
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9108910891089109
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9278642149929278
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9405940594059405
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8856187513669239
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8673553579847783
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.869253499575075
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.7736916548797736
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8882602545968883
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9108910891089109
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.925035360678925
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7736916548797736
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.2960867515322961
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.18217821782178212
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09250353606789247
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.7736916548797736
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.8882602545968883
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9108910891089109
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.925035360678925
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8573911656884706
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.834872926068117
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8366311237261763
name: Cosine Map@100
Biomedical MRL
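This is a sentence-transformers model trained on the json dataset of 4,012 biomedical question and passage pairs. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. Because it was trained with MatryoshkaLoss, its embeddings were also evaluated after truncation to 512, 256, 128, and 64 dimensions.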
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: json
- Language: en
- License: apache-2.0
Model Sources
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
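The last two modules in the architecture above can be illustrated without loading the model. The following is a minimal sketch, assuming the Transformer module outputs one vector per token with the [CLS] token first; the helper names `cls_pool` and `l2_normalize` and the toy 4-dimensional vectors are made up for illustration and are not part of the library.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy transformer output: 3 tokens with 4 dimensions each.
# The real model produces up to 512 tokens with 768 dimensions each.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Since the output is unit length, the dot product of two such embeddings equals their cosine similarity.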
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("potsu-potsu/medembed-base-mrl")

# Encode one query and two candidate passages
sentences = [
    'What are the effects of the deletion of all three Pcdh clusters (tricluster deletion) in mice?',
    'Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss of all three clusters (tricluster deletion) led to a severe axonal arborization defect and loss of self-avoidance.',
    'Investigators proposed that there have been three extended periods in the evolution of gene regulatory elements. Early vertebrate evolution was characterized by regulatory gains near transcription factors and developmental genes, but this trend was replaced by innovations near extracellular signaling genes, and then innovations near posttranslational protein modifiers.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Compute pairwise cosine similarities between all embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
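Because the model was trained with MatryoshkaLoss, the leading dimensions of each embedding are meant to remain useful on their own. The sketch below shows the idea in plain Python on toy 8-dimensional vectors: keep the first `dim` values, re-normalize, and compare with cosine similarity. The helper names and the toy vectors are illustrative assumptions; with the real model, the same slicing would be applied to the 768-dimensional output of `model.encode`. Sentence Transformers also accepts a `truncate_dim` argument when loading a model, which should be checked against the installed version.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-ins for full-size embeddings (8 dimensions instead of 768).
query = [0.9, 0.1, 0.0, 0.2, 0.05, 0.3, 0.1, 0.0]
doc_a = [0.8, 0.2, 0.1, 0.1, 0.3, 0.0, 0.2, 0.1]
doc_b = [0.0, 0.9, 0.3, 0.1, 0.1, 0.1, 0.0, 0.4]

# Truncate everything to the first 4 dimensions before comparing.
q4 = truncate_and_normalize(query, 4)
a4 = truncate_and_normalize(doc_a, 4)
b4 = truncate_and_normalize(doc_b, 4)

print(cosine(q4, a4), cosine(q4, b4))
```

Smaller dimensions reduce index size and search cost; the Evaluation section reports how much retrieval quality is lost at each size.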
Evaluation
Metrics
Information Retrieval
Each column is one evaluation set, named after the dimension the embeddings were truncated to (dim_768 is the full embedding).
| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
|:--|--:|--:|--:|--:|--:|
| cosine_accuracy@1 | 0.8501 | 0.8487 | 0.8373 | 0.819 | 0.7737 |
| cosine_accuracy@3 | 0.9378 | 0.9349 | 0.9279 | 0.9109 | 0.8883 |
| cosine_accuracy@5 | 0.9505 | 0.9519 | 0.9434 | 0.9279 | 0.9109 |
| cosine_accuracy@10 | 0.9675 | 0.9675 | 0.9547 | 0.9406 | 0.925 |
| cosine_precision@1 | 0.8501 | 0.8487 | 0.8373 | 0.819 | 0.7737 |
| cosine_precision@3 | 0.3126 | 0.3116 | 0.3093 | 0.3036 | 0.2961 |
| cosine_precision@5 | 0.1901 | 0.1904 | 0.1887 | 0.1856 | 0.1822 |
| cosine_precision@10 | 0.0967 | 0.0967 | 0.0955 | 0.0941 | 0.0925 |
| cosine_recall@1 | 0.8501 | 0.8487 | 0.8373 | 0.819 | 0.7737 |
| cosine_recall@3 | 0.9378 | 0.9349 | 0.9279 | 0.9109 | 0.8883 |
| cosine_recall@5 | 0.9505 | 0.9519 | 0.9434 | 0.9279 | 0.9109 |
| cosine_recall@10 | 0.9675 | 0.9675 | 0.9547 | 0.9406 | 0.925 |
| cosine_ndcg@10 | 0.9123 | 0.9119 | 0.9018 | 0.8856 | 0.8574 |
| cosine_mrr@10 | 0.8942 | 0.8937 | 0.8842 | 0.8674 | 0.8349 |
| cosine_map@100 | 0.8952 | 0.8948 | 0.8857 | 0.8693 | 0.8366 |
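In the reported results, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. That pattern is what one would expect if every evaluation query has exactly one relevant passage, although the card does not state this. The sketch below computes the metrics under that assumption from the rank of the single relevant passage; the function name `retrieval_metrics` and the toy ranks are hypothetical.

```python
import math


def retrieval_metrics(ranks, k):
    """Metrics at cutoff k when each query has exactly one relevant passage.

    ranks: 1-based rank of the relevant passage per query, or None if it was not retrieved.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one of the k results is relevant
        "recall": accuracy,         # the only relevant passage is found or not
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # With one relevant passage the ideal DCG is 1, so NDCG is just the DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }


# Toy example: five queries, relevant passage found at these ranks.
metrics_at_3 = retrieval_metrics([1, 1, 2, 4, None], k=3)
print(metrics_at_3)

# Consistency check against the dim_768 values in this card's metadata.
reported_accuracy_at_3 = 0.9377652050919377
reported_precision_at_3 = 0.3125884016973126
print(abs(reported_accuracy_at_3 / 3 - reported_precision_at_3) < 1e-12)
```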
Training Details
Training Dataset
json
- Dataset: json
- Size: 4,012 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|:--|:--|:--|
| type | string | string |
| details | min: 5 tokens, mean: 16.13 tokens, max: 49 tokens | min: 3 tokens, mean: 63.38 tokens, max: 485 tokens |
- Samples:
| anchor | positive |
|:--|:--|
| What is the implication of histone lysine methylation in medulloblastoma? | Aberrant patterns of H3K4, H3K9, and H3K27 histone lysine methylation were shown to result in histone code alterations, which induce changes in gene expression, and affect the proliferation rate of cells in medulloblastoma. |
| What is the role of STAG1/STAG2 proteins in differentiation? | STAG1/STAG2 proteins are tumour suppressor proteins that suppress cell proliferation and are essential for differentiation. |
| What is the association between cell phone use and glioblastoma? | The association between cell phone use and incident glioblastoma remains unclear. Some studies have reported that cell phone use was associated with incident glioblastoma, and with reduced survival of patients diagnosed with glioblastoma. However, other studies have repeatedly replicated to find an association between cell phone use and glioblastoma. |
- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
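The loss configuration above can be read as: compute MultipleNegativesRankingLoss on the embeddings truncated to each listed dimension, then add the results using the listed weights. The sketch below illustrates that idea in plain Python on toy 4-dimensional vectors. It is not the library's implementation; the function names are made up, and the similarity scale of 20 is an assumption about the inner loss's default.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: positive i is the target for anchor i, all other positives are negatives."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with target index i
    return total / len(a)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss over embeddings truncated to each dimension."""
    return sum(
        w * mnr_loss([v[:d] for v in anchors], [v[:d] for v in positives])
        for d, w in zip(dims, weights)
    )


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

matched = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
mismatched = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(matched, mismatched)
```

In-batch negatives are also why the no_duplicates batch sampler is used: a duplicate passage in the same batch would be treated as a negative for its own question.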
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
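The learning-rate settings above combine a linear warmup with cosine decay. The following sketch shows the usual shape of such a schedule. It assumes 32 total optimizer steps (the last step in the Training Logs table) and that the warmup length is the ceiling of `warmup_ratio` times the total steps; the function `lr_at_step` is hypothetical and only approximates what the trainer does.

```python
import math


def lr_at_step(step, base_lr=2e-5, total_steps=32, warmup_ratio=0.1):
    """Learning rate after `step` optimizer updates: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 4 steps under these assumptions
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 2, 4, 18, 32):
    print(step, lr_at_step(step))
```

With so few optimizer steps, the peak rate of 2e-05 is reached after only a handful of updates and decays for the rest of training.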
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|--:|--:|--:|--:|--:|--:|--:|--:|
| 1.0 | 8 | - | 0.9142 | 0.9151 | 0.905 | 0.8892 | 0.8474 |
| 1.2540 | 10 | 26.698 | - | - | - | - | - |
| 2.0 | 16 | - | 0.9120 | 0.9093 | 0.8999 | 0.8869 | 0.8568 |
| 2.5079 | 20 | 11.062 | - | - | - | - | - |
| 3.0 | 24 | - | 0.9116 | 0.9113 | 0.9009 | 0.8849 | 0.8572 |
| 3.7619 | 30 | 9.198 | - | - | - | - | - |
| 4.0 | 32 | - | 0.9123 | 0.9119 | 0.9018 | 0.8856 | 0.8574 |
- The bold row denotes the saved checkpoint.
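The step counts and fractional epochs in the training log follow from the dataset size, batch size, and gradient accumulation. The arithmetic below reproduces them as a worked example. It assumes a single training device, that the last partial batch is kept (dataloader_drop_last is False), and that the batch sampler yields every sample once per epoch; the function `logged_epoch` is an illustrative reconstruction, not the trainer's code.

```python
import math

samples = 4012
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_train_epochs = 4

# 4012 / 32 = 125.375, so the last batch of each epoch is partial.
batches_per_epoch = math.ceil(samples / per_device_train_batch_size)
# One optimizer step per 16 batches, plus one for the leftover batches.
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs


def logged_epoch(step):
    """Fractional epoch at an optimizer step, measured in batches consumed so far."""
    full_epochs, step_in_epoch = divmod(step, steps_per_epoch)
    batches_done = min(step_in_epoch * gradient_accumulation_steps, batches_per_epoch)
    return full_epochs + batches_done / batches_per_epoch


print(batches_per_epoch, steps_per_epoch, total_steps)
for step in (8, 10, 20, 30, 32):
    print(step, round(logged_epoch(step), 4))
```

Under these assumptions each optimizer step covers up to 32 × 16 = 512 samples, which is why the whole run needs only 32 updates.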
Framework Versions
- Python: 3.12.6
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}