language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4012
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-base-en-v1.5
widget:
- source_sentence: Do cephalopods use RNA editing less frequently than other species?
sentences:
- >-
Extensive messenger RNA editing generates transcript and protein
diversity in genes involved in neural excitability, as previously
described, as well as in genes participating in a broad range of other
cellular functions.
- >-
GV1001 is a 16-amino-acid vaccine peptide derived from the human
telomerase reverse transcriptase sequence. It has been developed as a
vaccine against various cancers.
- >-
Using acetyl-specific K516 antibodies, we show that acetylation of
endogenous S6K1 at this site is potently induced upon growth factor
stimulation. We propose that K516 acetylation may serve to modulate
important kinase-independent functions of S6K1 in response to growth
factor signalling. Following mitogen stimulation, S6Ks interact with the
p300 and p300/CBP-associated factor (PCAF) acetyltransferases. S6Ks can
be acetylated by p300 and PCAF in vitro and S6K acetylation is detected
in cells expressing p300
- source_sentence: Can pets affect infant microbiomed?
sentences:
- >-
Yes, exposure to household furry pets influences the gut microbiota of
infants.
- >-
Thiazovivin is a selective small molecule that directly targets
Rho-associated kinase (ROCK) and increases expression of pluripotency
factors.
- ' Here, we present evidence that the calcium/calmodulin-dependent protein kinase IV (CaMK4) is increased and required during Th17 cell differentiation. Inhibition of CaMK4 reduced Il17 transcription through decreased activation of the cAMP response element modulator a (CREM-a) and reduced activation of the AKT/mTOR pathway, which is known to enhance Th17 differentiation. CAMK4 knockdown and kinase-dead mutant inhibited crocin-mediated HO-1 expression, Nrf2 activation, and phosphorylation of Akt, indicating that HO-1 expression is mediated by CAMK4 and that Akt is a downstream mediator of CAMK4 in crocin signaling'
- source_sentence: >-
In what proportion of children with heart failure has Enalapril been shown
to be safe and effective?
sentences:
- >-
5-HT2A (5-hydroxytryptamine type 2a) receptor can be evaluated with the
[18F]altanserin.
- >-
In children with heart failure evidence of the effect of enalapril is
empirical. Enalapril was clinically safe and effective in 50% to 80% of
for children with cardiac failure secondary to congenital heart
malformations before and after cardiac surgery, impaired ventricular
function , valvar regurgitation, congestive cardiomyopathy, , arterial
hypertension, life-threatening arrhythmias coexisting with circulatory
insufficiency.
ACE inhibitors have shown a transient beneficial effect on heart failure
due to anticancer drugs and possibly a beneficial effect in muscular
dystrophy-associated cardiomyopathy, which deserves further studies.
- |-
necroptosis
apoptosis
pro-survival/inflammation NF-κB activation
- source_sentence: How are SAHFS created?
sentences:
- >-
In particular, up to 17% of neutrophil nuclei of healthy women exhibit a
drumstick-shaped appendage that contains the inactive X chromosome.
- >-
miR-1, miR-133, miR-208a, miR-206, miR-494, miR-146a, miR-222, miR-21,
miR-221, miR-20a, miR-133a, miR-133b, miR-23, miR-107 and miR-181 are
involved in exercise adaptation
- >-
Cellular senescence-associated heterochromatic foci (SAHFS) are a novel
type of chromatin condensation involving alterations of linker histone
H1 and linker DNA-binding proteins. SAHFS can be formed by a variety of
cell types, but their mechanism of action remains unclear.
- source_sentence: >-
What are the effects of the deletion of all three Pcdh clusters
(tricluster deletion) in mice?
sentences:
- >-
Multicluster Pcdh diversity is required for mouse olfactory neural
circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell
surface proteins are encoded by three closely linked gene clusters
(Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters
had subtle phenotypic consequences, the loss of all three clusters
(tricluster deletion) led to a severe axonal arborization defect and
loss of self-avoidance.
- >-
The myocyte enhancer factor-2 (MEF2) proteins are MADS-box transcription
factors that are essential for differentiation of all muscle lineages
but their mechanisms of action remain largely undefined. MEF2C
expression initiates cardiomyogenesis, resulting in the up-regulation of
Brachyury T, bone morphogenetic protein-4, Nkx2-5, GATA-4, cardiac
alpha-actin, and myosin heavy chain expression. Inactivation of the
MEF2C gene causes cardiac developmental arrest and severe downregulation
of a number of cardiac markers including atrial natriuretic factor
(ANF). BMP-2, a regulator of cardiac development during embryogenesis,
was shown to increase PI 3-kinase activity in cardiac precursor cells,
resulting in increased expression of sarcomeric myosin heavy chain (MHC)
and MEF-2A. Furthermore, expression of MEF-2A increased MHC expression
in a PI 3-kinase-dependent manner. Other studies showed that Gli2 and
MEF2C proteins form a complex, capable of synergizing on
cardiomyogenesis-related promoters. Dominant interference of
calcineurin/mAKAP binding blunts the increase in MEF2 transcriptional
activity seen during myoblast differentiation, as well as the expression
of endogenous MEF2-target genes. These findings show that MEF-2 can
direct early stages of cell differentiation into a cardiomyogenic
pathway.
- >-
Investigators proposed that there have been three extended periods in
the evolution of gene regulatory elements. Early vertebrate evolution
was characterized by regulatory gains near transcription factors and
developmental genes, but this trend was replaced by innovations near
extracellular signaling genes, and then innovations near
posttranslational protein modifiers.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: BGE Base Biomedical MRL
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.7524752475247525
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8628005657708628
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8995756718528995
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9222065063649222
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7524752475247525
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5973597359735974
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.5162659123055162
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.3977369165487977
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.2341252729014147
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.3973567239272255
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4854465714352775
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6062286842357961
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6940262144509974
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.813453896410049
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6257720133395309
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.7538896746817539
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8585572842998586
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8953323903818954
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9207920792079208
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7538896746817539
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5964167845355963
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.5142857142857143
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.3977369165487976
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.2333750448173818
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.39469849211764985
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4795534995350502
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.604605995019471
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6913260437859404
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8125008419209268
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6197252995126041
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.7355021216407355
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8486562942008486
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8868458274398868
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9137199434229137
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7355021216407355
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5818010372465817
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.5018387553041018
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.38896746817538896
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.22793567434972276
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.37898311614248786
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4645337797167325
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5878379619993058
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6742555106189646
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7975213847915401
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6002622001635138
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.7057991513437057
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8132956152758133
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8500707213578501
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8953323903818954
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7057991513437057
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5535124941065536
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.47355021216407356
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.36605374823196607
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.2151445774205944
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.3572108621267904
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4326304442151515
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5469428314195238
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6357672212173915
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7700377180575202
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5565977979127998
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.6265912305516266
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7666195190947667
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.809052333804809
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.85997171145686
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6265912305516266
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5002357378595002
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.4291371994342292
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.3312588401697313
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.18851019998558088
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.31756777149198423
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.38736111738995704
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.49729865330882483
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.5709082950268725
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7040951033878895
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.4804032390680516
name: Cosine Map@100
BGE Base Biomedical MRL
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: json
- Language: en
- License: apache-2.0
Model Sources
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```shell
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("potsu-potsu/bge-base-biomedical-matryoshka")

# Run inference
sentences = [
    'What are the effects of the deletion of all three Pcdh clusters (tricluster deletion) in mice?',
    'Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss of all three clusters (tricluster deletion) led to a severe axonal arborization defect and loss of self-avoidance.',
    'Investigators proposed that there have been three extended periods in the evolution of gene regulatory elements. Early vertebrate evolution was characterized by regulatory gains near transcription factors and developmental genes, but this trend was replaced by innovations near extracellular signaling genes, and then innovations near posttranslational protein modifiers.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores between all pairs of embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
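Because training used MatryoshkaLoss over dimensions 768/512/256/128/64, the 768-d output can be truncated to a shorter prefix and re-normalized with modest quality loss (see the per-dimension evaluation below). The following is a minimal NumPy sketch of that post-hoc truncation step, with stand-in vectors rather than real model output; recent sentence-transformers releases also accept a `truncate_dim` argument when loading the model.

```python
import numpy as np

def truncate_and_renormalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` Matryoshka dimensions and re-normalize,
    so cosine similarity is again a plain dot product."""
    prefix = embeddings[:, :dim]
    return prefix / np.linalg.norm(prefix, axis=1, keepdims=True)

# Stand-in vectors; the real model outputs 768-d normalized embeddings.
full = np.random.randn(3, 768)
small = truncate_and_renormalize(full, 256)
print(small.shape)  # (3, 256)
```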
Evaluation
Metrics
Information Retrieval (dataset: `dim_768`)

| Metric | Value |
|:-------|:------|
| cosine_accuracy@1 | 0.7525 |
| cosine_accuracy@3 | 0.8628 |
| cosine_accuracy@5 | 0.8996 |
| cosine_accuracy@10 | 0.9222 |
| cosine_precision@1 | 0.7525 |
| cosine_precision@3 | 0.5974 |
| cosine_precision@5 | 0.5163 |
| cosine_precision@10 | 0.3977 |
| cosine_recall@1 | 0.2341 |
| cosine_recall@3 | 0.3974 |
| cosine_recall@5 | 0.4854 |
| cosine_recall@10 | 0.6062 |
| cosine_ndcg@10 | 0.694 |
| cosine_mrr@10 | 0.8135 |
| cosine_map@100 | 0.6258 |
Information Retrieval (dataset: `dim_512`)

| Metric | Value |
|:-------|:------|
| cosine_accuracy@1 | 0.7539 |
| cosine_accuracy@3 | 0.8586 |
| cosine_accuracy@5 | 0.8953 |
| cosine_accuracy@10 | 0.9208 |
| cosine_precision@1 | 0.7539 |
| cosine_precision@3 | 0.5964 |
| cosine_precision@5 | 0.5143 |
| cosine_precision@10 | 0.3977 |
| cosine_recall@1 | 0.2334 |
| cosine_recall@3 | 0.3947 |
| cosine_recall@5 | 0.4796 |
| cosine_recall@10 | 0.6046 |
| cosine_ndcg@10 | 0.6913 |
| cosine_mrr@10 | 0.8125 |
| cosine_map@100 | 0.6197 |
Information Retrieval (dataset: `dim_256`)

| Metric | Value |
|:-------|:------|
| cosine_accuracy@1 | 0.7355 |
| cosine_accuracy@3 | 0.8487 |
| cosine_accuracy@5 | 0.8868 |
| cosine_accuracy@10 | 0.9137 |
| cosine_precision@1 | 0.7355 |
| cosine_precision@3 | 0.5818 |
| cosine_precision@5 | 0.5018 |
| cosine_precision@10 | 0.389 |
| cosine_recall@1 | 0.2279 |
| cosine_recall@3 | 0.379 |
| cosine_recall@5 | 0.4645 |
| cosine_recall@10 | 0.5878 |
| cosine_ndcg@10 | 0.6743 |
| cosine_mrr@10 | 0.7975 |
| cosine_map@100 | 0.6003 |
Information Retrieval (dataset: `dim_128`)

| Metric | Value |
|:-------|:------|
| cosine_accuracy@1 | 0.7058 |
| cosine_accuracy@3 | 0.8133 |
| cosine_accuracy@5 | 0.8501 |
| cosine_accuracy@10 | 0.8953 |
| cosine_precision@1 | 0.7058 |
| cosine_precision@3 | 0.5535 |
| cosine_precision@5 | 0.4736 |
| cosine_precision@10 | 0.3661 |
| cosine_recall@1 | 0.2151 |
| cosine_recall@3 | 0.3572 |
| cosine_recall@5 | 0.4326 |
| cosine_recall@10 | 0.5469 |
| cosine_ndcg@10 | 0.6358 |
| cosine_mrr@10 | 0.77 |
| cosine_map@100 | 0.5566 |
Information Retrieval (dataset: `dim_64`)

| Metric | Value |
|:-------|:------|
| cosine_accuracy@1 | 0.6266 |
| cosine_accuracy@3 | 0.7666 |
| cosine_accuracy@5 | 0.8091 |
| cosine_accuracy@10 | 0.86 |
| cosine_precision@1 | 0.6266 |
| cosine_precision@3 | 0.5002 |
| cosine_precision@5 | 0.4291 |
| cosine_precision@10 | 0.3313 |
| cosine_recall@1 | 0.1885 |
| cosine_recall@3 | 0.3176 |
| cosine_recall@5 | 0.3874 |
| cosine_recall@10 | 0.4973 |
| cosine_ndcg@10 | 0.5709 |
| cosine_mrr@10 | 0.7041 |
| cosine_map@100 | 0.4804 |
Training Details
Training Dataset
json
- Dataset: json
- Size: 4,012 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|:--|:--|:--|
| type | string | string |
| details | min: 5 tokens<br>mean: 16.13 tokens<br>max: 49 tokens | min: 3 tokens<br>mean: 63.38 tokens<br>max: 485 tokens |
- Samples:
| anchor | positive |
|:--|:--|
| What is the implication of histone lysine methylation in medulloblastoma? | Aberrant patterns of H3K4, H3K9, and H3K27 histone lysine methylation were shown to result in histone code alterations, which induce changes in gene expression, and affect the proliferation rate of cells in medulloblastoma. |
| What is the role of STAG1/STAG2 proteins in differentiation? | STAG1/STAG2 proteins are tumour suppressor proteins that suppress cell proliferation and are essential for differentiation. |
| What is the association between cell phone use and glioblastoma? | The association between cell phone use and incident glioblastoma remains unclear. Some studies have reported that cell phone use was associated with incident glioblastoma, and with reduced survival of patients diagnosed with glioblastoma. However, other studies have repeatedly replicated to find an association between cell phone use and glioblastoma. |
- Loss: MatryoshkaLoss with these parameters:

```json
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
```
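For intuition, the combination above can be sketched in plain NumPy: MultipleNegativesRankingLoss treats every other positive in the batch as a negative for a given anchor, and MatryoshkaLoss re-applies that loss to each truncated embedding prefix. This is an illustrative sketch of the objective, not the sentence-transformers implementation.

```python
import numpy as np

def mnrl(anchors, positives, scale=20.0):
    """In-batch-negatives cross-entropy (the idea behind
    MultipleNegativesRankingLoss): row i's positive is the target,
    every other positive in the batch acts as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                    # (batch, batch) cosine scores
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def matryoshka_mnrl(anchors, positives,
                    dims=(768, 512, 256, 128, 64), weights=(1, 1, 1, 1, 1)):
    """MatryoshkaLoss idea: apply the base loss to each truncated prefix."""
    return sum(w * mnrl(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))
```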
Training Hyperparameters
Non-Default Hyperparameters
- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates
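As a quick sanity check on these settings: assuming a single device, the effective batch size is per_device_train_batch_size × gradient_accumulation_steps, which also explains the step counts in the Training Logs below.

```python
import math

samples = 4012                  # training set size from this card
per_device_batch = 32
grad_accum = 16
effective_batch = per_device_batch * grad_accum   # samples per optimizer step
steps_per_epoch = math.ceil(samples / effective_batch)
print(effective_batch, steps_per_epoch)  # 512 8 — epoch 1.0 lands at step 8 in the logs
```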
All Hyperparameters
Click to expand
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
Training Logs
| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:-----:|:----:|:-------------:|:---:|:---:|:---:|:---:|:---:|
| 1.0 | 8 | - | 0.7106 | 0.7071 | 0.683 | 0.6384 | 0.5326 |
| 1.2540 | 10 | 25.4992 | - | - | - | - | - |
| 2.0 | 16 | - | 0.6976 | 0.6942 | 0.6763 | 0.6375 | 0.5635 |
| 2.5079 | 20 | 11.3871 | - | - | - | - | - |
| 3.0 | 24 | - | 0.6940 | 0.6907 | 0.6745 | 0.6365 | 0.5697 |
| 3.7619 | 30 | 8.6795 | - | - | - | - | - |
| **4.0** | **32** | - | **0.6940** | **0.6913** | **0.6743** | **0.6358** | **0.5709** |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.12.5
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.7.1+cu128
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title = {Matryoshka Representation Learning},
    author = {Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year = {2024},
    eprint = {2205.13147},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
```
MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title = {Efficient Natural Language Response Suggestion for Smart Reply},
    author = {Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year = {2017},
    eprint = {1705.00652},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}
```