SentenceTransformer
This is a sentence-transformers model trained on the cellxgene_pseudo_bulk_35k_pairs_natural_language_annotation and geo_70k_multiplets_natural_language_annotation_cs50 datasets. It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: not specified (reported as None)
- Output Dimensionality: 2048 dimensions
- Similarity Function: Cosine Similarity
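- Training Datasets: cellxgene_pseudo_bulk_35k_pairs_natural_language_annotation, geo_70k_multiplets_natural_language_annotation_cs50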
- Language: code
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30522, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (text_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=768, out_features=2048, bias=True)
        (1): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (pooling): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
)
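The Pooling configuration above enables only `pooling_mode_mean_tokens`. As a rough illustration of what mean-token pooling does, the sketch below averages the token vectors that an attention mask marks as real tokens. The function name `mean_pool` and the 2-dimensional toy vectors are assumptions made for this example; the model itself works with 2048-dimensional vectors, and its internal implementation may differ.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; padded positions are ignored."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: two real tokens and one padded position (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padded third vector does not influence the result, which is the point of masking before averaging.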
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("jo-mengr/mmcontext-cg_35k-geo_70k-natural_language_annotation-pubmedbert-2048-text_only_50-feat_cs")
# Run inference
sentences = [
'ACTB EEF1A1 RPL10 RPS12 RPL13 MALAT1 FTL RPL28 RPS27A SRGN TPT1 RPS19 RPS15 RPL29 TMSB10 RPL34 PRDX1 ALOX5AP MT-CYB SEPTIN7 TUBA1B RPL11 JUNB PTMA TMSB4X CD74 PKM VSIR SARAF MT-ND4L CFLAR HERPUD1 SLC2A3 HSPE1 ARGLU1 RPS24 ALDOA HSP90B1 RBPJ S100A6 KMT2E MDH1 VIM TNFRSF1B CELF2 ATP2B4 EPB41L2 SMAP2 POLR2F MYH9',
"This measurement was conducted with 10x 5' v1. Activated CD8-positive, alpha-beta T cell derived from blood of a 26-year-old male.",
"This measurement was conducted with 10x 3' v3. Activated fibroblast cell sample taken from the interventricular septum of a female human donor in her early 40s, with European self-reported ethnicity.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 2048]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
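In the example above, the first input is a gene list and the other two are sample descriptions, so one plausible use is ranking descriptions against a gene-list query. The sketch below shows that ranking step with plain cosine similarity, the similarity function listed for this model. It uses small stand-in vectors instead of real `model.encode` output so that it runs without downloading the model; `rank_candidates` and the toy vectors are hypothetical names and values chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Stand-in vectors; in practice these would be rows of model.encode(...).
query_vec = [1.0, 0.0]
candidate_vecs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranked = rank_candidates(query_vec, candidate_vecs)
print(ranked)
```

With real embeddings, the query would be the encoded gene list and the candidates the encoded descriptions.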
Evaluation
Metrics
Binary Classification
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.6496 |
cosine_accuracy_threshold | 0.2496 |
cosine_f1 | 0.7139 |
cosine_f1_threshold | 0.1882 |
cosine_precision | 0.5882 |
cosine_recall | 0.9079 |
cosine_ap | 0.6458 |
cosine_mcc | 0.3243 |
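The reported values can be cross-checked with a little arithmetic: F1 is the harmonic mean of precision and recall, so 0.5882 and 0.9079 should give roughly 0.7139. The sketch below performs that check and also shows how a cosine threshold such as 0.1882 turns similarity scores into binary predictions. The helper names, the toy scores, and the convention that scores at or above the threshold count as positive are assumptions for this example, not the evaluator's exact implementation.

```python
from typing import Dict, Sequence


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def metrics_at_threshold(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Dict[str, float]:
    """Predict positive when score >= threshold (assumed convention), then score the predictions."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Consistency check on the reported precision and recall.
reported_f1 = f1_from_precision_recall(0.5882, 0.9079)
print(round(reported_f1, 4))

# Hypothetical cosine scores and labels, thresholded at the reported F1 threshold.
toy = metrics_at_threshold([0.30, 0.20, 0.10, 0.25], [1, 1, 0, 0], threshold=0.1882)
print(toy)
```

The high recall and lower precision in the table are consistent with a low threshold: most true pairs clear it, but so do many non-matching pairs.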
Triplet
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.7154 |
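Triplet accuracy is commonly defined as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. A minimal sketch of that definition under cosine similarity follows; `triplet_accuracy` and the toy vectors are hypothetical, and the real evaluator may handle ties or margins differently.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def triplet_accuracy(triplets: Sequence[Tuple[Vector, Vector, Vector]]) -> float:
    """Share of triplets where sim(anchor, positive) is strictly greater than sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Two toy triplets: the first is ranked correctly, the second is not.
toy_triplets = [
    ([1.0, 0.0], [1.0, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.0, 2.0]),
]
accuracy = triplet_accuracy(toy_triplets)
print(accuracy)
```

Under this definition, the reported value of 0.7154 would mean that about 72% of evaluation triplets are ordered correctly.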
Training Details
Training Datasets
cellxgene_pseudo_bulk_35k_pairs_natural_language_annotation
- Dataset: cellxgene_pseudo_bulk_35k_pairs_natural_language_annotation at 30b5d31
- Size: 54,124 training samples
- Columns: sentence_1, sentence_2, and label
- Approximate statistics based on the first 1000 samples:
Statistic | sentence_1 | sentence_2 | label |
---|---|---|---|
type | string | string | float |
min | 283 characters | 97 characters | 0.0 |
mean | 308.56 characters | 248.58 characters | 0.5 |
max | 328 characters | 921 characters | 1.0 |
- Samples:
sentence_1 | sentence_2 | label |
---|---|---|
CD74 H4C3 GAPDH TPT1 LAPTM5 TMSB10 HSP90AA1 H2AZ1 TSC22D3 HMGB2 DNAJB1 IRAG2 SH3BGRL3 COTL1 HLA-DRB5 ATF5 HERPUD1 FTL KLF6 FOS IFITM2 ISG20 PMAIP1 SMC4 RGS1 DUSP1 SRGN STK17B VIM RGS2 HSPB1 NUSAP1 IFITM3 ANP32E MKI67 BTG2 HSPH1 PLAC8 DUSP2 SMIM14 PTTG1 MZB1 DDIT3 GADD45B RIPOR2 SGK1 KLF2 MANF RAD21 PCLAF | This measurement was conducted with 10x 5' v1. A cycling tonsil germinal center B cell derived from a 3-year-old human male. | 1.0 |
CD74 H4C3 GAPDH TPT1 LAPTM5 TMSB10 HSP90AA1 H2AZ1 TSC22D3 HMGB2 DNAJB1 IRAG2 SH3BGRL3 COTL1 HLA-DRB5 ATF5 HERPUD1 FTL KLF6 FOS IFITM2 ISG20 PMAIP1 SMC4 RGS1 DUSP1 SRGN STK17B VIM RGS2 HSPB1 NUSAP1 IFITM3 ANP32E MKI67 BTG2 HSPH1 PLAC8 DUSP2 SMIM14 PTTG1 MZB1 DDIT3 GADD45B RIPOR2 SGK1 KLF2 MANF RAD21 PCLAF | This measurement was conducted with 10x 5' v1. Plasmablast cells derived from a 3-year-old human with recurrent tonsillitis, characterized by productive, in-frame IGH with IGHD5-1801, IGHD5-501, IGHJ402, IGHV4-6102, IGKV1-33, IGKJ4, and class-switched to IgG1, as determined through single-cell transcriptomics matched with bulk and single-cell antibody repertoires. | 0.0 |
REG1A LYZ ITLN2 REG1B PRSS3 GUCA2A TPT1 CLCA1 ITLN1 JUN SSR4 TMSB10 HSP90B1 RN7SKP176 LCN2 FOS FTL KRT18 HSP90AA1 GAPDH S100A6 XBP1 MT1G FOSB BTG2 IER2 XIST DPEP1 HERPUD1 CCND2 EGR1 IFI6 SH3BGRL3 ATF3 RETNLB SEC11C KLK12 KLF6 NFKBIA HM13 HSPB1 SPINK4 PRDX4 NR4A1 SOX4 MT2A SDF2L1 FKBP11 IFITM3 S100A11 | This measurement was conducted with 10x 3' v3. Paneth cells derived from the terminal ileum of a female human in her sixth decade, with non-inflamed tissue, contributing to the understanding of cell-type-specific transcriptional heterogeneity in Crohn's disease. | 1.0 |
- Loss: ContrastiveLoss with these parameters: { "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE", "margin": 0.5, "size_average": true }
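To make the parameters above concrete, here is a minimal pure-Python sketch of the contrastive loss of Hadsell et al. with cosine distance (1 minus cosine similarity), a margin of 0.5, and averaging over pairs. Matching pairs (label 1) are penalised by their squared distance, and non-matching pairs (label 0) only when their distance falls below the margin. The function names and toy pairs are hypothetical; the library's tensor implementation is the reference, not this sketch.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity, so identical directions give 0 and orthogonal ones give 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def contrastive_loss(
    pairs: Sequence[Tuple[Vector, Vector, float]], margin: float = 0.5
) -> float:
    """Mean over pairs of 0.5 * (y * d^2 + (1 - y) * max(0, margin - d)^2)."""
    losses = []
    for u, v, label in pairs:
        d = cosine_distance(u, v)
        positive_term = label * d ** 2
        negative_term = (1.0 - label) * max(0.0, margin - d) ** 2
        losses.append(0.5 * (positive_term + negative_term))
    return sum(losses) / len(losses)  # corresponds to size_average = true


same = [1.0, 0.0]
orthogonal = [0.0, 1.0]
print(contrastive_loss([(same, same, 1.0)]))        # matching pair, distance 0
print(contrastive_loss([(same, same, 0.0)]))        # non-matching pair, distance 0
print(contrastive_loss([(same, orthogonal, 0.0)]))  # non-matching pair beyond the margin
```

Non-matching pairs whose cosine distance already exceeds 0.5 contribute nothing, so training effort concentrates on negatives that still sit close to each other.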
geo_70k_multiplets_natural_language_annotation_cs50
- Dataset: geo_70k_multiplets_natural_language_annotation_cs50 at bb8f2e2
- Size: 62,038 training samples
- Columns: anchor, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive | negative_1 | negative_2 |
---|---|---|---|---|
type | string | string | string | string |
min | 280 characters | 79 characters | 70 characters | 9 characters |
mean | 331.66 characters | 194.69 characters | 190.41 characters | 9.6 characters |
max | 462 characters | 752 characters | 729 characters | 10 characters |
- Samples:
anchor | positive | negative_1 | negative_2 |
---|---|---|---|
IGKC IGHG1 IGHG1 COL1A2 HLA-C COL1A1 TALAM1 COL3A1 MT-RNR1 SERPINA3 S100B KRT6A COL6A1 CD44 COL6A2 S100A9 HLA-B LYZ TNC IGLC2 IFITM3 POSTN C1QC SAT1 LENG8 IGHA1 CCND1 KRT16 RPS9 BACE2 IGHG2 EGR1 IGFBP5 XIST HLA-B GOLGA8A PRRC2A COL12A1 FOS EDNRB IGFBP3 MAP1B HLA-B HLA-A SYNM IGLC1 ARRDC3 MT2A PECAM1 SPP1 | This measurement was conducted with Illumina HiSeq 2000. Primary melanoma cell line from a patient with melanoma, a type of skin cancer characterized by uncontrolled cellular proliferation. | This measurement was conducted with Illumina HiSeq 4000. Human airway smooth muscle (HASM) cells from Patient 5 treated with GNAS-knockdown budesonide. | SRX101664 |
IGKC COL1A1 IGHG1 IGHG1 COL1A2 COL3A1 IGHA1 IGHG2 IGLC2 HLA-C COL6A2 TALAM1 COL6A1 HLA-B IGLC1 MT-RNR1 COL12A1 IGFBP5 KRT6A HLA-A COL5A1 TNC IFITM3 CD44 IGLL5 S100B S100A9 IGHA2 POSTN IGKV4-1 PECAM1 LENG8 RPS9 C1QC IGHG2 SAT1 LYZ HLA-C SERPINA3 PDGFRB HLA-B CCND1 FOS BACE2 FBLN2 IGHV3-33 KRT16 IFITM1 EDNRB HLA-C | This measurement was conducted with Illumina HiSeq 2000. Primary melanoma cell line sample from primary melanoma #2. | This measurement was conducted with Illumina NovaSeq 6000. 103-018 is a patient from which PBMC (Peripheral Blood Mononuclear Cells) cells were obtained and subjected to single cell RNA sequencing prior to vaccine treatment. | SRX101665 |
RGS2 MT-RNR1 DLGAP5 ENSG00000258232 CCNB1 FN1 NFKBIA RRM2 FDFT1 SPDL1 SAMHD1 UBE2C ASPM GNG12 EPAS1 PPFIA1 SOCS7 CSF1R TOP2A ENSG00000281383 SOCS2 CDC20 ENSG00000265401 POLR2A RGS16 CTSV ZNF280B GTF2IP4 CDC6 NUSAP1 MBP HSPA1B GLIPR1 IGSF10 ENSG00000273673 ENSG00000259357 MT2A SLC7A11 CHAC1 HSPA1B NUDT3 KIF20A LINC01304 RRN3 COCH ENSG00000259132 SERPINE2 ADGRG6 CDK1 EIF3CL | This measurement was conducted with Illumina HiSeq 2000. 2-cell embryo #3 - Cell #1, which is a human preimplantation blastomere at the early blastomere stage, derived from human preimplantation embryos. | This measurement was conducted with Illumina HiSeq 2000. Human preimplantation blastomere cells at the early blastomere stage, derived from human preimplantation embryos. | SRX130007 |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
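The idea behind this loss can be sketched as a softmax cross-entropy over scaled cosine similarities: for each anchor, its own positive should outscore every other candidate in the batch. The pure-Python version below uses `scale = 20.0` as listed above. The function name, the optional `extra_negatives` argument (standing in for explicit negative columns), and the toy vectors are assumptions for illustration and not the library's code.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(
    anchors: Sequence[Vector],
    positives: Sequence[Vector],
    extra_negatives: Optional[Sequence[Vector]] = None,
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates: List[Vector] = list(positives) + list(extra_negatives or [])
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine_similarity(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

When each anchor matches its own positive the loss is close to zero, and when the positives are swapped it is close to the scale value, which shows how the scale sets the penalty for a confidently wrong ranking.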
Evaluation Datasets
cellxgene_pseudo_bulk_35k_pairs_natural_language_annotation
- Dataset: cellxgene_pseudo_bulk_35k_pairs_natural_language_annotation at 30b5d31
- Size: 6,230 evaluation samples
- Columns: sentence_1, sentence_2, and label
- Approximate statistics based on the first 1000 samples:
Statistic | sentence_1 | sentence_2 | label |
---|---|---|---|
type | string | string | float |
min | 276 characters | 90 characters | 0.0 |
mean | 303.74 characters | 211.0 characters | 0.5 |
max | 344 characters | 953 characters | 1.0 |
- Samples:
sentence_1 | sentence_2 | label |
---|---|---|
MALAT1 CD74 EEF1A1 RPS12 RPL10 RPL13 RPL28 RPL34 RPS27A RPL11 ACTB RPS19 TMSB4X TPT1 BTG1 KLF2 RPL29 RPS15 HLA-DRB5 TSC22D3 RPS24 PTMA TXNIP TMSB10 CD37 LAPTM5 JUN FTL VIM CXCR4 MAP3K8 CD69 JUNB CYBA SARAF BASP1 EZR CIRBP LCP1 TCF4 CYTIP CD44 MEF2C SMCHD1 MTDH IFITM2 RCSD1 MDH1 HSPA8 DUSP1 | This measurement was conducted with 10x 5' v1. Memory B cell sample taken from a 9-year old female tonsil, with IGHV3-1/IGLV3-1/IGKV1-5 antibody isotype. | 1.0 |
MALAT1 CD74 EEF1A1 RPS12 RPL10 RPL13 RPL28 RPL34 RPS27A RPL11 ACTB RPS19 TMSB4X TPT1 BTG1 KLF2 RPL29 RPS15 HLA-DRB5 TSC22D3 RPS24 PTMA TXNIP TMSB10 CD37 LAPTM5 JUN FTL VIM CXCR4 MAP3K8 CD69 JUNB CYBA SARAF BASP1 EZR CIRBP LCP1 TCF4 CYTIP CD44 MEF2C SMCHD1 MTDH IFITM2 RCSD1 MDH1 HSPA8 DUSP1 | This measurement was conducted with 10x 5' v1. A memory B cell derived from a 6-year old human individual, with IgM isotype, IGHJ402, IGHV4-5908, IGLC2, and IGLV1-44 IGLJ2 genes, and a junction length of 51 nucleotides. | 0.0 |
MALAT1 GPC5 GRM3 EBF1 DACH1 ATP1A2 PRKG1 ZEB1 EPS8 PLCB4 LAMA2 GRM8 RGS5 LHFPL6 MAML2 ZFHX3 ZBTB20 INPP4B CALD1 SPARCL1 SLC6A1 DLEU2 UTRN IGFBP7 MT-ND4 GPC6 SLC1A3 COBLL1 PDZD2 PRKCH ARHGAP6 ARHGAP10 ACTB TIMP3 LPP ARHGAP42 ADAMTS9-AS2 NR2F2-AS1 SYNE2 MGLL SEMA5A MYO1B SOX5 ARHGAP29 RBMS1 COLEC12 PDE7B TCF4 MT-ND1 NEAT1 | This measurement was conducted with 10x 3' v3. Mural cell from white matter of cerebellum from a 71-year-old male individual, preserved through cryopreservation. | 1.0 |
- Loss: ContrastiveLoss with these parameters: { "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE", "margin": 0.5, "size_average": true }
geo_70k_multiplets_natural_language_annotation_cs50
- Dataset: geo_70k_multiplets_natural_language_annotation_cs50 at bb8f2e2
- Size: 6,872 evaluation samples
- Columns: anchor, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive | negative_1 | negative_2 |
---|---|---|---|---|
type | string | string | string | string |
min | 279 characters | 80 characters | 79 characters | 9 characters |
mean | 312.83 characters | 191.16 characters | 182.15 characters | 10.36 characters |
max | 379 characters | 796 characters | 727 characters | 11 characters |
- Samples:
anchor | positive | negative_1 | negative_2 |
---|---|---|---|
EEF1A1 HLA-C HLA-B EEF2 HLA-A RPL19 HLA-B HLA-B LCP1 PSAP HLA-A RPL29 S100A8 COTL1 RAC2 RPS15 S100A6 IFITM1 ATG16L2 RPL23AP42 TALDO1 HLA-A PTPRC RPS9 ARHGAP30 SUN2 HLA-DRA HLA-DRA HLA-DRA PTPN6 RGS2 FCGRT TSC22D3 SLC44A2 PTMA H3P6 RPL9 FLOT2 ENSG00000237550 CTSD HLA-DRB1 HNRNPK CHI3L1 RPS4Y1 SORL1 HLA-B HLA-B TRAF3IP3 UCP2 SSR2 | This measurement was conducted with NextSeq 500. 1741_009 whole blood sample taken at 17 hours, with a local time of 0:00, 15 hours after the onset of salivary dim light melatonin onset (DLMO), serum melatonin level of 70.2 pg/mL, and belonging to the ontology terms "blood serum, blood". | This measurement was conducted with Illumina HiSeq 4000. 3-day post-symptom onset whole blood sample from a female patient with secondary dengue fever caused by DV1 virus. | SRX4017642 |
EEF1A1 MT2A RPL19 MT1X HLA-B CXCR4 HLA-A HLA-A RPS15 PTMA HLA-B HLA-C HLA-E SNHG29 HLA-B RPL3P4 HLA-B SAT1 HLA-DRB1 RPL9 HLA-DPA1 ENSG00000237550 HLA-DRA ANXA1 THBS1 EEF2 S100A10 PSAP NEAT1 RGCC CXCL8 S100A6 PIK3IP1 HLA-DPB1 SDCBP ENSG00000269968 LCP1 PTPRC RPLP0P9 SQSTM1 TALDO1 SLC3A2 HLA-DRA HLA-DRA HLA-DRA RPL41P2 LYSMD2 TNFAIP3 EIF3H EIF5 | This measurement was conducted with NextSeq 550. 6-hour stimulated human peripheral blood mononuclear cells (PBMCs) from donor 1 with Interferon gamma + Golgi inhibitor. | This measurement was conducted with Illumina HiSeq 2500. 28.75076148 year old female human heart tissue from European descent, not treated. | SRX11174633 |
EEF1A1 IGFBP5 TUBA1B EEF2 CLU CTSD PTMA RPL19 S100A10 GANAB NCL HNRNPK STC2 SLC40A1 NEAT1 SLC7A5 MED13L PSAP NUMA1 S100A6 TSPAN13 RPL9 PLK2 RPL29 TOP2A CALM2 HMGN1 RBBP7 ANLN RAB11FIP1 AKT1 EFEMP1 SSR2 MAPRE1 TPR SLC44A2 CENPF RAB5C CANX PPP1CC WDR26 SEC24C TPX2 H2AZ1 EIF3H SUMF2 SLC3A2 FDFT1 ZWINT CDK1 | This measurement was conducted with Illumina NovaSeq 6000. T47D cells, specifically the stable ESR1-e6>TCF12 fusion expressing subtype, cultured in hormone-deprived media with estrogen (E2) treatment. | This measurement was conducted with Illumina HiSeq 2500. Female breast epithelial tissue from an adult human, specifically from the basal location, obtained through mammoplasty reduction surgery. No treatment was reported for this sample. | SRX13441834 |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 512
- per_device_eval_batch_size: 512
- learning_rate: 0.0002
- num_train_epochs: 16
- warmup_ratio: 0.1
- fp16: True
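These settings combine a peak learning rate of 0.0002 with a warmup ratio of 0.1 over 16 epochs, and the full hyperparameter list names a linear scheduler. The sketch below estimates what that schedule looks like. The steps-per-epoch figure is not stated in this card; it is inferred here from the training log, where step 50 corresponds to epoch 0.2193. The schedule formula is the usual linear warmup followed by linear decay and is an assumption about the exact behaviour, including how warmup steps are rounded.

```python
import math


def linear_schedule_lr(
    step: int, total_steps: int, base_lr: float = 2e-4, warmup_ratio: float = 0.1
) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0 (assumed schedule)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Inferred from the training log (step 50 at epoch 0.2193); not stated explicitly.
steps_per_epoch = round(50 / 0.2193)
total_steps = steps_per_epoch * 16
warmup_steps = math.ceil(total_steps * 0.1)
print(steps_per_epoch, total_steps, warmup_steps)

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

Under these assumptions the learning rate peaks after roughly the first 1.6 epochs and then decays steadily for the rest of training.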
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 512
- per_device_eval_batch_size: 512
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0002
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 16
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | cellxgene pseudo bulk 35k pairs natural language annotation loss | geo 70k multiplets natural language annotation cs50 loss | cosine_ap | cosine_accuracy |
---|---|---|---|---|---|---|
0.2193 | 50 | 4.1334 | - | - | - | - |
0.4386 | 100 | 2.9835 | 0.1714 | 7.9182 | 0.5341 | 0.6011 |
0.6579 | 150 | 2.7435 | - | - | - | - |
0.8772 | 200 | 2.644 | 0.1408 | 7.3696 | 0.5438 | 0.6141 |
1.0965 | 250 | 1.7928 | - | - | - | - |
1.3158 | 300 | 1.994 | 0.0847 | 7.4545 | 0.5430 | 0.6157 |
1.5351 | 350 | 2.4587 | - | - | - | - |
1.7544 | 400 | 1.825 | 0.1020 | 7.9167 | 0.5489 | 0.6135 |
1.9737 | 450 | 1.7271 | - | - | - | - |
2.1930 | 500 | 1.7114 | 0.0775 | 7.3996 | 0.5484 | 0.6580 |
2.4123 | 550 | 1.4226 | - | - | - | - |
2.6316 | 600 | 1.5795 | 0.0821 | 7.8886 | 0.5492 | 0.6353 |
2.8509 | 650 | 1.3618 | - | - | - | - |
3.0702 | 700 | 1.4994 | 0.0937 | 7.5974 | 0.5699 | 0.6719 |
3.2895 | 750 | 1.1364 | - | - | - | - |
3.5088 | 800 | 1.4003 | 0.0870 | 7.3849 | 0.5765 | 0.6745 |
3.7281 | 850 | 1.1514 | - | - | - | - |
3.9474 | 900 | 1.1719 | 0.0743 | 7.5087 | 0.5698 | 0.6736 |
4.1667 | 950 | 1.1209 | - | - | - | - |
4.3860 | 1000 | 0.9998 | 0.0677 | 7.4844 | 0.5699 | 0.6641 |
4.6053 | 1050 | 1.0351 | - | - | - | - |
4.8246 | 1100 | 0.9489 | 0.0867 | 7.7366 | 0.5691 | 0.6717 |
5.0439 | 1150 | 1.2461 | - | - | - | - |
5.2632 | 1200 | 1.091 | 0.0873 | 7.4915 | 0.5736 | 0.6788 |
5.4825 | 1250 | 0.8159 | - | - | - | - |
5.7018 | 1300 | 0.9027 | 0.0796 | 7.4320 | 0.5841 | 0.6716 |
5.9211 | 1350 | 0.7999 | - | - | - | - |
6.1404 | 1400 | 0.7471 | 0.0869 | 7.4004 | 0.5829 | 0.6829 |
6.3596 | 1450 | 0.7991 | - | - | - | - |
6.5789 | 1500 | 0.8213 | 0.0648 | 7.3819 | 0.5817 | 0.6829 |
6.7982 | 1550 | 0.9306 | - | - | - | - |
7.0175 | 1600 | 0.7246 | 0.0618 | 7.3086 | 0.5921 | 0.6815 |
7.2368 | 1650 | 0.752 | - | - | - | - |
7.4561 | 1700 | 0.6573 | 0.0753 | 7.3722 | 0.5793 | 0.6810 |
7.6754 | 1750 | 0.8498 | - | - | - | - |
7.8947 | 1800 | 0.6371 | 0.0854 | 7.3546 | 0.5801 | 0.6845 |
8.1140 | 1850 | 0.6769 | - | - | - | - |
8.3333 | 1900 | 0.6239 | 0.0880 | 7.1975 | 0.5866 | 0.6919 |
8.5526 | 1950 | 0.7589 | - | - | - | - |
8.7719 | 2000 | 0.5615 | 0.0831 | 7.0378 | 0.5906 | 0.6991 |
8.9912 | 2050 | 0.7161 | - | - | - | - |
9.2105 | 2100 | 0.5501 | 0.0848 | 7.2601 | 0.5939 | 0.6854 |
9.4298 | 2150 | 0.6244 | - | - | - | - |
9.6491 | 2200 | 0.6028 | 0.0789 | 7.1073 | 0.5937 | 0.6931 |
9.8684 | 2250 | 0.5608 | - | - | - | - |
10.0877 | 2300 | 0.5667 | 0.0657 | 7.0947 | 0.5899 | 0.6815 |
10.3070 | 2350 | 0.5181 | - | - | - | - |
10.5263 | 2400 | 0.5437 | 0.0931 | 7.1204 | 0.6073 | 0.6922 |
10.7456 | 2450 | 0.467 | - | - | - | - |
10.9649 | 2500 | 0.6612 | 0.0877 | 6.8894 | 0.6050 | 0.7020 |
11.1842 | 2550 | 0.4867 | - | - | - | - |
11.4035 | 2600 | 0.4748 | 0.0948 | 6.8807 | 0.6051 | 0.6973 |
11.6228 | 2650 | 0.4308 | - | - | - | - |
11.8421 | 2700 | 0.5269 | 0.0785 | 6.7121 | 0.6132 | 0.6940 |
12.0614 | 2750 | 0.4727 | - | - | - | - |
12.2807 | 2800 | 0.4444 | 0.1072 | 6.7429 | 0.6183 | 0.7031 |
12.5 | 2850 | 0.3854 | - | - | - | - |
12.7193 | 2900 | 0.4202 | 0.0725 | 6.4541 | 0.6265 | 0.6998 |
12.9386 | 2950 | 0.4251 | - | - | - | - |
13.1579 | 3000 | 0.4774 | 0.0869 | 6.5179 | 0.6209 | 0.7029 |
13.3772 | 3050 | 0.2989 | - | - | - | - |
13.5965 | 3100 | 0.3446 | 0.0577 | 6.3269 | 0.6288 | 0.6966 |
13.8158 | 3150 | 0.3884 | - | - | - | - |
14.0351 | 3200 | 0.426 | 0.1001 | 6.4260 | 0.6389 | 0.7088 |
14.2544 | 3250 | 0.2721 | - | - | - | - |
14.4737 | 3300 | 0.3691 | 0.1048 | 6.3513 | 0.6413 | 0.7142 |
14.6930 | 3350 | 0.3212 | - | - | - | - |
14.9123 | 3400 | 0.3679 | 0.1045 | 6.3733 | 0.6440 | 0.7104 |
15.1316 | 3450 | 0.384 | - | - | - | - |
15.3509 | 3500 | 0.3666 | 0.0988 | 6.3379 | 0.6429 | 0.7122 |
15.5702 | 3550 | 0.2266 | - | - | - | - |
15.7895 | 3600 | 0.2685 | 0.1127 | 6.3607 | 0.6458 | 0.7154 |
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 4.1.0.dev0
- Transformers: 4.52.3
- PyTorch: 2.7.0+cu126
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
ContrastiveLoss
@inproceedings{hadsell2006dimensionality,
author={Hadsell, R. and Chopra, S. and LeCun, Y.},
booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
title={Dimensionality Reduction by Learning an Invariant Mapping},
year={2006},
volume={2},
number={},
pages={1735-1742},
doi={10.1109/CVPR.2006.100}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}