SentenceTransformer based on NeuML/pubmedbert-base-embeddings
This is a sentence-transformers model finetuned from NeuML/pubmedbert-base-embeddings on the train_MNR_hnm_scrna dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: NeuML/pubmedbert-base-embeddings
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: train_MNR_hnm_scrna
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
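The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name `mean_pool`, the 4-dimensional toy vectors (the real model uses 768), and the attention mask are illustrative assumptions.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]


# Toy input (hypothetical values): three tokens, the last one is padding.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Because the padded token is masked out, only the first two vectors contribute to the average.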
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("mariakrissmer/pubmedbert_jonatan_100k_50s_20250711")
# Run inference
sentences = [
'The pattern MALAT1, JUN, HSP90AA1, JUND, TMSB4X, MT-CO1, RPLP1, RPS27, HSP90AB1, UBC, TPT1, EIF1, H3-3B, DNAJB1, RPS12, RPL10, ACTB, RPL28, RPS3, HSPA6, RPS14, RPL13, MT-ATP6, RPS15, RPL30, FOS, JUNB, RPS27A, NFKBIA, ZFP36, RPS15A, RPL37, PTMA, RPS19, RPL34, RPL11, RPS8, RPL32, MT-CYB, RPL3, RPL19, YPEL5, DUSP1, FTH1, RPS28, TMSB10, DNAJA1, RPS13, HSPE1, KLF2 is indicative of gamma-delta T cell differentiation.',
'This profile resembles gamma-delta T cell cells, based on genes like MALAT1, TMSB4X, ACTB, MT-CO1, RPLP1, RPS27, RPS12, RPL10, RPL13, NKG7, RPL28, RPS15A, RPS14, CD74, RPS3, RPL12, MT-CYB, RPS27A, RPS19, RPS8, RPL32, RPL30, RPLP2, RPL19, FOS, JUN, PTMA, RPS28, PFN1, RPL34, GAPDH, FAU, MT-ATP6, TMSB10, RPL18, HSP90AA1, COTL1, RPS15, SH3BGRL3, RPL11, RPS23, RPL23A, DNAJB1, RPS24, RPL13A, RPL26, DUSP1, RPL36, H3-3B, RPS6.',
'The pattern MALAT1, RPL10, RPS27, RPL32, RPL34, RPS15A, RPL30, RPS28, RPS4X, RPS12, RPS19, RPLP1, RPL13, RPL19, RPL11, RPS14, MT-ND3, DNAJB1, RPS13, RPL26, TMSB4X, RPL3, MT-CYB, RPLP2, RPL14, ACTB, RPL8, MT-ND1, RPS15, RPL12, RPL18, RPL28, RPL36, TPT1, RPS6, RPS8, MT-CO1, RPS27A, RPS7, TMSB10, RPL37, PTMA, MT-ND2, RPS24, RPS3, FAU, RPS23, ARHGDIB, HSPE1, ZFP36L2 is indicative of regulatory T cell differentiation.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
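For semantic search, the similarity scores are typically used to rank candidate descriptions against a query. The sketch below shows that ranking step in plain Python on hypothetical 3-dimensional vectors. In practice the vectors would be rows returned by `model.encode(...)`, and the helper names `cosine` and `rank_by_similarity` are illustrative assumptions, not part of the library.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_by_similarity(query: Sequence[float], candidates: List[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine(query, candidate) for candidate in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Hypothetical toy vectors standing in for 768-dimensional embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_by_similarity(query, candidates)
print(ranking)
```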
Evaluation
Metrics
Triplet
- Dataset: train_celltype_scrna_MNR
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 1.0 |
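As a rough guide to reading this metric: cosine accuracy on triplets is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below illustrates that definition on hypothetical 2-dimensional vectors. It assumes no margin and is not the evaluator's actual code.

```python
import math
from typing import List, Sequence, Tuple

Triplet = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def cosine_accuracy(triplets: List[Triplet]) -> float:
    """Share of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    hits = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return hits / len(triplets)


# Hypothetical embeddings: (anchor, positive, negative).
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: counted
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer: counted
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer: not counted
    ([1.0, 0.0], [1.0, 0.2], [0.0, 1.0]),   # positive is closer: counted
]
accuracy = cosine_accuracy(triplets)
print(accuracy)
```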
Training Details
Training Dataset
train_MNR_hnm_scrna
- Dataset: train_MNR_hnm_scrna
- Size: 303,605 training samples
- Columns: sentence1, sentence2, and negative
- Approximate statistics based on the first 1000 samples:
Property | sentence1 | sentence2 | negative |
---|---|---|---|
type | string | string | string |
details | min: 160 tokens; mean: 177.17 tokens; max: 207 tokens | min: 160 tokens; mean: 177.56 tokens; max: 208 tokens | min: 160 tokens; mean: 177.05 tokens; max: 201 tokens |
- Samples:
sentence1 | sentence2 | negative |
---|---|---|
Based on the expression of MALAT1, TMSB4X, GNLY, RPS27, NKG7, MT-CO1, RPL10, RPL13A, RPLP2, RPL34, RPS27A, RPLP1, RPS14, RPS28, TMSB10, RPS12, RPL32, RPL13, RPS4X, RPL26, RPL23A, ACTB, RPS19, RPS15A, RPS6, RPL11, RPL37, RPS3, PTMA, RPL3, RPS24, RPL30, RPL28, RPL19, RPS15, RPL36, RPL35, RPS11, RPS8, RPS23, MT-ATP6, RPL18, RPS13, TPT1, GZMA, RPS7, RPL14, RPL37A, RPL12, MT-CYB, this appears to be a natural killer cell cell. | Genes like MALAT1, MT-CO1, TMSB4X, RPLP1, RPL10, NKG7, RPS15A, RPS3, RPL13, RPS19, RPL13A, RPL32, RPLP2, RPS6, RPS8, RPL23A, RPL3, RPL19, RPS14, RPS27, RPL34, RPS15, GNLY, TPT1, RPL35, RPS7, GZMA, RPL18, RPS16, RPL11, EIF1, RPS4X, UBB, RPS23, CMC1, ACTB, RPL28, PFN1, RPS13, RPL37, RPS28, KLRB1, RPS27A, CTSW, RPL14, MT-CYB, MT-ATP6, RPS12, RPL36, RPS11 are hallmarks of natural killer cell cells. | The expression of MALAT1, GNLY, RPL10, TSC22D3, FOS, TPT1, NFKBIA, RPLP1, TNFAIP3, RPS27, RPS12, RPL13, RPS27A, JUND, RPL28, ZFP36, H3-3B, RPS23, MT-CO1, RPS8, RPL30, VIM, PTMA, RPS19, RPS14, EIF1, JUN, RPL34, SRGN, RPS6, RPS28, RPL19, RPL32, RPS3, IER2, RPL3, BTG1, RPS15A, RPL12, MT2A, RPS24, RPL11, RPL8, IL7R, RPS15, CXCR4, RPL36, RPL37, CD44, RPL18 aligns with a CD16-negative, CD56-bright natural killer cell, human identity. |
With genes like MALAT1, FTH1, FTL, SAT1, NEAT1, TMSB4X, MT-CO1, RPS19, TPT1, RPL10, RPL13, RPLP1, RPS27, RPL28, RPS8, RPL34, RPS12, RPS24, EIF1, TMSB10, MT-ND3, MT-ATP6, VIM, NAMPT, RPS13, RPS6, RPL32, RPS23, S100A6, RPS27A, RPL30, MT-CYB, RPS16, RPS15, RPL37, S100A4, RPS28, ACTB, RPS14, RPLP2, RPL12, HSP90AA1, RPL13A, RPL11, RPL8, RPL26, S100A11, RPL37A, SRGN, FAU active, this cell is identified as a non-classical monocyte. | FTL, FTH1, ACTB, MALAT1, TMSB10, TMSB4X, MT-CO1, RPL10, RPL28, RPS19, S100A4, S100A6, CST3, CD74, RPS12, COTL1, RPLP1, SAT1, FOS, RPS8, PFN1, SH3BGRL3, IFITM3, RPL30, TYROBP, RPL11, RPS24, IFITM2, RPS13, RPL32, FCER1G, RPL13, RPL34, RPS14, RPS23, RPL12, RPL19, VIM, RPS28, TPT1, PTMA, DUSP1, RPS27A, FAU, RPL8, S100A11, RPL37, PSAP, RPS4X, RPS15 reflect the unique expression profile of non-classical monocyte cells. | Observed genes (TMSB4X, MALAT1, FTL, FTH1, TMSB10, SPP1, ACTB, CD74, TPT1, RPL10, GAPDH, RPLP1, RPL13, LYZ, RPS12, VIM, RPL28, RPL13A, RPS14, LGALS1, RPS24, RPS19, RPL32, RPL11, CXCL8, RPLP2, S100A4, S100A6, RPS16, RPS6, RPL8, RPS23, CST3, RPL19, RPS27, PTMA, RPL34, SRGN, RPS15A, RPS27A, RPS3, S100A10, RPL12, RPS20, TYROBP, RPS8, RPL30, NFKBIA, RPS15, RPL26) are indicative of classical monocyte cell function. |
Top genes from expression profile: MALAT1, HSP90AA1, RPLP1, RPL13, RPL10, MT-ND2, RPL34, RPS6, RPL13A, RPS27, RPS19, HSPB1, MT-CYB, RPLP2, RPL32, RPL26, RPS27A, RPS14, RPL14, RPS28, MT-ND3, RPL3, RPS15, RPL8, RPS8, UBB, RPL36, RPL11, MT-ATP6, HSP90AB1, RPS12, RPL37A, TPT1, RPS15A, RPL12, MT-ND1, ACTB, RPS4X, RPL5, TM4SF1, RPL23A, RPL28, RPL30, RPS7, RPL18, RPS24, HSPH1, RPS3, RPS13, RPS20. | Typical epithelial cell of urethra markers such as MALAT1, MT2A, MT1X, RPLP1, MT-CO1, RPS6, RPL34, TM4SF1, RPS19, RPL10, RPL13A, RPS8, RPS27, RPS28, RPL13, RPL11, MT-CYB, RPS4X, RPL3, MT-ND2, RPS27A, MT1M, RPS12, S100A11, ACTB, RPL36, RPL32, RPL28, RPL19, RPS14, RPL37A, RPS15A, FTH1, MT-ND3, RPS15, RPL12, RPL26, S100A6, RPS3, RPL8, RPS24, RPS13, RPL35, RPS23, ANXA2, NEAT1, MT-ATP6, RPL23A, ANXA1, RPLP2 are present in this cell. | The pattern MALAT1, RORA, CELF2, GFAP, PHACTR1, TTC3, MT-CYB, CACNA2D3, FAT3, AMZ2, MT-CO1, MT-ND3, ATP2B1, TTC28, TRPS1, DCLK1, PDZD2, ENTREP1, DAAM2, NRG1, SIPA1L1, LINC00609, FYN, CD44, MAP4, NLK, TSC22D1, GLIS3, RASD1, PDE10A, FGF1, AAK1, HIVEP3, HP1BP3, VPS13B, ITPKB, MTDH, EPS8, CAMK4, GRAMD2B, RPL8, NLGN4Y, GPRC5B, TRAPPC9, TANC2, BCL2, ARPP21, BNC2, CHD2, ZHX3 is indicative of astrocyte differentiation. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
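To make the loss parameters concrete: with `scale` 20.0 and cosine similarity, each anchor (sentence1) is scored against every positive (sentence2) and every hard negative in the batch, and a cross-entropy loss rewards the matching positive. The following is a simplified pure-Python sketch of that idea on hypothetical 2-dimensional vectors. The function name `mnr_loss` and the toy batch are assumptions for illustration, and the library's batched tensor implementation may differ in detail.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[Vector], positives: List[Vector],
             negatives: List[Vector], scale: float = 20.0) -> float:
    """Mean cross-entropy over scaled cosine scores; anchor i should pick positive i."""
    candidates = positives + negatives  # in-batch positives followed by hard negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the maximum for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Hypothetical toy batch of two triplets.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]   # positives paired with the wrong anchor
hard_negatives = [[-1.0, 0.0], [0.0, -1.0]]

low = mnr_loss(anchors, aligned, hard_negatives)
high = mnr_loss(anchors, swapped, hard_negatives)
print(low, high)
```

When each positive lines up with its anchor the loss is close to zero, and when the pairs are swapped it is close to the scale value, which shows how the scale sharpens the penalty for ranking the wrong candidate first.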
Evaluation Dataset
Unnamed Dataset
- Size: 33,734 evaluation samples
- Columns: sentence1, sentence2, and negative
- Approximate statistics based on the first 1000 samples:
Property | sentence1 | sentence2 | negative |
---|---|---|---|
type | string | string | string |
details | min: 157 tokens; mean: 176.98 tokens; max: 200 tokens | min: 158 tokens; mean: 177.0 tokens; max: 201 tokens | min: 157 tokens; mean: 177.29 tokens; max: 201 tokens |
- Samples:
sentence1 | sentence2 | negative |
---|---|---|
A transcriptome with MALAT1, RPL13, RPS8, RPLP1, RPL10, RPS14, RPL32, RPS3, RPS4X, TPT1, RPS12, GAPDH, MT-CO1, RPL3, RPS19, ACTB, RPL11, PTMA, RPS6, RPL5, RPL19, RPL8, RPS7, RPL30, RPS23, RPL18, RPL28, RPS24, RPS15, RPS27A, RPS15A, RPL12, RPL34, RPS13, RPL37, RPL14, VIM, TMSB10, TUBA1B, RPL13A, RPL23A, RPS27, RPS16, RPL36, RPL35, NACA, RPS28, HSP90AB1, RPL37A, RPLP2 points toward mesothelial cell identity. | The combination of MT-CO1, RPL10, RPS8, RPS12, MALAT1, RPLP1, RPL13, RPL32, RPS14, RPL34, RPS23, RPS3, RPS27A, RPL11, RPL5, RPL28, RPL30, RPS24, TPT1, MT-CYB, RPS4X, ACTB, RPS19, RPS15A, RPS7, RPL12, RPL37, RPL8, RPL18, RPS13, RPL19, RPS27, VIM, PTMA, MT-ATP6, MT-ND3, RPS15, RPS6, RPL14, RPL3, RPS28, RPL36, GAPDH, NACA, RPL37A, FTH1, TMSB4X, TMSB10, RPLP2, H3-3B is characteristic for mesothelial cell cells. | MALAT1, RPLP1, RPL13A, RPL10, PTMA, RPS8, RPS19, RPL13, RPS27, TMSB4X, TMSB10, RPS27A, RPS3, RPS23, ACTB, RPS15A, RPS12, MT-CO1, RPL3, RPL32, RPS14, RPL37A, RPL34, RPS24, RPS6, FTH1, RPS28, LGALS1, RPL23A, H4C3, RPS16, RPL11, RPS7, RPLP2, RPS15, RPL30, RPL28, RPL19, TPT1, RPL37, VIM, RPS4X, RPS20, RPL35, RPL26, RPS13, H3-3B, RPS11, RPL18, GAPDH expression places this cell in the mesodermal cell category. |
Consistent with memory B cell function, genes like MALAT1, CD74, RPS27, RPLP1, RPL32, RPL13, RPS8, RPS12, RPS28, RPS14, RPL37, RPS15A, RPS23, RPS19, RPL34, RPL10, RPS13, RPS3, RPS15, RPL11, RPL30, RPL28, RPL19, RPS27A, ACTB, RPS6, RPS11, RPLP2, RPS4X, TPT1, RPL37A, RPL12, RPL8, RPL18, RPL3, RPL36, RPL35, FAU, RPL13A, RPL23A, RPS7, RPL14, RPS24, TMSB4X, TXNIP, FTL, RPS16, RPL26, TMSB10, SMCHD1 are expressed. | The active transcriptional program includes: TMSB4X, CD74, ACTB, RPLP1, RPL28, RPL10, RPS27, RPS12, RPS6, RPS19, RPS24, RPS15A, RPL37, RPL19, RPL32, RPS8, RPL11, RPL13, GAPDH, RPS27A, RPS3, RPS15, RPL36, RPS14, RPS13, RPS23, RPL23A, RPL18, RPL34, RPL30, RPS7, PFN1, RPS28, RPL8, RPS4X, TPT1, FAU, RPL37A, PTMA, RPL5, RPL12, RPL35, RPLP2, FTH1, RPL3, RPL26, TMSB10, ARHGDIB, NACA, RPS16. | This cell shows significant expression of: RPL13A, RPL10, MALAT1, RPL13, RPL3, RPLP1, RPS3, RPS6, RPS12, RPS23, RPS14, RPS27A, RPS19, MT-CO1, RPS8, RPL32, RPL12, RPL8, RPS24, RPS15A, RPL11, RPL26, RPL28, RPL19, RPL34, FTL, RPS7, RPS15, RPL5, RPS4X, RPL37A, RPLP2, RPL14, RPL23A, PTMA, RPS28, RPS27, TPT1, RPL35, MT-ND1, RPL18, RPS16, RPL30, TMSB10, GAPDH, RPS20, RPL36, S100A10, RPS13, RPL37. |
This cell likely originates from the CD8-positive, alpha-beta T cell family, based on expression of MALAT1, TMSB4X, RPS27, RPLP1, RPL13A, RPL10, RPL13, RPL28, RPS12, RPS15A, RPLP2, RPS19, RPS27A, RPL23A, ACTB, TPT1, RPS3, RPS14, RPL34, FTL, RPL19, BTG1, RPS6, RPL32, RPL30, RPL26, CXCR4, RPS24, RPL11, RPS20, TMSB10, RPL3, RPS16, RPS15, RPS23, PTMA, IL32, RPS8, RPL37, EIF1, RPL12, RPS4X, RPS28, FAU, FTH1, RPS7, RPL14, RPL37A, RPL18, RGS1. | This cell fits the molecular signature of CD8-positive, alpha-beta T cell, expressing MALAT1, TMSB4X, RPS27, RPL10, FTH1, RPS12, RPL13, RPLP1, TPT1, RPS27A, RPL30, RPS19, RPL28, RPS15A, RPS28, RPL34, RPL32, RPS3, H3-3B, RPL36, RPL11, RPS23, RPS4X, RPL37, RPS14, PTMA, RPL19, RPL26, H1-10, RPL14, RPL18, FAU, RPL3, RPS13, RPS15, RPS24, RPS8, EIF1, RPL12, BTG1, RPL35, RPL8, RPS7, RPL23A, MT-CO1, RPLP2, PABPC1, FTL, GAPDH, GNLY. | Based on the expression of RPLP1, RPL10, RPS27A, RPS12, RPS19, RPL32, ACTB, RPL13, MALAT1, MT-CO1, FTH1, RPS27, RPS6, RPS24, RPS23, RPS8, RPL12, RPS3, IL32, RPL30, RPS4X, RPL19, TPT1, PTMA, RPS28, RPL34, GAPDH, RPS14, HSP90AB1, RPL28, MT-ATP6, RPS13, RPL36, RPLP2, RPL18, FTL, RPS15A, RPL8, RPL14, RPL37A, RPL37, MT-CYB, MT-ND1, RPL5, TMSB4X, RPS15, EIF1, RPL3, RPS16, RPS7, this appears to be a activated CD8-positive, alpha-beta T cell cell. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 128
- num_train_epochs: 5
- warmup_steps: 1000
- fp16: True
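These settings, together with the training set size of 303,605 samples, determine how many optimizer steps the run takes. The short calculation below is a sketch under stated assumptions: a single training device (not stated in this card), no gradient accumulation, and a final partial batch that is kept rather than dropped. Its results agree with the Epoch and Step columns of the Training Logs, where step 1000 falls at epoch 0.4216.

```python
import math

train_samples = 303_605   # size of train_MNR_hnm_scrna, from this card
batch_size = 128          # per_device_train_batch_size
epochs = 5                # num_train_epochs
warmup_steps = 1000       # warmup_steps
num_devices = 1           # assumption: the card does not state the device count

# The last partial batch is kept, so the division is rounded up.
steps_per_epoch = math.ceil(train_samples / (batch_size * num_devices))
total_steps = steps_per_epoch * epochs
warmup_epochs = warmup_steps / steps_per_epoch

print(steps_per_epoch, total_steps, round(warmup_epochs, 4))
```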
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 1000
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
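The combination of `learning_rate` 5e-05, `lr_scheduler_type` linear, and `warmup_steps` 1000 implies a learning rate that rises linearly for the first 1000 steps and then decays linearly to zero. The function below is a sketch of that schedule shape, not the trainer's code. The name `linear_warmup_lr` is hypothetical, and the total of 11,860 steps is an assumption derived from 303,605 samples, batch size 128, 5 epochs, and a single device.

```python
def linear_warmup_lr(step: int, base_lr: float = 5e-05,
                     warmup_steps: int = 1000, total_steps: int = 11860) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Learning rate at a few points of the (assumed) 11,860-step run.
for step in (0, 500, 1000, 6430, 11860):
    print(step, linear_warmup_lr(step))
```

Step 6430 is the midpoint of the decay phase, so the rate there is half of the peak value.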
Training Logs
Epoch | Step | Training Loss | Validation Loss | train_celltype_scrna_MNR_cosine_accuracy |
---|---|---|---|---|
-1 | -1 | - | - | 0.9630 |
0.0422 | 100 | 3.9546 | - | - |
0.0843 | 200 | 3.1346 | - | - |
0.1265 | 300 | 2.8727 | - | - |
0.1686 | 400 | 2.7226 | - | - |
0.2108 | 500 | 2.6064 | - | - |
0.2530 | 600 | 2.5365 | - | - |
0.2951 | 700 | 2.4867 | - | - |
0.3373 | 800 | 2.4548 | - | - |
0.3794 | 900 | 2.3967 | - | - |
0.4216 | 1000 | 2.3868 | 0.5279 | 0.9870 |
0.4637 | 1100 | 2.379 | - | - |
0.5059 | 1200 | 2.3403 | - | - |
0.5481 | 1300 | 2.2755 | - | - |
0.5902 | 1400 | 2.2746 | - | - |
0.6324 | 1500 | 2.2424 | - | - |
0.6745 | 1600 | 2.2395 | - | - |
0.7167 | 1700 | 2.1945 | - | - |
0.7589 | 1800 | 2.1921 | - | - |
0.8010 | 1900 | 2.1667 | - | - |
0.8432 | 2000 | 2.1342 | 0.4337 | 0.9920 |
0.8853 | 2100 | 2.1578 | - | - |
0.9275 | 2200 | 2.1644 | - | - |
0.9696 | 2300 | 2.1519 | - | - |
1.0118 | 2400 | 2.1336 | - | - |
1.0540 | 2500 | 2.0602 | - | - |
1.0961 | 2600 | 2.06 | - | - |
1.1383 | 2700 | 2.0825 | - | - |
1.1804 | 2800 | 2.0668 | - | - |
1.2226 | 2900 | 2.0508 | - | - |
1.2648 | 3000 | 2.0198 | 0.3937 | 0.9960 |
1.3069 | 3100 | 2.0512 | - | - |
1.3491 | 3200 | 2.0265 | - | - |
1.3912 | 3300 | 2.02 | - | - |
1.4334 | 3400 | 1.9946 | - | - |
1.4755 | 3500 | 1.9963 | - | - |
1.5177 | 3600 | 1.9733 | - | - |
1.5599 | 3700 | 1.9667 | - | - |
1.6020 | 3800 | 1.9495 | - | - |
1.6442 | 3900 | 2.0596 | - | - |
1.6863 | 4000 | 1.991 | 0.3658 | 0.9970 |
1.7285 | 4100 | 1.9537 | - | - |
1.7707 | 4200 | 2.0264 | - | - |
1.8128 | 4300 | 2.0761 | - | - |
1.8550 | 4400 | 2.0179 | - | - |
1.8971 | 4500 | 2.0278 | - | - |
1.9393 | 4600 | 1.941 | - | - |
1.9815 | 4700 | 1.9431 | - | - |
2.0236 | 4800 | 1.9163 | - | - |
2.0658 | 4900 | 1.9238 | - | - |
2.1079 | 5000 | 1.8818 | 0.3461 | 1.0 |
2.1501 | 5100 | 1.866 | - | - |
2.1922 | 5200 | 1.8703 | - | - |
2.2344 | 5300 | 1.8705 | - | - |
2.2766 | 5400 | 1.858 | - | - |
2.3187 | 5500 | 1.8673 | - | - |
2.3609 | 5600 | 1.8582 | - | - |
2.4030 | 5700 | 1.8406 | - | - |
2.4452 | 5800 | 1.8394 | - | - |
2.4874 | 5900 | 1.8454 | - | - |
2.5295 | 6000 | 1.8401 | 0.3268 | 1.0 |
2.5717 | 6100 | 1.8322 | - | - |
2.6138 | 6200 | 1.8152 | - | - |
2.6560 | 6300 | 1.8198 | - | - |
2.6981 | 6400 | 1.8054 | - | - |
2.7403 | 6500 | 1.8043 | - | - |
2.7825 | 6600 | 1.8131 | - | - |
2.8246 | 6700 | 1.7786 | - | - |
2.8668 | 6800 | 1.7794 | - | - |
2.9089 | 6900 | 1.7992 | - | - |
2.9511 | 7000 | 1.7727 | 0.3135 | 1.0 |
2.9933 | 7100 | 1.8016 | - | - |
3.0354 | 7200 | 1.7505 | - | - |
3.0776 | 7300 | 1.7502 | - | - |
3.1197 | 7400 | 1.7718 | - | - |
3.1619 | 7500 | 1.7549 | - | - |
3.2040 | 7600 | 1.7349 | - | - |
3.2462 | 7700 | 1.7402 | - | - |
3.2884 | 7800 | 1.7415 | - | - |
3.3305 | 7900 | 1.7245 | - | - |
3.3727 | 8000 | 1.7306 | 0.3080 | 1.0 |
3.4148 | 8100 | 1.7204 | - | - |
3.4570 | 8200 | 1.7289 | - | - |
3.4992 | 8300 | 1.7305 | - | - |
3.5413 | 8400 | 1.7152 | - | - |
3.5835 | 8500 | 1.7294 | - | - |
3.6256 | 8600 | 1.7059 | - | - |
3.6678 | 8700 | 1.7249 | - | - |
3.7099 | 8800 | 1.6946 | - | - |
3.7521 | 8900 | 1.7373 | - | - |
3.7943 | 9000 | 1.7173 | 0.3033 | 1.0 |
3.8364 | 9100 | 1.7175 | - | - |
3.8786 | 9200 | 1.7084 | - | - |
3.9207 | 9300 | 1.7009 | - | - |
3.9629 | 9400 | 1.6909 | - | - |
4.0051 | 9500 | 1.7024 | - | - |
4.0472 | 9600 | 1.6897 | - | - |
4.0894 | 9700 | 1.6764 | - | - |
4.1315 | 9800 | 1.6893 | - | - |
4.1737 | 9900 | 1.7053 | - | - |
4.2159 | 10000 | 1.6724 | 0.3016 | 1.0 |
4.2580 | 10100 | 1.6864 | - | - |
4.3002 | 10200 | 1.6927 | - | - |
4.3423 | 10300 | 1.6982 | - | - |
4.3845 | 10400 | 1.6659 | - | - |
4.4266 | 10500 | 1.6673 | - | - |
4.4688 | 10600 | 1.6718 | - | - |
4.5110 | 10700 | 1.671 | - | - |
4.5531 | 10800 | 1.6891 | - | - |
4.5953 | 10900 | 1.6826 | - | - |
4.6374 | 11000 | 1.6792 | 0.3007 | 1.0 |
4.6796 | 11100 | 1.6586 | - | - |
4.7218 | 11200 | 1.6819 | - | - |
4.7639 | 11300 | 1.6717 | - | - |
4.8061 | 11400 | 1.6905 | - | - |
4.8482 | 11500 | 1.6601 | - | - |
4.8904 | 11600 | 1.6799 | - | - |
4.9325 | 11700 | 1.6712 | - | - |
4.9747 | 11800 | 1.6567 | - | - |
Framework Versions
- Python: 3.11.2
- Sentence Transformers: 4.0.2
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.4.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}