SentenceTransformer based on NeuML/pubmedbert-base-embeddings

This is a sentence-transformers model finetuned from NeuML/pubmedbert-base-embeddings on the geo_70k_multiplets_natural_language_annotation dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: NeuML/pubmedbert-base-embeddings
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: geo_70k_multiplets_natural_language_annotation

Full Model Architecture

SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30522, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (pooling): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jo-mengr/mmcontext-pubmedbert-70k")
# Run inference
sentences = [
    'TMSB4X GAPDH LYZ B2M IFI30 ARHGDIB ENSG00000283907 MALAT1 SRGN PSAP LCP1 CXCL8 UBC SH3BGRL3 TUBA1B CTSC S100A4 MYL6 S100A11 ANXA1 CAPG ENSG00000203396 CCL2 CD68 H3-3B MYL12A RAC2 PPIA GLUL TXN SERPINB1 ENSG00000225840 TUBB SLC25A5 HMGB2 S100A6 GSTP1 H2AZ1 NCF2 HSPA5 GPI GRN HSPA8 LAMP2 C6orf62 CAT STMN1 HNRNPH1 KPNA2 LTA4H HLA-B TPP1 CD44 MSRB1 NPC2 ATP6V0D1 H4C3 LITAF SLC2A3 DDX3X AP2M1 PSMB3 CD24 ATP5F1E',
    'This measurement was conducted with Illumina HiSeq 2500. CFU-G/M U2AF1 wild-type 2 cells, which are granulomonocytic progenitor cells that have been transduced with U2AF1 wild-type. These cells are commonly used in studies of common myeloid progenitors, which are CD34-positive cultured cells.',
    'This measurement was conducted with Illumina HiSeq 2500. UM-UC18 bladder cancer cell line, a type of urinary bladder cancer cell line, cultured for study of bladder disease, cancer cell proliferation, and neoplasm.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.1404, 0.1000],
#         [0.1404, 1.0000, 0.4225],
#         [0.1000, 0.4225, 1.0000]])
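
The same embeddings can also drive semantic search over a corpus of sample annotations. Below is a minimal sketch using util.semantic_search from Sentence Transformers; the corpus and query strings are illustrative, not taken from the training data.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jo-mengr/mmcontext-pubmedbert-70k")

# Illustrative corpus of annotation strings (hypothetical examples)
corpus = [
    "This measurement was conducted with Illumina HiSeq 2500. UM-UC18 bladder cancer cell line.",
    "This measurement was conducted with Illumina HiSeq 2000. BJ fibroblast cells in a proliferative stage.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode(["bladder cancer cell line"], convert_to_tensor=True)

# Returns one ranked hit list per query; each hit is {"corpus_id": ..., "score": ...}
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(round(hit["score"], 4), corpus[hit["corpus_id"]])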

Evaluation

Metrics

Triplet

  • Dataset: geo_70k_multiplets_natural_language_annotation_cell_sentence_2
  • Evaluated with TripletEvaluator
Metric           Value
cosine_accuracy  0.6657
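
The score above comes from the TripletEvaluator in Sentence Transformers. A minimal sketch of how such an evaluation can be reproduced is shown below; the dataset repository id and split name are assumptions and may need adjusting.

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("jo-mengr/mmcontext-pubmedbert-70k")

# Assumed Hub location and split name for the evaluation data
eval_ds = load_dataset("jo-mengr/geo_70k_multiplets_natural_language_annotation", split="validation")

evaluator = TripletEvaluator(
    anchors=eval_ds["anchor"],
    positives=eval_ds["positive"],
    negatives=eval_ds["negative_1"],
    name="geo_70k_multiplets_natural_language_annotation_cell_sentence_2",
)
print(evaluator(model))  # reports cosine_accuracy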

Training Details

Training Dataset

geo_70k_multiplets_natural_language_annotation

  • Dataset: geo_70k_multiplets_natural_language_annotation at 4c62cd1
  • Size: 61,911 training samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:

             anchor                    positive                  negative_1                 negative_2
    type     string                    string                    string                     string
    details  min: 354 characters       min: 83 characters        min: 100 characters        min: 358 characters
             mean: 392.12 characters   mean: 189.5 characters    mean: 165.46 characters    mean: 435.31 characters
             max: 601 characters       max: 698 characters       max: 465 characters        max: 456 characters
  • Samples:
    Sample 1
    anchor: FTH1 FLNA TGFBI SLC7A5 KRT17 ENSG00000283907 COL5A1 FN1 ANXA2 KRT8 PLEC CLU AHNAK SCD TUBA1B ENSG00000225840 HSPA8 KRT7 CD59 FBN1 NEAT1 VIM PFN1 MALAT1 FLNB PRPF8 B2M FTL TMSB4X HSPG2 FASN BASP1 SLC3A2 ACTN4 PXDN SLC38A2 ACLY TUBB THBS1 LDHB SLC38A1 TPM2 ITGA5 PLS3 APP TIMP2 COL3A1 TXNRD1 LMNA ASPH CSRP1 COL12A1 MUC16 ACTN4 LDHA LDHA CPS1 TPM1 CANX COL4A2 PDE3A CANX HIF1A TALAM1
    positive: This measurement was conducted with Illumina HiSeq 2000. 5-day HeLa cell line with ELAVL1/HuR siRNA1 knockdown, 120 hours post-transfection.
    negative_1: This measurement was conducted with Illumina HiSeq 2000. BJ fibroblast cells in a proliferative stage, with polyA RNA subtype.
    negative_2: LINC03031 SINHCAFP3 COL1A1 COL6A3 ENSG00000261526 MYL9 COL1A2 MLLT3 CALD1 IGFBP5 ANK3 BTBD9 CCDC80 TPM1 TPM2 ENSG00000233388 MMP14 TAGLN THBS1 HSPA1B SKIL NEDD4L MARCKS MALAT1 ACTN4 ACTN1 PLTP MMP2 PXDN RAB11FIP1 CD63 IGFBP7 TUBB BLM FN1 PTMS MAP7-AS1 MYLK NCALD SPINT2 LMNA FTH1 ENSG00000267469 PRRG3 FSTL1 RNF187 MGLL CAVIN1 COL5A1 COL3A1 FGF23 ENSG00000226824 SPARC ASPH SLC16A3 LPAR4 ENSG00000269888 COL4A2 RRBP1 SH3PXD2A COL4A1 LBH SOX4 LDLR

    Sample 2
    anchor: LINC03031 SINHCAFP3 ENSG00000261526 COL6A3 COL1A1 TPM1 MLLT3 MYL9 BTBD9 ENSG00000233388 COL1A2 CALD1 TPM2 TUBB HSPA1B ANK3 TAGLN IGFBP5 NEDD4L PRR11 ACTN4 SKIL MALAT1 FN1 PTMS MMP14 THBS1 ACTN1 NCALD PXDN CCDC80 MARCKS SPINT2 PRRG3 VIM RAB3B TUBB MAP7-AS1 SERPINE1 CD63 LMNA ASPH ENSG00000226824 RAB11FIP1 SLC16A3 CAVIN1 COL4A1 LPAR4 TOP2A FGF23 BLM COL4A2 MGLL NR6A1 FSTL1 MMP2 ENSG00000269888 ENSG00000267484 LOXL2 PLTP MYLK KIAA0825 ENSG00000267469 FTH1
    positive: This measurement was conducted with Illumina HiSeq 2000. BJ fibroblast cells in a proliferative stage, with polyA RNA subtype.
    negative_1: This measurement was conducted with Illumina HiSeq 2000. 5-day HeLa cell line with ELAVL1/HuR siRNA1 knockdown, 120 hours post-transfection.
    negative_2: LINC03031 SINHCAFP3 COL1A1 COL6A3 ENSG00000261526 MYL9 COL1A2 MLLT3 CALD1 IGFBP5 ANK3 BTBD9 CCDC80 TPM1 TPM2 ENSG00000233388 MMP14 TAGLN THBS1 HSPA1B SKIL NEDD4L MARCKS MALAT1 ACTN4 ACTN1 PLTP MMP2 PXDN RAB11FIP1 CD63 IGFBP7 TUBB BLM FN1 PTMS MAP7-AS1 MYLK NCALD SPINT2 LMNA FTH1 ENSG00000267469 PRRG3 FSTL1 RNF187 MGLL CAVIN1 COL5A1 COL3A1 FGF23 ENSG00000226824 SPARC ASPH SLC16A3 LPAR4 ENSG00000269888 COL4A2 RRBP1 SH3PXD2A COL4A1 LBH SOX4 LDLR

    Sample 3
    anchor: LINC03031 SINHCAFP3 COL1A1 COL6A3 ENSG00000261526 MYL9 COL1A2 MLLT3 CALD1 IGFBP5 ANK3 BTBD9 CCDC80 TPM1 TPM2 ENSG00000233388 MMP14 TAGLN THBS1 HSPA1B SKIL NEDD4L MARCKS MALAT1 ACTN4 ACTN1 PLTP MMP2 PXDN RAB11FIP1 CD63 IGFBP7 TUBB BLM FN1 PTMS MAP7-AS1 MYLK NCALD SPINT2 LMNA FTH1 ENSG00000267469 PRRG3 FSTL1 RNF187 MGLL CAVIN1 COL5A1 COL3A1 FGF23 ENSG00000226824 SPARC ASPH SLC16A3 LPAR4 ENSG00000269888 COL4A2 RRBP1 SH3PXD2A COL4A1 LBH SOX4 LDLR
    positive: This measurement was conducted with Illumina HiSeq 2000. BJ fibroblast cells at a confluent growth stage, with polyA RNA subtype.
    negative_1: This measurement was conducted with Illumina HiSeq 2000. 5-day HeLa cell line with ELAVL1/HuR siRNA1 knockdown, 120 hours post-transfection.
    negative_2: LINC03031 SINHCAFP3 ENSG00000261526 COL6A3 COL1A1 TPM1 MLLT3 MYL9 BTBD9 ENSG00000233388 COL1A2 CALD1 TPM2 TUBB HSPA1B ANK3 TAGLN IGFBP5 NEDD4L PRR11 ACTN4 SKIL MALAT1 FN1 PTMS MMP14 THBS1 ACTN1 NCALD PXDN CCDC80 MARCKS SPINT2 PRRG3 VIM RAB3B TUBB MAP7-AS1 SERPINE1 CD63 LMNA ASPH ENSG00000226824 RAB11FIP1 SLC16A3 CAVIN1 COL4A1 LPAR4 TOP2A FGF23 BLM COL4A2 MGLL NR6A1 FSTL1 MMP2 ENSG00000269888 ENSG00000267484 LOXL2 PLTP MYLK KIAA0825 ENSG00000267469 FTH1
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
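
For reference, the loss above corresponds to the following construction in Sentence Transformers (a sketch; model is the SentenceTransformer loaded earlier):

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("jo-mengr/mmcontext-pubmedbert-70k")

# MultipleNegativesRankingLoss with the parameters listed above:
# scale=20.0 and cosine similarity as the scoring function
loss = losses.MultipleNegativesRankingLoss(
    model=model,
    scale=20.0,
    similarity_fct=util.cos_sim,
)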
    

Evaluation Dataset

geo_70k_multiplets_natural_language_annotation

  • Dataset: geo_70k_multiplets_natural_language_annotation at 4c62cd1
  • Size: 6,901 evaluation samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:

             anchor                    positive                  negative_1                 negative_2
    type     string                    string                    string                     string
    details  min: 351 characters       min: 78 characters        min: 90 characters         min: 364 characters
             mean: 395.91 characters   mean: 191.46 characters   mean: 217.63 characters    mean: 385.93 characters
             max: 472 characters       max: 983 characters       max: 702 characters        max: 427 characters
  • Samples:
    Sample 1
    anchor: IGF2 FN1 TF A2M C3 GAPDH AGT SCD LYZ SELENOP TUBB CP FTH1 RELN DSP SPTBN1 TMEM123 UBC LDHA LDHA TUBA1B MAP4K4 APLP2 PDIA6 HSPA5 H1-0 CCN2 SERPINF1 RRBP1 HNRNPH1 PPIA DHCR24 HSP90AA1 ACSL4 SLC38A2 APOE COL27A1 EIF3C BSG TOP2A RBM39 TGOLN2 ERBB3 CLU PNN MBNL3 GLUD1 LGALS3BP AHNAK PSAP CANX SNRNP200 NEAT1 CTSB ENSG00000283907 FAT1 PLOD2 TFPI NCOA4 LAMA5 FARP1 SLC2A1 ILF3 CALM3
    positive: This measurement was conducted with Illumina HiSeq 2000. 15-year-old male HepG2 immortalized cell line with hepatocellular carcinoma, transiently expressing shRNA targeting PKM2 for RNA-seq study.
    negative_1: This measurement was conducted with Illumina HiSeq 2000. 15-year-old male patient with hepatocellular carcinoma; HNRNPC knocked down via shRNA in HepG2 (immortalized cell line) for RNA-seq analysis.
    negative_2: FTH1 ENSG00000283907 GAPDH MALAT1 TALAM1 HSPA5 HMGA1 UBC PSAP TUBB ENSG00000225840 TUBA1B WARS1 HSPA9 PIM1 AARS1 NEAT1 CANX ASNS VEGFA GARS1 FLNA HSPA8 GLUL SND1 ANP32B TMSB4X SARS1 DDIT4 UBR4 HSP90AA1 ENSG00000258017 TXNIP PPIA EIF3C GSTP1 SCD EPB41 IARS1 GANAB STMN1 MCM7 PRKCB SLC25A6 EPRS1 MARS1 HNRNPF SLC25A5 TUBB COPA AHNAK VAT1 LDHA LDHA XIST HADHA PHGDH LCP1 GTPBP2 LONP1 TARS1 SLC1A5 LDHB SFPQ

    Sample 2
    anchor: B2M TMSB4X HLA-B HSPA8 ETS1 HLA-E SARAF UBC HLA-A ENSG00000283907 HLA-A HSP90AA1 HLA-A LCP1 HLA-C GAPDH TMEM123 ENSG00000237550 PPIA CORO1A ARHGDIB TMSB10 FTH1 CDC42SE2 TXNIP CXCR4 SUN2 STAT1 CD44 ITM2B TUBB HLA-B TUBA1B IL32 LDHB TLE5 PCBP1 MBNL1 RIPOR2 H3-3B JAK1 HLA-B PSAP BTG1 LBH HSPA5 MYL6 HLA-C CANX RAC2 SAMHD1 SLC38A1 DDX3X HLA-C HNRNPF EPB41 SLC25A6 CTDSP2 SLC38A2 MATR3 SH3BGRL3 ESYT1 CCL5 TRAM1
    positive: This measurement was conducted with Illumina HiSeq 2000. 16-year-old female's T cells from a control group, stimulated with ag85 at timepoint 0, and primary cells.
    negative_1: This measurement was conducted with Illumina HiSeq 2000. 17-year-old male's monocytes stimulated with mTb, taken at 180 days post-stimulation, as part of the control group in a study.
    negative_2: CXCL8 B2M ENSG00000225840 HLA-DRA SRGN TXN H3-3B ID2 PLAUR PSAP ANXA2 GSTO1 DNAJB9 LGALS3 ADA UBC GAPDH SAT1 CTSL MYL6 NSMAF PMAIP1 FTH1 CD44 BID TMSB4X ARL8B ATP6V1B2 CD74 NAMPT HLA-E ISG20 CD83 CHCHD2 IFI30 ANXA1 BZW1 CSTB CXCL2 TXNRD1 NR3C1 NINJ1 EIF5 SLC3A2 GLIPR1 CDKN1A NRIP3 SLC7A11 TMED2 G0S2 LITAF LCP1 TMSB10 CREM HSPA8 HLA-B HSPA5 CD59 ENSG00000237550 NCOA4 LGALS1 ENSG00000203396 IFNGR1 ARF4

    Sample 3
    anchor: THBS1 FLNA FTH1 FN1 GAPDH HLA-B TUBB HSP90AA1 PLEC TUBA1B AHNAK HLA-C TLN1 ANXA2 AXL HSPA8 TXNRD1 ACTN1 ACTN4 SLC7A5 TUBB CAVIN1 HUWE1 MPRIP MKI67 MYL6 LASP1 TGFBI TOP2A PPIA B2M LMNA TMSB4X TUBB4B KPNA2 COL6A2 GPRC5A COL5A1 HMGA1 NCAPD2 COL4A2 TMSB10 ENSG00000283907 SPTBN1 LAMC1 HSPA5 VCL ASPM PRPF8 UBR4 SPTAN1 ANLN CENPF CLIC4 COTL1 UBC AKR1B1 CALM3 PLS3 TRIO HIF1A CANX TGFB2 KTN1
    positive: This measurement was conducted with Illumina HiSeq 2500. UM-UC18 bladder cancer cell line, a type of urinary bladder cancer cell line, cultured for study of bladder disease, cancer cell proliferation, and neoplasm.
    negative_1: This measurement was conducted with NextSeq 500. HeLa cells with PARP knockdown treatment.
    negative_2: GAPDH HSPA8 TUBA1B SRGN HSP90AA1 TMSB4X TUBB ENSG00000237550 PPIA LYZ HNRNPH1 EMB LDHB HSPA5 LCP1 ARHGDIB STMN1 CANX MACROH2A1 TUBB ILF3 FLNA UBC GPI MCM4 B2M MAT2A TFRC MBNL1 H2AZ1 MCM7 XRCC5 HSPA9 IMPDH2 HMGA1 SFPQ CAT AHNAK CORO1A PRPF8 BSG MAN2B1 NOP56 SLC25A5 HMGB2 APEX1 H3-3B GPX1 COLGALT1 GANAB TCP1 HNRNPF SLC25A6 CAPRIN1 APLP2 MYC TUBB4B LDHA LDHA KIAA0100 GSTP1 SRSF2 TOP2A MKI67
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 0.1
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • bf16: True
  • gradient_checkpointing: True
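
Expressed in code, these non-default values correspond roughly to the following SentenceTransformerTrainingArguments (a sketch; output_dir is a placeholder and the remaining arguments keep their defaults):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="models/mmcontext-pubmedbert-70k",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    learning_rate=0.1,
    num_train_epochs=4,
    warmup_ratio=0.1,
    bf16=True,
    gradient_checkpointing=True,
)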

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.1
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Validation loss is the geo_70k_multiplets_natural_language_annotation loss; cosine accuracy is geo_70k_multiplets_natural_language_annotation_cell_sentence_2_cosine_accuracy (see Metrics above).

Epoch    Step   Training Loss   Validation Loss   Cosine Accuracy
0.4132   100    19.2596         20.4720           0.6657
0.8264   200    19.2622         20.4720           0.6657
1.2397   300    19.2606         20.4720           0.6657
1.6529   400    19.2546         20.4720           0.6657
2.0661   500    19.2629         20.4720           0.6657
2.4793   600    19.2551         20.4720           0.6657
2.8926   700    19.2647         20.4720           0.6657
3.3058   800    19.2696         20.4720           0.6657
3.7190   900    19.2583         20.4720           0.6657

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 5.0.0
  • Transformers: 4.55.0.dev0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.9.0
  • Datasets: 2.19.1
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}