CrossEncoder based on answerdotai/ModernBERT-base

This is a Cross Encoder model finetuned from answerdotai/ModernBERT-base on the msmarco dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: answerdotai/ModernBERT-base
  • Maximum Sequence Length: 8192 tokens
  • Number of Output Labels: 1 label
  • Training Dataset: msmarco
  • Language: en

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-msmarco-ModernBERT-base-lambdaloss")
# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
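
Because the checkpoint is a standard sequence-classification model under the hood, it can also be scored with plain transformers. A minimal sketch (not from the original card), assuming transformers >= 4.48 for ModernBERT support; note that this yields raw logits, while CrossEncoder.predict may additionally apply an activation such as sigmoid:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "tomaarsen/reranker-msmarco-ModernBERT-base-lambdaloss"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Tokenize (query, passage) pairs jointly; the cross encoder attends over both at once
queries = ["How many calories in an egg"] * 2
passages = [
    "There are on average between 55 and 80 calories in an egg depending on its size.",
    "Most of the calories in an egg come from the yellow yolk in the center.",
]
features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**features).logits.squeeze(-1)  # one relevance logit per pair
print(logits)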

Evaluation

Metrics

Cross Encoder Reranking

  • Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
  • Evaluated with CrossEncoderRerankingEvaluator with these parameters:
    {
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric  | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100
map     | 0.6768 (+0.1872) | 0.3576 (+0.0966)  | 0.7134 (+0.2938)
mrr@10  | 0.6690 (+0.1915) | 0.5819 (+0.0820)  | 0.7402 (+0.3135)
ndcg@10 | 0.7251 (+0.1847) | 0.4143 (+0.0892)  | 0.7594 (+0.2587)
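
To run this evaluator on your own data, each sample pairs a query with its known positives and the candidate documents to rerank. A minimal sketch with a toy sample (not from the original card; the samples format is an assumption based on the evaluator's documented interface):

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderRerankingEvaluator

model = CrossEncoder("tomaarsen/reranker-msmarco-ModernBERT-base-lambdaloss")

samples = [
    {
        "query": "How many calories in an egg",
        "positive": ["There are on average between 55 and 80 calories in an egg depending on its size."],
        "documents": [  # candidate list to rerank, typically the top hits of a first-stage retriever
            "Most of the calories in an egg come from the yellow yolk in the center.",
            "There are on average between 55 and 80 calories in an egg depending on its size.",
            "Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.",
        ],
    },
]
evaluator = CrossEncoderRerankingEvaluator(samples=samples, at_k=10, always_rerank_positives=True)
results = evaluator(model)  # returns map, mrr@10, ndcg@10
print(results)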

Cross Encoder Nano BEIR

  • Dataset: NanoBEIR_R100_mean
  • Evaluated with CrossEncoderNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ],
        "rerank_k": 100,
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric  | Value
map     | 0.5826 (+0.1925)
mrr@10  | 0.6637 (+0.1957)
ndcg@10 | 0.6329 (+0.1776)
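
The NanoBEIR results above can be reproduced with the same evaluator. A minimal sketch (not from the original card); it assumes the evaluator can download the NanoBEIR datasets from the Hugging Face Hub:

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderNanoBEIREvaluator

model = CrossEncoder("tomaarsen/reranker-msmarco-ModernBERT-base-lambdaloss")
evaluator = CrossEncoderNanoBEIREvaluator(
    dataset_names=["msmarco", "nfcorpus", "nq"],
    rerank_k=100,  # rerank the top 100 first-stage retrieval results
    at_k=10,
    always_rerank_positives=True,
)
results = evaluator(model)
print(results)  # per-dataset and mean map / mrr@10 / ndcg@10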

Training Details

Training Dataset

msmarco

  • Dataset: msmarco at a0537b6
  • Size: 399,282 training samples
  • Columns: query_id, doc_ids, and labels
  • Approximate statistics based on the first 1000 samples:
    • query_id: string (min: 6 characters, mean: 33.0 characters, max: 154 characters)
    • doc_ids: list (min: 6 elements, mean: 13.23 elements, max: 20 elements)
    • labels: list (min: 6 elements, mean: 13.23 elements, max: 20 elements)
  • Samples:
    query_id doc_ids labels
    intel current gen core processors ["Identical or more capable versions of Core processors are also sold as Xeon processors for the server and workstation markets. As of 2017 the current lineup of Core processors included the Intel Core i7, Intel Core i5, and Intel Core i3, along with the Y - Series Intel Core CPU's.", "Most noticeably that Panasonic switched from Intel Core 2 Duo power to the latest Intel Core i3 and i5 processors. The three processors available in the new Toughbook 31, together with the new Mobile Intel QM57 Express chipset, are all part of Intel's Calpella platform.", 'The new 7th Gen Intel Core i7-7700HQ processor gives the 14-inch Razer Blade 2.8GHz of quad-core processing power and Turbo Boost speeds, which automatically increases the speed of active cores – up to 3.8GHz.', 'Key difference: Intel Core i3 is a type of dual-core processor. i5 processors have 2 to 4 cores. A dual-core processor is a type of a central processing unit (CPU) that has two complete execution cores. Hence, it has t... [1, 0, 0, 0, 0, ...]
    renovation definition ['Renovation is the act of renewing or restoring something. If your kitchen is undergoing a renovation, there’s probably plaster and paint all over the place and you should probably get take-out.', 'NEW GALLERY SPACES OPENING IN 2017. In early 2017, our fourth floor will be transformed into a new destination for historical education and innovation. During the current renovation, objects from our permanent collection are on view throughout the Museum.', 'A same level house extension in Australia will cost approximately $60,000 to $200,000+. Adding a room or extending your living area on the ground floor are affordable ways of creating more space.Here are some key points to consider that will help you keep your renovation costs in check.RTICLE Stephanie Matheson. A same level house extension in Australia will cost approximately $60,000 to $200,000+. Adding a room or extending your living area on the ground floor are affordable ways of creating more space. Here are some key points... [1, 0, 0, 0, 0, ...]
    what is a girasol ['Girasol definition, an opal that reflects light in a bright luminous glow. See more.', 'Also, a type of opal from Mexico, referred to as Mexican water opal, is a colorless opal which exhibits either a bluish or golden internal sheen. Girasol opal is a term sometimes mistakenly and improperly used to refer to fire opals, as well as a type of transparent to semitransparent type milky quartz from Madagascar which displays an asterism, or star effect, when cut properly.', 'What is the meaning of Girasol? How popular is the baby name Girasol? Learn the origin and popularity plus how to pronounce Girasol', 'There are 5 basic types of opal. These types are Peruvian Opal, Fire Opal, Girasol Opal, Common opal and Precious Opal. There are 5 basic types of opal. These types are Peruvian Opal, Fire Opal, Girasol Opal, Common opal and Precious Opal.', 'girasol (ˈdʒɪrəˌsɒl; -ˌsəʊl) , girosol or girasole n (Jewellery) a type of opal that has a red or pink glow in br... [1, 0, 0, 0, 0, ...]
  • Loss: LambdaLoss with these parameters:
    {
        "weighting_scheme": "sentence_transformers.cross_encoder.losses.LambdaLoss.NDCGLoss2PPScheme",
        "k": null,
        "sigma": 1.0,
        "eps": 1e-10,
        "reduction_log": "binary",
        "activation_fct": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 8
    }
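
Concretely, this configuration corresponds to constructing the loss roughly as follows. A minimal sketch (not part of the original card), with the import path taken from the weighting_scheme string above:

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.losses.LambdaLoss import LambdaLoss, NDCGLoss2PPScheme

# num_labels=1 so the model emits a single relevance logit per (query, document) pair
model = CrossEncoder("answerdotai/ModernBERT-base", num_labels=1)
loss = LambdaLoss(
    model=model,
    weighting_scheme=NDCGLoss2PPScheme(),  # NDCG-Loss2++ weighting from the LambdaLoss paper
    k=None,                 # score the full ranked list (no @k truncation)
    sigma=1.0,
    eps=1e-10,
    reduction_log="binary",
    mini_batch_size=8,      # pairs scored per forward pass inside each listwise sample
)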
    

Evaluation Dataset

msmarco

  • Dataset: msmarco at a0537b6
  • Size: 1,000 evaluation samples
  • Columns: query_id, doc_ids, and labels
  • Approximate statistics based on the first 1000 samples:
    • query_id: string (min: 10 characters, mean: 33.63 characters, max: 137 characters)
    • doc_ids: list (min: 3 elements, mean: 12.50 elements, max: 20 elements)
    • labels: list (min: 3 elements, mean: 12.50 elements, max: 20 elements)
  • Samples:
    query_id doc_ids labels
    can marijuana help dementia ["Cannabis 'could stop dementia in its tracks'. Cannabis may help keep Alzheimer's disease at bay. In experiments, a marijuana-based medicine triggered the formation of new brain cells and cut inflammation linked to dementia. The researchers say that using the information to create a pill suitable for people could help prevent or delay the onset of Alzheimer's.", 'Marijuana (cannabis): Marijuana in any form is not allowed on aircraft and is not allowed in the secure part of the airport (beyond the TSA screening areas). In addition it is illegal to import marijuana or marijuana-related items into the US.', 'Depakote and dementia - Can dementia be cured? Unfortunately, no. Dementia is a progressive disease. Even available treatments only slow progression or tame symptoms.', 'Marijuana Prices. The price of marijuana listed below is the typical price to buy marijuana on the black market in U.S. dollars. How much marijuana cost and the sale price of marijuana are based upon the United Natio... [1, 0, 0, 0, 0, ...]
    what are carcinogen ['Written By: Carcinogen, any of a number of agents that can cause cancer in humans. They can be divided into three major categories: chemical carcinogens (including those from biological sources), physical carcinogens, and oncogenic (cancer-causing) viruses. 1 Most carcinogens, singly or in combination, produce cancer by interacting with DNA in cells and thereby interfering with normal cellular function.', 'Tarragon (Artemisia dracunculus) is a species of perennial herb in the sunflower family. It is widespread in the wild across much of Eurasia and North America, and is cultivated for culinary and medicinal purposes in many lands.One sub-species, Artemisia dracunculus var. sativa, is cultivated for use of the leaves as an aromatic culinary herb.arragon has an aromatic property reminiscent of anise, due to the presence of estragole, a known carcinogen and teratogen in mice. The European Union investigation revealed that the danger of estragole is minimal even at 100–1,000 tim... [1, 0, 0, 0, 0, ...]
    who played ben geller in friends ["Noelle and Cali aren't the only twins to have played one child character in Friends. Double vision: Ross' cheeky son Ben (pictured), from his first marriage to Carol, was also played by twins, Dylan and Cole Sprouse, who are now 22.", 'Update 7/29/06: There are now three “Teaching Pastors” at Applegate Christian Fellowship, according to their web site. Jon Courson is now back at Applegate. The other two listed as Teaching Pastors are Jon’s two sons: Peter John and Ben Courson.on Courson has been appreciated over the years by many people who are my friends and whom I respect. I believe that he preaches the real Jesus and the true Gospel, for which I rejoice. I also believe that his ministry and church organization is a reasonable example with which to examine important issues together.', 'Ben 10 (Reboot) Ben 10: Omniverse is the fourth iteration of the Ben 10 franchise, and it is the sequel of Ben 10: Ultimate Alien. Ben was all set to be a solo hero with his n... [1, 0, 0, 0, 0, ...]
  • Loss: LambdaLoss with these parameters:
    {
        "weighting_scheme": "sentence_transformers.cross_encoder.losses.LambdaLoss.NDCGLoss2PPScheme",
        "k": null,
        "sigma": 1.0,
        "eps": 1e-10,
        "reduction_log": "binary",
        "activation_fct": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 8
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • load_best_model_at_end: True
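
As a rough sketch (not from the original card), these non-default values map onto the v4-style CrossEncoderTrainingArguments like so; the output_dir is a hypothetical placeholder:

from sentence_transformers.cross_encoder import CrossEncoderTrainingArguments

args = CrossEncoderTrainingArguments(
    output_dir="models/reranker-msmarco-ModernBERT-base-lambdaloss",  # hypothetical path
    eval_strategy="steps",
    num_train_epochs=1,
    warmup_ratio=0.1,
    seed=12,
    bf16=True,
    load_best_model_at_end=True,  # restores the checkpoint with the best validation metric
)
# These arguments are then passed to a CrossEncoderTrainer together with the model,
# the LambdaLoss instance, the train/eval datasets, and an evaluator.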

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss NanoMSMARCO_R100_ndcg@10 NanoNFCorpus_R100_ndcg@10 NanoNQ_R100_ndcg@10 NanoBEIR_R100_mean_ndcg@10
-1 -1 - - 0.0234 (-0.5170) 0.3412 (+0.0161) 0.0321 (-0.4686) 0.1322 (-0.3231)
0.0000 1 0.8349 - - - - -
0.0040 200 0.8417 - - - - -
0.0080 400 0.8371 - - - - -
0.0120 600 0.8288 - - - - -
0.0160 800 0.8076 - - - - -
0.0200 1000 0.7802 0.7316 0.2004 (-0.3400) 0.3110 (-0.0140) 0.2594 (-0.2413) 0.2569 (-0.1984)
0.0240 1200 0.6988 - - - - -
0.0280 1400 0.4688 - - - - -
0.0321 1600 0.3742 - - - - -
0.0361 1800 0.3441 - - - - -
0.0401 2000 0.3058 0.1975 0.6091 (+0.0687) 0.3978 (+0.0727) 0.6645 (+0.1639) 0.5571 (+0.1018)
0.0441 2200 0.2812 - - - - -
0.0481 2400 0.2748 - - - - -
0.0521 2600 0.2518 - - - - -
0.0561 2800 0.2591 - - - - -
0.0601 3000 0.2508 0.1673 0.7137 (+0.1733) 0.3980 (+0.0730) 0.7471 (+0.2464) 0.6196 (+0.1642)
0.0641 3200 0.2446 - - - - -
0.0681 3400 0.2385 - - - - -
0.0721 3600 0.2381 - - - - -
0.0761 3800 0.2204 - - - - -
0.0801 4000 0.221 0.1757 0.6321 (+0.0916) 0.3937 (+0.0687) 0.7029 (+0.2023) 0.5762 (+0.1209)
0.0841 4200 0.2131 - - - - -
0.0882 4400 0.2222 - - - - -
0.0922 4600 0.2307 - - - - -
0.0962 4800 0.2104 - - - - -
0.1002 5000 0.2151 0.1697 0.6388 (+0.0984) 0.3846 (+0.0595) 0.6659 (+0.1653) 0.5631 (+0.1077)
0.1042 5200 0.208 - - - - -
0.1082 5400 0.2147 - - - - -
0.1122 5600 0.2114 - - - - -
0.1162 5800 0.2224 - - - - -
0.1202 6000 0.2094 0.1583 0.6165 (+0.0761) 0.3969 (+0.0718) 0.6968 (+0.1961) 0.5700 (+0.1147)
0.1242 6200 0.2065 - - - - -
0.1282 6400 0.2191 - - - - -
0.1322 6600 0.2108 - - - - -
0.1362 6800 0.2067 - - - - -
0.1402 7000 0.2055 0.1554 0.6295 (+0.0891) 0.3968 (+0.0718) 0.6862 (+0.1855) 0.5708 (+0.1155)
0.1443 7200 0.1994 - - - - -
0.1483 7400 0.2067 - - - - -
0.1523 7600 0.1933 - - - - -
0.1563 7800 0.1903 - - - - -
0.1603 8000 0.1837 0.1569 0.6236 (+0.0831) 0.4196 (+0.0946) 0.6927 (+0.1920) 0.5786 (+0.1232)
0.1643 8200 0.1968 - - - - -
0.1683 8400 0.2037 - - - - -
0.1723 8600 0.2052 - - - - -
0.1763 8800 0.2007 - - - - -
0.1803 9000 0.1771 0.1642 0.6579 (+0.1175) 0.3949 (+0.0699) 0.6931 (+0.1924) 0.5820 (+0.1266)
0.1843 9200 0.1828 - - - - -
0.1883 9400 0.195 - - - - -
0.1923 9600 0.1992 - - - - -
0.1963 9800 0.1859 - - - - -
0.2004 10000 0.1934 0.1514 0.6756 (+0.1351) 0.4280 (+0.1029) 0.7235 (+0.2228) 0.6090 (+0.1536)
0.2044 10200 0.1828 - - - - -
0.2084 10400 0.1749 - - - - -
0.2124 10600 0.1908 - - - - -
0.2164 10800 0.1837 - - - - -
0.2204 11000 0.1726 0.1469 0.6427 (+0.1023) 0.4170 (+0.0920) 0.7408 (+0.2402) 0.6002 (+0.1448)
0.2244 11200 0.1922 - - - - -
0.2284 11400 0.1853 - - - - -
0.2324 11600 0.1856 - - - - -
0.2364 11800 0.1797 - - - - -
0.2404 12000 0.1631 0.1508 0.6758 (+0.1354) 0.4076 (+0.0825) 0.7316 (+0.2310) 0.6050 (+0.1496)
0.2444 12200 0.1778 - - - - -
0.2484 12400 0.174 - - - - -
0.2524 12600 0.159 - - - - -
0.2565 12800 0.1744 - - - - -
0.2605 13000 0.1828 0.1524 0.6696 (+0.1291) 0.4039 (+0.0788) 0.7001 (+0.1994) 0.5912 (+0.1358)
0.2645 13200 0.1726 - - - - -
0.2685 13400 0.1947 - - - - -
0.2725 13600 0.1697 - - - - -
0.2765 13800 0.1958 - - - - -
0.2805 14000 0.1917 0.1442 0.6612 (+0.1208) 0.4091 (+0.0841) 0.6987 (+0.1980) 0.5897 (+0.1343)
0.2845 14200 0.1863 - - - - -
0.2885 14400 0.1844 - - - - -
0.2925 14600 0.1764 - - - - -
0.2965 14800 0.1719 - - - - -
0.3005 15000 0.1844 0.1481 0.6572 (+0.1168) 0.3984 (+0.0733) 0.7382 (+0.2376) 0.5979 (+0.1426)
0.3045 15200 0.176 - - - - -
0.3085 15400 0.1724 - - - - -
0.3126 15600 0.1747 - - - - -
0.3166 15800 0.1649 - - - - -
0.3206 16000 0.1779 0.1450 0.6168 (+0.0763) 0.4096 (+0.0846) 0.7118 (+0.2112) 0.5794 (+0.1240)
0.3246 16200 0.1755 - - - - -
0.3286 16400 0.1567 - - - - -
0.3326 16600 0.1749 - - - - -
0.3366 16800 0.1827 - - - - -
0.3406 17000 0.1773 0.1394 0.6868 (+0.1464) 0.3943 (+0.0693) 0.7007 (+0.2001) 0.5940 (+0.1386)
0.3446 17200 0.1747 - - - - -
0.3486 17400 0.1805 - - - - -
0.3526 17600 0.1688 - - - - -
0.3566 17800 0.1649 - - - - -
0.3606 18000 0.1747 0.1405 0.6390 (+0.0986) 0.3952 (+0.0701) 0.7370 (+0.2364) 0.5904 (+0.1350)
0.3646 18200 0.1797 - - - - -
0.3687 18400 0.1557 - - - - -
0.3727 18600 0.1644 - - - - -
0.3767 18800 0.1701 - - - - -
0.3807 19000 0.1673 0.1433 0.6799 (+0.1395) 0.4012 (+0.0762) 0.7286 (+0.2279) 0.6032 (+0.1479)
0.3847 19200 0.1736 - - - - -
0.3887 19400 0.1767 - - - - -
0.3927 19600 0.1735 - - - - -
0.3967 19800 0.1758 - - - - -
0.4007 20000 0.1711 0.1380 0.6773 (+0.1369) 0.4149 (+0.0898) 0.7166 (+0.2159) 0.6029 (+0.1476)
0.4047 20200 0.1704 - - - - -
0.4087 20400 0.1637 - - - - -
0.4127 20600 0.1783 - - - - -
0.4167 20800 0.1585 - - - - -
0.4207 21000 0.1769 0.1399 0.6832 (+0.1428) 0.4254 (+0.1003) 0.6977 (+0.1970) 0.6021 (+0.1467)
0.4248 21200 0.1644 - - - - -
0.4288 21400 0.1693 - - - - -
0.4328 21600 0.1604 - - - - -
0.4368 21800 0.1714 - - - - -
0.4408 22000 0.1577 0.1392 0.6715 (+0.1311) 0.4199 (+0.0948) 0.7038 (+0.2032) 0.5984 (+0.1430)
0.4448 22200 0.1742 - - - - -
0.4488 22400 0.1744 - - - - -
0.4528 22600 0.1682 - - - - -
0.4568 22800 0.1597 - - - - -
0.4608 23000 0.1626 0.1364 0.6698 (+0.1294) 0.4191 (+0.0941) 0.7255 (+0.2249) 0.6048 (+0.1494)
0.4648 23200 0.1543 - - - - -
0.4688 23400 0.1571 - - - - -
0.4728 23600 0.1576 - - - - -
0.4768 23800 0.1644 - - - - -
0.4809 24000 0.1542 0.1444 0.6618 (+0.1213) 0.4095 (+0.0844) 0.7442 (+0.2436) 0.6052 (+0.1498)
0.4849 24200 0.1826 - - - - -
0.4889 24400 0.1649 - - - - -
0.4929 24600 0.154 - - - - -
0.4969 24800 0.1779 - - - - -
0.5009 25000 0.1615 0.1373 0.6506 (+0.1102) 0.3971 (+0.0721) 0.7165 (+0.2159) 0.5881 (+0.1327)
0.5049 25200 0.1558 - - - - -
0.5089 25400 0.1741 - - - - -
0.5129 25600 0.151 - - - - -
0.5169 25800 0.1654 - - - - -
0.5209 26000 0.1656 0.1368 0.6631 (+0.1226) 0.3888 (+0.0638) 0.7092 (+0.2085) 0.5870 (+0.1317)
0.5249 26200 0.1603 - - - - -
0.5289 26400 0.1547 - - - - -
0.5329 26600 0.1782 - - - - -
0.5370 26800 0.1571 - - - - -
0.5410 27000 0.1595 0.1376 0.6352 (+0.0948) 0.3960 (+0.0710) 0.7081 (+0.2074) 0.5798 (+0.1244)
0.5450 27200 0.1764 - - - - -
0.5490 27400 0.1672 - - - - -
0.5530 27600 0.1669 - - - - -
0.5570 27800 0.1719 - - - - -
0.5610 28000 0.1759 0.1355 0.6629 (+0.1225) 0.4013 (+0.0762) 0.7671 (+0.2665) 0.6104 (+0.1551)
0.5650 28200 0.1595 - - - - -
0.5690 28400 0.1558 - - - - -
0.5730 28600 0.1617 - - - - -
0.5770 28800 0.1669 - - - - -
0.5810 29000 0.1481 0.1363 0.6613 (+0.1208) 0.3961 (+0.0710) 0.7413 (+0.2406) 0.5995 (+0.1442)
0.5850 29200 0.1584 - - - - -
0.5890 29400 0.1654 - - - - -
0.5931 29600 0.1659 - - - - -
0.5971 29800 0.1653 - - - - -
0.6011 30000 0.1606 0.1368 0.6554 (+0.1150) 0.3927 (+0.0676) 0.7139 (+0.2132) 0.5873 (+0.1320)
0.6051 30200 0.1625 - - - - -
0.6091 30400 0.1581 - - - - -
0.6131 30600 0.145 - - - - -
0.6171 30800 0.1584 - - - - -
0.6211 31000 0.1566 0.1325 0.6680 (+0.1275) 0.3978 (+0.0728) 0.7372 (+0.2365) 0.6010 (+0.1456)
0.6251 31200 0.1611 - - - - -
0.6291 31400 0.1724 - - - - -
0.6331 31600 0.1609 - - - - -
0.6371 31800 0.1621 - - - - -
0.6411 32000 0.1537 0.1300 0.6615 (+0.1211) 0.4063 (+0.0813) 0.7697 (+0.2691) 0.6125 (+0.1571)
0.6451 32200 0.1641 - - - - -
0.6492 32400 0.1487 - - - - -
0.6532 32600 0.1456 - - - - -
0.6572 32800 0.1514 - - - - -
0.6612 33000 0.158 0.1309 0.6556 (+0.1152) 0.4125 (+0.0875) 0.7479 (+0.2473) 0.6053 (+0.1500)
0.6652 33200 0.1451 - - - - -
0.6692 33400 0.1495 - - - - -
0.6732 33600 0.1467 - - - - -
0.6772 33800 0.143 - - - - -
0.6812 34000 0.1639 0.1334 0.6769 (+0.1365) 0.4002 (+0.0752) 0.7420 (+0.2414) 0.6064 (+0.1510)
0.6852 34200 0.1542 - - - - -
0.6892 34400 0.1592 - - - - -
0.6932 34600 0.1452 - - - - -
0.6972 34800 0.1569 - - - - -
0.7012 35000 0.1502 0.1299 0.6648 (+0.1243) 0.3834 (+0.0583) 0.7684 (+0.2678) 0.6055 (+0.1501)
0.7053 35200 0.1564 - - - - -
0.7093 35400 0.1509 - - - - -
0.7133 35600 0.156 - - - - -
0.7173 35800 0.1547 - - - - -
0.7213 36000 0.1595 0.1297 0.6521 (+0.1117) 0.3916 (+0.0665) 0.7318 (+0.2311) 0.5918 (+0.1364)
0.7253 36200 0.1457 - - - - -
0.7293 36400 0.1615 - - - - -
0.7333 36600 0.1508 - - - - -
0.7373 36800 0.1478 - - - - -
0.7413 37000 0.1455 0.1322 0.6614 (+0.1210) 0.4132 (+0.0882) 0.7656 (+0.2650) 0.6134 (+0.1581)
0.7453 37200 0.1526 - - - - -
0.7493 37400 0.1571 - - - - -
0.7533 37600 0.141 - - - - -
0.7573 37800 0.1418 - - - - -
0.7614 38000 0.1597 0.1347 0.6707 (+0.1302) 0.4175 (+0.0925) 0.7568 (+0.2561) 0.6150 (+0.1596)
0.7654 38200 0.1512 - - - - -
0.7694 38400 0.1424 - - - - -
0.7734 38600 0.1601 - - - - -
0.7774 38800 0.13 - - - - -
0.7814 39000 0.1508 0.1322 0.6960 (+0.1556) 0.4032 (+0.0781) 0.7585 (+0.2579) 0.6192 (+0.1639)
0.7854 39200 0.1456 - - - - -
0.7894 39400 0.1502 - - - - -
0.7934 39600 0.1507 - - - - -
0.7974 39800 0.1696 - - - - -
0.8014 40000 0.1381 0.1289 0.7251 (+0.1847) 0.4143 (+0.0892) 0.7594 (+0.2587) 0.6329 (+0.1776)
0.8054 40200 0.1544 - - - - -
0.8094 40400 0.1541 - - - - -
0.8134 40600 0.1458 - - - - -
0.8175 40800 0.1411 - - - - -
0.8215 41000 0.1495 0.1280 0.7051 (+0.1646) 0.4102 (+0.0851) 0.7520 (+0.2514) 0.6224 (+0.1670)
0.8255 41200 0.1465 - - - - -
0.8295 41400 0.1577 - - - - -
0.8335 41600 0.1489 - - - - -
0.8375 41800 0.1481 - - - - -
0.8415 42000 0.148 0.1304 0.6944 (+0.1539) 0.4023 (+0.0772) 0.7440 (+0.2433) 0.6135 (+0.1582)
0.8455 42200 0.1529 - - - - -
0.8495 42400 0.1522 - - - - -
0.8535 42600 0.1455 - - - - -
0.8575 42800 0.1567 - - - - -
0.8615 43000 0.1435 0.1304 0.6710 (+0.1306) 0.4130 (+0.0880) 0.7493 (+0.2486) 0.6111 (+0.1557)
0.8655 43200 0.1426 - - - - -
0.8695 43400 0.1527 - - - - -
0.8736 43600 0.1431 - - - - -
0.8776 43800 0.1382 - - - - -
0.8816 44000 0.1554 0.1288 0.6842 (+0.1437) 0.3996 (+0.0746) 0.7535 (+0.2529) 0.6124 (+0.1571)
0.8856 44200 0.1491 - - - - -
0.8896 44400 0.1626 - - - - -
0.8936 44600 0.1471 - - - - -
0.8976 44800 0.1459 - - - - -
0.9016 45000 0.1501 0.1284 0.6995 (+0.1590) 0.4051 (+0.0801) 0.7608 (+0.2602) 0.6218 (+0.1664)
0.9056 45200 0.1513 - - - - -
0.9096 45400 0.1521 - - - - -
0.9136 45600 0.1417 - - - - -
0.9176 45800 0.1452 - - - - -
0.9216 46000 0.1591 0.1254 0.7086 (+0.1682) 0.3940 (+0.0690) 0.7567 (+0.2561) 0.6198 (+0.1644)
0.9256 46200 0.1473 - - - - -
0.9297 46400 0.1329 - - - - -
0.9337 46600 0.1523 - - - - -
0.9377 46800 0.1385 - - - - -
0.9417 47000 0.1393 0.1267 0.7161 (+0.1756) 0.3941 (+0.0690) 0.7662 (+0.2656) 0.6255 (+0.1701)
0.9457 47200 0.1421 - - - - -
0.9497 47400 0.1509 - - - - -
0.9537 47600 0.1587 - - - - -
0.9577 47800 0.1402 - - - - -
0.9617 48000 0.1355 0.1278 0.6976 (+0.1571) 0.3958 (+0.0708) 0.7538 (+0.2531) 0.6157 (+0.1603)
0.9657 48200 0.1518 - - - - -
0.9697 48400 0.1369 - - - - -
0.9737 48600 0.1475 - - - - -
0.9777 48800 0.1495 - - - - -
0.9817 49000 0.1402 0.1275 0.6973 (+0.1568) 0.3990 (+0.0740) 0.7534 (+0.2528) 0.6166 (+0.1612)
0.9858 49200 0.1527 - - - - -
0.9898 49400 0.143 - - - - -
0.9938 49600 0.1619 - - - - -
0.9978 49800 0.1422 - - - - -
-1 -1 - - 0.7251 (+0.1847) 0.4143 (+0.0892) 0.7594 (+0.2587) 0.6329 (+0.1776)
  • The saved checkpoint is the row at epoch 0.8014 / step 40000 (shown in bold in the original card), which achieves the best NanoBEIR_R100_mean_ndcg@10 of 0.6329.

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.5.0.dev0
  • Transformers: 4.49.0
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.2.0
  • Datasets: 2.21.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

LambdaLoss

@inproceedings{wang2018lambdaloss,
  title={The LambdaLoss Framework for Ranking Metric Optimization},
  author={Wang, Xuanhui and Li, Cheng and Golbandi, Nadav and Bendersky, Michael and Najork, Marc},
  booktitle={Proceedings of the 27th ACM International Conference on Information and Knowledge Management},
  pages={1313--1322},
  year={2018}
}