diff --git "a/checkpoint-527/README.md" "b/checkpoint-527/README.md"
new file mode 100644
--- /dev/null
+++ "b/checkpoint-527/README.md"
@@ -0,0 +1,1221 @@
+---
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:102000
+- loss:MatryoshkaLoss
+- loss:CachedGISTEmbedLoss
+base_model: BAAI/bge-m3
+widget:
+- source_sentence: More than 710 people have died of coronavirus in countries other
+ than Mainland China in 2020 .
+ sentences:
+ - 'More than 3,800 people have died : more than 3,100 in mainland China and around
+ 725 in all other countries .'
+ - Competition law Substance and practice of competition law varies from jurisdiction
+ to jurisdiction. Protecting the interests of consumers (consumer welfare) and
+ ensuring that entrepreneurs have an opportunity to compete in the market economy
+ are often treated as important objectives. Competition law is closely connected
+ with law on deregulation of access to markets, state aids and subsidies, the privatization
+ of state owned assets and the establishment of independent sector regulators,
+ among other market-oriented supply-side policies. In recent decades, competition
+ law has been viewed as a way to provide better public services.[10] Robert Bork
+ argued that competition laws can produce adverse effects when they reduce competition
+ by protecting inefficient competitors and when costs of legal intervention are
+ greater than benefits for the consumers.[11]
+ - Pretty Woman Casting of the film was a rather lengthy process. Marshall had initially
+ considered Christopher Reeve, Daniel Day-Lewis, and Denzel Washington for the
+ role of Edward, and Al Pacino and Burt Reynolds turned it down.[8] Pacino went
+ as far as doing a casting reading with Roberts before rejecting the part.[9] Gere
+ initially refused but when he met with Roberts, she persuaded him and he eventually
+ agreed to play Lewis.[10] He reportedly started off much more active in his role;
+ but Garry Marshall took him aside and said "No, no, no, Richard. In this movie,
+ one of you moves and one of you does not. Guess which one you are?"[11] Julia
+ Roberts was not the first choice for the role of Vivian, and was not wanted by
+ Disney. Many other actresses were considered. Marshall originally envisioned Karen
+ Allen for the role; when she declined, auditions went to many better-known actresses
+ of the time including Molly Ringwald,[12] who turned it down because she felt
+ uncomfortable playing a sex worker.[citation needed] Winona Ryder auditioned,
+ but was turned down because Marshall felt she was "too young". Jennifer Connelly
+ was also dismissed for the same reason.[4]
+- source_sentence: More than 531,500 cases of COVID-19 have been reported in over
+ 190 countries and territories after March 26 .
+ sentences:
+ - 'Adam was a regular in his first three seasons with Stoke, but has only played
+ 34 games since the start of 2015-16.
+
+ The Scot, 30, has started for Stoke in their past two games but he will not feature
+ in the World Cup qualifier against England on 12 November.
+
+ "He''s in good physical condition at the moment. That''s really encouraging for
+ him and for us," said Hughes.
+
+ "In the past, Charlie will admit himself that, in games, to keep the intensity
+ he plays at and the energy that he shows, he has found it difficult to complete
+ 90 minutes on occasions.
+
+ "It''s difficult when you''re out of the side. You have to keep your focus and
+ Charlie has been able to do that."
+
+ Adam, who has 26 caps, revealed last week that Scotland manager Gordon Strachan
+ has not spoken to him in more than a year.
+
+ After been overlooked, Adam says he is now looking no further than Stoke''s next
+ Premier League game at home to Bournemouth on 19 November.
+
+ "Football can change from one week to another," he told BBC Radio Stoke. "I''m
+ just enjoying the way I''m playing and hoping that''s good enough to keep me in
+ the team.
+
+ "If the team keep performing well, then you''ve got a good chance of staying in."
+
+ After their poor start to the season, Stoke have now climbed to 12th in the league.
+
+ Now into his fifth season at the Britannia Stadium, former Rangers, Blackpool
+ and Liverpool playmaker Adam has made a total of 130 appearances for the club,
+ but only 71 Premier League starts.
+
+ He signed a three-year contract extension with Stoke in July 2015.'
+ - Adam Surat ( `` The Inner Strength '' ) is a 1989 Bangladeshi documentary film
+ about the Bangladeshi painter Tareque Masud , directed by Sheikh Mohammed Sultan
+ .
+ - As of 27 March , more than 532,000 cases of COVID-19 have been reported in over
+ 190 countries and territories , resulting in approximately 24,000 deaths and more
+ than 123,000 recoveries .
+- source_sentence: More than 285,000 COVID-19 cases came to light from over 180 countries
+ and territories by March 21 , 2020 .
+ sentences:
+ - As of 21 March , more than 286,000 cases of COVID-19 have been reported in over
+ 180 countries and territories , resulting in more than 11,800 deaths and 93,000
+ recoveries .
+ - Dunedin and Suburbs North was a parliamentary electorate in the city of Dunedin
+ in Otago , New Zealand from 1863 to 1866 . It was a multi-member electorate .
+ - The fucoxanthin dinophyte lineages (including Karlodinium and Karenia) lost their
+ original red algal derived chloroplast, and replaced it with a new chloroplast
+ derived from a haptophyte endosymbiont.
+- source_sentence: Geoff Johns and Ben Affleck co-writing of Batman film was announced
+ in July 2015 .
+ sentences:
+ - A dual monitored computer atop a corner-fitting glass tabletop.
+ - 'The Voice (U.S. TV series) The Voice is an American reality television singing
+ competition broadcast on NBC. Based on the original The Voice of Holland, the
+ concept of the series is to find currently unsigned singing talent (solo or duets,
+ professional and amateur) contested by aspiring singers, age 15 or over (reduced
+ to 13 since season 12),[2] drawn from public auditions. The winner is determined
+ by television viewers voting by telephone, Internet, SMS text, and iTunes Store
+ purchases of the audio-recorded artists'' vocal performances. They receive US$100,000
+ and a record deal with Universal Music Group for winning the competition. The
+ winners of the twelve seasons have been: Javier Colon, Jermaine Paul, Cassadee
+ Pope, Danielle Bradbery, Tessanne Chin, Josh Kaufman, Craig Wayne Boyd, Sawyer
+ Fredericks, Jordan Smith, Alisan Porter, Sundance Head, and Chris Blue.'
+ - In July 2015 , it was announced that Johns and Ben Affleck will co-write the screenplay
+ for a standalone Batman film , starring Affleck , set in the DC Comics shared
+ film universe.
+- source_sentence: This genus is presently in the family of lizards known as Iguanidae
+ , subfamily Polychrotinae , and is no longer classified in the now invalid family
+ , Polychrotidae .
+ sentences:
+ - This genus is currently classified in the family of lizards , known as Iguanidae
+ , subfamily Polychrotinae , and is no longer classified in the now invalid family
+ , polychrotidae .
+ - 'Garden Within A Garden, by Pakistani artist Imran Qureshi, has been painted in
+ Lister Park and City Park, Bradford.
+
+ More than one million Muslim, Sikh and Hindu soldiers fought for Britain between
+ 1914 and 1918.
+
+ The work, inspired by Pakistani and Indian miniature painting, uses acrylics straight
+ on the city''s paving.
+
+ Qureshi lives in Lahore, in modern-day Pakistan, but that area as part of British
+ India enlisted a large number of soldiers to fight for Britain.
+
+ "War is already about horrific images of blood and lost lives, so I came up with
+ the idea of a garden in a garden", he said.
+
+ Councillor Sarah Ferriby said: " It is an opportune moment for us to reflect on
+ the experiences of people from across the world who fought on the side of the
+ Allies in the First World War and the invaluable contribution they made to the
+ war efforts."
+
+ India gained independence from the British in August 1947 and split into two separate
+ countries - the secular state of India dominated by Hindus and Muslim Pakistan.
+
+ Garden Within A Garden will be on display in Bradford until September.
+
+ It has been co-commissioned by 14-18 NOW, the UK''s arts programme for the World
+ War One centenary, Bradford Council and Yorkshire Festival 2016.'
+ - 'The Ulster Orchestra teamed up with Radio Ulster at the event, which was broadcast
+ live from 20:00 GMT.
+
+ Special guests included actor Simon Callow, writer Anita Robinson and singer Peter
+ Corry.
+
+ Presented by Wendy Austin and John Toal, the gala event featured performances
+ by musicians, comedians, artists and Radio Ulster presenters.
+
+ The acts included Dana Masters, Best Boy Grip and the Sands Family.
+
+ In pictures: 40 years of BBC Radio Ulster
+
+ The Hole In the Wall Gang comedy group, brought to prominence Radio Ulster''s
+ Talkback programme, also performed at the concert.
+
+ On television, the documentary, Radio Days, was broadcast from 22:35 GMT on BBC
+ One NI.
+
+ Narrated by Stephen Nolan, the programme heard from the station''s presenters
+ and listeners about the station''s legacy.
+
+ It followed loyal listeners and features rare behind the scenes archive footage.
+
+ Presenters Walter Love, Wendy Austin, Hugo Duncan and Stephen Nolan talked about
+ their time at the station.
+
+ Fergus Keeling, Head of Radio, BBC Northern Ireland, said he hoped Monday''s events
+ would be the station''s way of "giving our listeners something special back".
+
+ "They''ve joined in our birthday broadcasts, they have helped make this year special
+ and they are the reason we do what we do."
+
+ He thanked presenters and guests "for taking the time to help us celebrate in
+ this way".
+
+ "Most of all though, I''d like to thank our listeners old and new. This night
+ is for them."
+
+ BBC Director General Tony Hall said: "Congratulations to everyone who''s contributed
+ to BBC Radio Ulster over these last 40 years - whether in news, arts and drama,
+ music or sports.
+
+ "But, above all, I''d like to thank our listeners for their loyalty, their stories
+ and their support."
+
+ Broadcasting legends John Bennett and Walter Love joined the Stephen Nolan Show
+ to talk about what has changed at Radio Ulster.
+
+ On technology
+
+ On practical jokes'
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+metrics:
+- pearson_cosine
+- spearman_cosine
+model-index:
+- name: SentenceTransformer based on BAAI/bge-m3
+ results:
+ - task:
+ type: semantic-similarity
+ name: Semantic Similarity
+ dataset:
+ name: sts test
+ type: sts-test
+ metrics:
+ - type: pearson_cosine
+ value: 0.9074876367668167
+ name: Pearson Cosine
+ - type: spearman_cosine
+ value: 0.9181634647690607
+ name: Spearman Cosine
+ - task:
+ type: semantic-similarity
+ name: Semantic Similarity
+ dataset:
+ name: sts test 1280
+ type: sts-test-1280
+ metrics:
+ - type: pearson_cosine
+ value: 0.9074909272280929
+ name: Pearson Cosine
+ - type: spearman_cosine
+ value: 0.9182151077247066
+ name: Spearman Cosine
+ - task:
+ type: semantic-similarity
+ name: Semantic Similarity
+ dataset:
+ name: sts test 1024
+ type: sts-test-1024
+ metrics:
+ - type: pearson_cosine
+ value: 0.9074934984822008
+ name: Pearson Cosine
+ - type: spearman_cosine
+ value: 0.9182130491666827
+ name: Spearman Cosine
+ - task:
+ type: semantic-similarity
+ name: Semantic Similarity
+ dataset:
+ name: sts test 760
+ type: sts-test-760
+ metrics:
+ - type: pearson_cosine
+ value: 0.9057267653570724
+ name: Pearson Cosine
+ - type: spearman_cosine
+ value: 0.9172522395846092
+ name: Spearman Cosine
+ - task:
+ type: semantic-similarity
+ name: Semantic Similarity
+ dataset:
+ name: sts test 512
+ type: sts-test-512
+ metrics:
+ - type: pearson_cosine
+ value: 0.9062832297154773
+ name: Pearson Cosine
+ - type: spearman_cosine
+ value: 0.9180905649642541
+ name: Spearman Cosine
+---
+
+# SentenceTransformer based on BAAI/bge-m3
+
+This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) on the `global_dataset` dataset. It maps sentences and paragraphs to a 1536-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+## Model Details
+
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
+- **Maximum Sequence Length:** 8192 tokens
+- **Output Dimensionality:** 1536 dimensions
+- **Similarity Function:** Cosine Similarity
+- **Training Dataset:**
+ - global_dataset
+
+
+
+### Model Sources
+
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+### Full Model Architecture
+
+```
+SentenceTransformer(
+ (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
+ (1): AdvancedWeightedPooling(
+ (mha): MultiheadAttention(
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=1024, out_features=1024, bias=True)
+ )
+ (MLP): Sequential(
+ (0): SwiGLUBlock(
+ (in_proj_w): Linear(in_features=1024, out_features=2048, bias=True)
+ (in_proj_v): Linear(in_features=1024, out_features=2048, bias=True)
+ (dropout): Dropout(p=0.05, inplace=False)
+ )
+ (1): SwiGLUBlock(
+ (in_proj_w): Linear(in_features=2048, out_features=512, bias=True)
+ (in_proj_v): Linear(in_features=2048, out_features=512, bias=True)
+ (dropout): Dropout(p=0.025, inplace=False)
+ )
+ (2): Linear(in_features=512, out_features=512, bias=True)
+ )
+ (layernorm): LayerNorm((1536,), eps=1e-05, elementwise_affine=True)
+ )
+)
+```
+
+## Usage
+
+### Direct Usage (Sentence Transformers)
+
+First install the Sentence Transformers library:
+
+```bash
+pip install -U sentence-transformers
+```
+
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+
+# Download from the 🤗 Hub
+model = SentenceTransformer("bobox/XLMRoBERTaM3-CustomPoolin-v1.04-1024conc512-MLP-s1-checkpoints-tmp")
+# Run inference
+sentences = [
+ 'This genus is presently in the family of lizards known as Iguanidae , subfamily Polychrotinae , and is no longer classified in the now invalid family , Polychrotidae .',
+ 'This genus is currently classified in the family of lizards , known as Iguanidae , subfamily Polychrotinae , and is no longer classified in the now invalid family , polychrotidae .',
+ 'The Ulster Orchestra teamed up with Radio Ulster at the event, which was broadcast live from 20:00 GMT.\nSpecial guests included actor Simon Callow, writer Anita Robinson and singer Peter Corry.\nPresented by Wendy Austin and John Toal, the gala event featured performances by musicians, comedians, artists and Radio Ulster presenters.\nThe acts included Dana Masters, Best Boy Grip and the Sands Family.\nIn pictures: 40 years of BBC Radio Ulster\nThe Hole In the Wall Gang comedy group, brought to prominence Radio Ulster\'s Talkback programme, also performed at the concert.\nOn television, the documentary, Radio Days, was broadcast from 22:35 GMT on BBC One NI.\nNarrated by Stephen Nolan, the programme heard from the station\'s presenters and listeners about the station\'s legacy.\nIt followed loyal listeners and features rare behind the scenes archive footage.\nPresenters Walter Love, Wendy Austin, Hugo Duncan and Stephen Nolan talked about their time at the station.\nFergus Keeling, Head of Radio, BBC Northern Ireland, said he hoped Monday\'s events would be the station\'s way of "giving our listeners something special back".\n"They\'ve joined in our birthday broadcasts, they have helped make this year special and they are the reason we do what we do."\nHe thanked presenters and guests "for taking the time to help us celebrate in this way".\n"Most of all though, I\'d like to thank our listeners old and new. This night is for them."\nBBC Director General Tony Hall said: "Congratulations to everyone who\'s contributed to BBC Radio Ulster over these last 40 years - whether in news, arts and drama, music or sports.\n"But, above all, I\'d like to thank our listeners for their loyalty, their stories and their support."\nBroadcasting legends John Bennett and Walter Love joined the Stephen Nolan Show to talk about what has changed at Radio Ulster. ‬\nOn technology\nOn practical jokes',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# [3, 1536]
+
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# [3, 3]
+```
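
Because the model was trained with `MatryoshkaLoss`, shorter prefixes of the 1536-dimensional embedding (1280, 1024, 768 or 512 dimensions) are intended to remain usable. The following is a minimal, library-free sketch of the usual recipe: keep the first `dim` components, then re-normalize before computing cosine similarity. The helper names and the toy 6-dimensional vectors are illustrative assumptions, not part of this repository. Recent versions of Sentence Transformers also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be more convenient in practice.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 6-dimensional vectors standing in for real 1536-dimensional embeddings.
emb_a = [0.9, 0.1, 0.3, -0.2, 0.05, 0.4]
emb_b = [0.8, 0.2, 0.25, 0.3, -0.1, 0.1]

# Truncate to the first 3 dimensions (hypothetical stand-in for e.g. 512).
short_a = truncate_and_normalize(emb_a, 3)
short_b = truncate_and_normalize(emb_b, 3)

print(len(short_a), cosine(short_a, short_b))
```

Whether a truncated size is accurate enough for a given task is best judged from the per-dimension scores reported in the Evaluation section.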
+
+## Evaluation
+
+### Metrics
+
+#### Semantic Similarity
+
+* Datasets: `sts-test`, `sts-test-1280`, `sts-test-1024`, `sts-test-760` and `sts-test-512`
+* Evaluated with [`EmbeddingSimilarityEvaluator`](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+| Metric | sts-test | sts-test-1280 | sts-test-1024 | sts-test-760 | sts-test-512 |
+|:--------------------|:-----------|:--------------|:--------------|:-------------|:-------------|
+| pearson_cosine | 0.9075 | 0.9075 | 0.9075 | 0.9057 | 0.9063 |
+| **spearman_cosine** | **0.9182** | **0.9182** | **0.9182** | **0.9173** | **0.9181** |
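
As a reminder of what the two metrics measure: `pearson_cosine` is the linear correlation between predicted cosine similarities and gold similarity labels, while `spearman_cosine` is the same correlation computed on ranks. The sketch below is a plain-Python illustration with made-up scores. It is not the evaluator's own implementation, and the example numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical predicted cosine similarities and gold labels for four pairs.
predicted = [0.91, 0.35, 0.62, 0.10]
gold = [4.8, 2.0, 3.9, 0.5]

print(pearson(predicted, gold), spearman(predicted, gold))
```

Spearman only depends on the ordering of the pairs, which is why it is commonly treated as the headline metric for STS-style benchmarks.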
+
+## Training Details
+
+### Training Dataset
+
+#### global_dataset
+
+* Dataset: global_dataset
+* Size: 102,000 training samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | min: 5 tokens, mean: 22.66 tokens, max: 76 tokens | min: 5 tokens, mean: 100.77 tokens, max: 554 tokens |
+* Samples:
+ | sentence1 | sentence2 |
+ |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | MIDDLE schools in Beccles, Bungay and Halesworth are still doomed despite news that the new two-tier system may not be implemented across the whole of Suffolk. | Middle schools still doomed |
+ | what event began the second front in world war ii | Western Front (World War II) The Western Front was a military theatre of World War II encompassing Denmark, Norway, Luxembourg, Belgium, the Netherlands, the United Kingdom, France, Italy, and Germany.[33] World War II military engagements in Southern Europe and elsewhere are generally considered under separate headings. The Western Front was marked by two phases of large-scale combat operations. The first phase saw the capitulation of the Netherlands, Belgium, and France during May and June 1940 after their defeat in the Low Countries and the northern half of France, and continued into an air war between Germany and Britain that climaxed with the Battle of Britain. The second phase consisted of large-scale ground combat (supported by a massive air war considered to be an additional front), which began in June 1944 with the Allied landings in Normandy and continued until the defeat of Germany in May 1945. |
+ | FirstEnergy Stadium naming rights belong to FirstEnergy Corporation until 2029 . | Though naming rights belong to FirstEnergy Corporation through 2029 , the stadium itself is actually serviced by Cleveland Public Power. |
+* Loss: [`MatryoshkaLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+ ```json
+ {
+ "loss": "CachedGISTEmbedLoss",
+ "matryoshka_dims": [
+ 1536,
+ 1280,
+ 1024,
+ 768,
+ 512
+ ],
+ "matryoshka_weights": [
+ 1,
+ 0.2,
+ 0.33,
+ 0.1,
+ 0.05
+ ],
+ "n_dims_per_step": -1
+ }
+ ```
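
The parameters above can be read as a weighted sum: the base loss (`CachedGISTEmbedLoss` here) is evaluated on embeddings truncated to each entry of `matryoshka_dims`, and each result is scaled by the matching entry of `matryoshka_weights`. The sketch below only illustrates that final combination step. The per-dimension loss values are hypothetical placeholders, and the helper name is an assumption rather than a library function.

```python
matryoshka_dims = [1536, 1280, 1024, 768, 512]
matryoshka_weights = [1, 0.2, 0.33, 0.1, 0.05]


def combine_matryoshka_losses(per_dim_losses, weights):
    """Weighted sum of base-loss values, one value per truncation size."""
    if len(per_dim_losses) != len(weights):
        raise ValueError("need exactly one loss value per weight")
    return sum(w * loss for w, loss in zip(weights, per_dim_losses))


# Hypothetical base-loss values at each truncation size (not measured).
example_losses = [0.40, 0.41, 0.42, 0.45, 0.50]

total = combine_matryoshka_losses(example_losses, matryoshka_weights)

# Relative influence of each dimension on the total objective.
weight_sum = sum(matryoshka_weights)
for dim, weight in zip(matryoshka_dims, matryoshka_weights):
    print(dim, round(weight / weight_sum, 3))
print("total:", total)
```

With these weights the full 1536-dimensional embedding dominates the objective, and the 512-dimensional prefix contributes only a small share.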
+
+### Evaluation Dataset
+
+#### global_dataset
+
+* Dataset: global_dataset
+* Size: 972 evaluation samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 972 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | min: 6 tokens, mean: 22.52 tokens, max: 56 tokens | min: 7 tokens, mean: 97.69 tokens, max: 638 tokens |
+* Samples:
+ | sentence1 | sentence2 |
+ |:-----------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | What was the name of the imperialistic policy in China? | The Age of Imperialism, a time period beginning around 1700, saw (generally European) industrializing nations engaging in the process of colonizing, influencing, and annexing other parts of the world in order to gain political power.[citation needed] |
+ | who sang the song new york new york first | Theme from New York, New York "Theme from New York, New York" (or "New York, New York") is the theme song from the Martin Scorsese film New York, New York (1977), composed by John Kander, with lyrics by Fred Ebb. It was written for and performed in the film by Liza Minnelli. It remains one of the best-known songs about New York City. In 2004 it finished #31 on AFI's 100 Years...100 Songs survey of top tunes in American cinema. |
+ | In 2012 , Ned Evett released `` Treehouse '' , his sixth solo record , produced in Nashville Tennessee by musician Adrian Belew . | In 2012 , Ned Evett `` Treehouse '' released his sixth solo record , produced in Nashville Tennessee by musician Adrian Belew . |
+* Loss: [`MatryoshkaLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+ ```json
+ {
+ "loss": "CachedGISTEmbedLoss",
+ "matryoshka_dims": [
+ 1536,
+ 1280,
+ 1024,
+ 768,
+ 512
+ ],
+ "matryoshka_weights": [
+ 1,
+ 0.2,
+ 0.33,
+ 0.1,
+ 0.05
+ ],
+ "n_dims_per_step": -1
+ }
+ ```
+
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 192
+- `per_device_eval_batch_size`: 256
+- `learning_rate`: 0.0001
+- `weight_decay`: 0.001
+- `lr_scheduler_type`: cosine_with_min_lr
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 2.5e-05}
+- `warmup_ratio`: 0.25
+- `save_safetensors`: False
+- `fp16`: True
+- `remove_unused_columns`: False
+- `push_to_hub`: True
+- `hub_model_id`: bobox/XLMRoBERTaM3-CustomPoolin-v1.04-1024conc512-MLP-s1-checkpoints-tmp
+- `hub_strategy`: all_checkpoints
+- `hub_private_repo`: False
+- `batch_sampler`: no_duplicates
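
To make the scheduler settings above concrete, here is a rough sketch of a learning-rate curve with linear warmup followed by cosine decay to a floor, using `learning_rate`, `warmup_ratio`, `num_cycles` and `min_lr` from the list. It assumes the `cosine_with_min_lr` scheduler behaves this way; the function name and the total step count of 1600 are illustrative assumptions, not values taken from this run.

```python
import math


def lr_at(step, total_steps, base_lr=1e-4, min_lr=2.5e-5,
          warmup_ratio=0.25, num_cycles=0.5):
    """Learning rate at `step`: linear warmup, then cosine decay down to `min_lr`."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    factor = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    min_rate = min_lr / base_lr
    # Rescale the cosine factor so that it bottoms out at min_lr instead of 0.
    return base_lr * (factor * (1.0 - min_rate) + min_rate)


TOTAL_STEPS = 1600  # hypothetical; roughly three epochs at ~532 steps each

for step in (0, 200, 400, 1000, 1600):
    print(step, lr_at(step, TOTAL_STEPS))
```

Under these assumptions the rate peaks at `1e-4` once warmup ends (a quarter of the way through training) and settles at `2.5e-05` on the final step.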
+
+#### All Hyperparameters
+
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 192
+- `per_device_eval_batch_size`: 256
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 0.0001
+- `weight_decay`: 0.001
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 3
+- `max_steps`: -1
+- `lr_scheduler_type`: cosine_with_min_lr
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 2.5e-05}
+- `warmup_ratio`: 0.25
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: False
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: True
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: False
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `tp_size`: 0
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: True
+- `resume_from_checkpoint`: None
+- `hub_model_id`: bobox/XLMRoBERTaM3-CustomPoolin-v1.04-1024conc512-MLP-s1-checkpoints-tmp
+- `hub_strategy`: all_checkpoints
+- `hub_private_repo`: False
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+
+
+
+### Training Logs
+
+| Epoch | Step | Training Loss | global dataset loss | sts-test_spearman_cosine | sts-test-1280_spearman_cosine | sts-test-1024_spearman_cosine | sts-test-760_spearman_cosine | sts-test-512_spearman_cosine |
+|:------:|:----:|:-------------:|:-------------------:|:------------------------:|:-----------------------------:|:-----------------------------:|:----------------------------:|:----------------------------:|
+| -1 | -1 | - | - | 0.9136 | 0.9136 | 0.9136 | 0.9131 | 0.9124 |
+| 0.0019 | 1 | 1.7608 | - | - | - | - | - | - |
+| 0.0038 | 2 | 1.7877 | - | - | - | - | - | - |
+| 0.0056 | 3 | 2.2507 | - | - | - | - | - | - |
+| 0.0075 | 4 | 1.5007 | - | - | - | - | - | - |
+| 0.0094 | 5 | 1.9385 | - | - | - | - | - | - |
+| 0.0113 | 6 | 2.2608 | - | - | - | - | - | - |
+| 0.0132 | 7 | 1.792 | - | - | - | - | - | - |
+| 0.0150 | 8 | 1.9776 | - | - | - | - | - | - |
+| 0.0169 | 9 | 1.139 | - | - | - | - | - | - |
+| 0.0188 | 10 | 1.5296 | - | - | - | - | - | - |
+| 0.0207 | 11 | 1.1633 | - | - | - | - | - | - |
+| 0.0226 | 12 | 2.0384 | - | - | - | - | - | - |
+| 0.0244 | 13 | 1.395 | - | - | - | - | - | - |
+| 0.0263 | 14 | 1.7397 | - | - | - | - | - | - |
+| 0.0282 | 15 | 1.4049 | - | - | - | - | - | - |
+| 0.0301 | 16 | 1.2005 | - | - | - | - | - | - |
+| 0.0320 | 17 | 1.441 | - | - | - | - | - | - |
+| 0.0338 | 18 | 1.3119 | - | - | - | - | - | - |
+| 0.0357 | 19 | 0.8352 | - | - | - | - | - | - |
+| 0.0376 | 20 | 1.3154 | - | - | - | - | - | - |
+| 0.0395 | 21 | 1.0206 | - | - | - | - | - | - |
+| 0.0414 | 22 | 0.9626 | - | - | - | - | - | - |
+| 0.0432 | 23 | 1.3082 | - | - | - | - | - | - |
+| 0.0451 | 24 | 1.0918 | - | - | - | - | - | - |
+| 0.0470 | 25 | 1.3777 | - | - | - | - | - | - |
+| 0.0489 | 26 | 0.907 | - | - | - | - | - | - |
+| 0.0508 | 27 | 0.9302 | - | - | - | - | - | - |
+| 0.0526 | 28 | 1.0028 | - | - | - | - | - | - |
+| 0.0545 | 29 | 0.9131 | - | - | - | - | - | - |
+| 0.0564 | 30 | 1.3304 | - | - | - | - | - | - |
+| 0.0583 | 31 | 1.0405 | - | - | - | - | - | - |
+| 0.0602 | 32 | 0.6233 | - | - | - | - | - | - |
+| 0.0620 | 33 | 1.4009 | - | - | - | - | - | - |
+| 0.0639 | 34 | 0.7543 | - | - | - | - | - | - |
+| 0.0658 | 35 | 0.5975 | - | - | - | - | - | - |
+| 0.0677 | 36 | 0.803 | - | - | - | - | - | - |
+| 0.0695 | 37 | 0.7285 | - | - | - | - | - | - |
+| 0.0714 | 38 | 0.759 | - | - | - | - | - | - |
+| 0.0733 | 39 | 1.0653 | - | - | - | - | - | - |
+| 0.0752 | 40 | 0.8235 | - | - | - | - | - | - |
+| 0.0771 | 41 | 0.7822 | - | - | - | - | - | - |
+| 0.0789 | 42 | 0.539 | - | - | - | - | - | - |
+| 0.0808 | 43 | 0.9211 | - | - | - | - | - | - |
+| 0.0827 | 44 | 0.6063 | - | - | - | - | - | - |
+| 0.0846 | 45 | 0.8769 | - | - | - | - | - | - |
+| 0.0865 | 46 | 0.8044 | - | - | - | - | - | - |
+| 0.0883 | 47 | 1.0656 | - | - | - | - | - | - |
+| 0.0902 | 48 | 0.6475 | - | - | - | - | - | - |
+| 0.0921 | 49 | 0.7331 | - | - | - | - | - | - |
+| 0.0940 | 50 | 0.4248 | 1.4762 | 0.9133 | 0.9133 | 0.9134 | 0.9130 | 0.9118 |
+| 0.0959 | 51 | 0.3661 | - | - | - | - | - | - |
+| 0.0977 | 52 | 0.645 | - | - | - | - | - | - |
+| 0.0996 | 53 | 0.3991 | - | - | - | - | - | - |
+| 0.1015 | 54 | 0.8027 | - | - | - | - | - | - |
+| 0.1034 | 55 | 0.5594 | - | - | - | - | - | - |
+| 0.1053 | 56 | 0.973 | - | - | - | - | - | - |
+| 0.1071 | 57 | 0.9 | - | - | - | - | - | - |
+| 0.1090 | 58 | 0.4526 | - | - | - | - | - | - |
+| 0.1109 | 59 | 0.3216 | - | - | - | - | - | - |
+| 0.1128 | 60 | 0.6491 | - | - | - | - | - | - |
+| 0.1147 | 61 | 0.6211 | - | - | - | - | - | - |
+| 0.1165 | 62 | 0.4682 | - | - | - | - | - | - |
+| 0.1184 | 63 | 0.5099 | - | - | - | - | - | - |
+| 0.1203 | 64 | 0.5467 | - | - | - | - | - | - |
+| 0.1222 | 65 | 0.4413 | - | - | - | - | - | - |
+| 0.1241 | 66 | 0.3663 | - | - | - | - | - | - |
+| 0.1259 | 67 | 0.6832 | - | - | - | - | - | - |
+| 0.1278 | 68 | 0.3447 | - | - | - | - | - | - |
+| 0.1297 | 69 | 0.8614 | - | - | - | - | - | - |
+| 0.1316 | 70 | 0.4724 | - | - | - | - | - | - |
+| 0.1335 | 71 | 0.5842 | - | - | - | - | - | - |
+| 0.1353 | 72 | 0.4599 | - | - | - | - | - | - |
+| 0.1372 | 73 | 0.5251 | - | - | - | - | - | - |
+| 0.1391 | 74 | 0.2282 | - | - | - | - | - | - |
+| 0.1410 | 75 | 0.5728 | - | - | - | - | - | - |
+| 0.1429 | 76 | 0.4518 | - | - | - | - | - | - |
+| 0.1447 | 77 | 0.4483 | - | - | - | - | - | - |
+| 0.1466 | 78 | 0.5031 | - | - | - | - | - | - |
+| 0.1485 | 79 | 0.5342 | - | - | - | - | - | - |
+| 0.1504 | 80 | 0.2656 | - | - | - | - | - | - |
+| 0.1523 | 81 | 0.4979 | - | - | - | - | - | - |
+| 0.1541 | 82 | 0.2907 | - | - | - | - | - | - |
+| 0.1560 | 83 | 0.4795 | - | - | - | - | - | - |
+| 0.1579 | 84 | 0.3756 | - | - | - | - | - | - |
+| 0.1598 | 85 | 0.4711 | - | - | - | - | - | - |
+| 0.1617 | 86 | 0.4183 | - | - | - | - | - | - |
+| 0.1635 | 87 | 0.4993 | - | - | - | - | - | - |
+| 0.1654 | 88 | 0.4767 | - | - | - | - | - | - |
+| 0.1673 | 89 | 0.7443 | - | - | - | - | - | - |
+| 0.1692 | 90 | 0.301 | - | - | - | - | - | - |
+| 0.1711 | 91 | 0.2712 | - | - | - | - | - | - |
+| 0.1729 | 92 | 0.4745 | - | - | - | - | - | - |
+| 0.1748 | 93 | 0.3506 | - | - | - | - | - | - |
+| 0.1767 | 94 | 0.5394 | - | - | - | - | - | - |
+| 0.1786 | 95 | 0.2925 | - | - | - | - | - | - |
+| 0.1805 | 96 | 0.2154 | - | - | - | - | - | - |
+| 0.1823 | 97 | 0.468 | - | - | - | - | - | - |
+| 0.1842 | 98 | 0.2269 | - | - | - | - | - | - |
+| 0.1861 | 99 | 0.3967 | - | - | - | - | - | - |
+| 0.1880 | 100 | 0.489 | 1.2233 | 0.9141 | 0.9141 | 0.9141 | 0.9137 | 0.9123 |
+| 0.1898 | 101 | 0.3021 | - | - | - | - | - | - |
+| 0.1917 | 102 | 0.315 | - | - | - | - | - | - |
+| 0.1936 | 103 | 0.664 | - | - | - | - | - | - |
+| 0.1955 | 104 | 0.5144 | - | - | - | - | - | - |
+| 0.1974 | 105 | 0.5137 | - | - | - | - | - | - |
+| 0.1992 | 106 | 0.2783 | - | - | - | - | - | - |
+| 0.2011 | 107 | 0.2859 | - | - | - | - | - | - |
+| 0.2030 | 108 | 0.333 | - | - | - | - | - | - |
+| 0.2049 | 109 | 0.3578 | - | - | - | - | - | - |
+| 0.2068 | 110 | 0.373 | - | - | - | - | - | - |
+| 0.2086 | 111 | 0.3707 | - | - | - | - | - | - |
+| 0.2105 | 112 | 0.2798 | - | - | - | - | - | - |
+| 0.2124 | 113 | 0.3597 | - | - | - | - | - | - |
+| 0.2143 | 114 | 0.43 | - | - | - | - | - | - |
+| 0.2162 | 115 | 0.3277 | - | - | - | - | - | - |
+| 0.2180 | 116 | 0.5529 | - | - | - | - | - | - |
+| 0.2199 | 117 | 0.3227 | - | - | - | - | - | - |
+| 0.2218 | 118 | 0.6035 | - | - | - | - | - | - |
+| 0.2237 | 119 | 0.2348 | - | - | - | - | - | - |
+| 0.2256 | 120 | 0.5626 | - | - | - | - | - | - |
+| 0.2274 | 121 | 0.3628 | - | - | - | - | - | - |
+| 0.2293 | 122 | 0.4222 | - | - | - | - | - | - |
+| 0.2312 | 123 | 0.3231 | - | - | - | - | - | - |
+| 0.2331 | 124 | 0.1875 | - | - | - | - | - | - |
+| 0.2350 | 125 | 0.2226 | - | - | - | - | - | - |
+| 0.2368 | 126 | 0.318 | - | - | - | - | - | - |
+| 0.2387 | 127 | 0.4381 | - | - | - | - | - | - |
+| 0.2406 | 128 | 0.3985 | - | - | - | - | - | - |
+| 0.2425 | 129 | 0.3571 | - | - | - | - | - | - |
+| 0.2444 | 130 | 0.2185 | - | - | - | - | - | - |
+| 0.2462 | 131 | 0.4206 | - | - | - | - | - | - |
+| 0.2481 | 132 | 0.5639 | - | - | - | - | - | - |
+| 0.2500 | 133 | 0.4593 | - | - | - | - | - | - |
+| 0.2519 | 134 | 0.392 | - | - | - | - | - | - |
+| 0.2538 | 135 | 0.4681 | - | - | - | - | - | - |
+| 0.2556 | 136 | 0.2313 | - | - | - | - | - | - |
+| 0.2575 | 137 | 0.2191 | - | - | - | - | - | - |
+| 0.2594 | 138 | 0.405 | - | - | - | - | - | - |
+| 0.2613 | 139 | 0.4579 | - | - | - | - | - | - |
+| 0.2632 | 140 | 0.2927 | - | - | - | - | - | - |
+| 0.2650 | 141 | 0.2333 | - | - | - | - | - | - |
+| 0.2669 | 142 | 0.2328 | - | - | - | - | - | - |
+| 0.2688 | 143 | 0.1589 | - | - | - | - | - | - |
+| 0.2707 | 144 | 0.3064 | - | - | - | - | - | - |
+| 0.2726 | 145 | 0.3051 | - | - | - | - | - | - |
+| 0.2744 | 146 | 0.2781 | - | - | - | - | - | - |
+| 0.2763 | 147 | 0.2371 | - | - | - | - | - | - |
+| 0.2782 | 148 | 0.3233 | - | - | - | - | - | - |
+| 0.2801 | 149 | 0.2306 | - | - | - | - | - | - |
+| 0.2820 | 150 | 0.2543 | 1.1359 | 0.9145 | 0.9145 | 0.9146 | 0.9134 | 0.9124 |
+| 0.2838 | 151 | 0.232 | - | - | - | - | - | - |
+| 0.2857 | 152 | 0.2088 | - | - | - | - | - | - |
+| 0.2876 | 153 | 0.43 | - | - | - | - | - | - |
+| 0.2895 | 154 | 0.2591 | - | - | - | - | - | - |
+| 0.2914 | 155 | 0.374 | - | - | - | - | - | - |
+| 0.2932 | 156 | 0.3955 | - | - | - | - | - | - |
+| 0.2951 | 157 | 0.2377 | - | - | - | - | - | - |
+| 0.2970 | 158 | 0.3472 | - | - | - | - | - | - |
+| 0.2989 | 159 | 0.2649 | - | - | - | - | - | - |
+| 0.3008 | 160 | 0.3457 | - | - | - | - | - | - |
+| 0.3026 | 161 | 0.3089 | - | - | - | - | - | - |
+| 0.3045 | 162 | 0.301 | - | - | - | - | - | - |
+| 0.3064 | 163 | 0.3386 | - | - | - | - | - | - |
+| 0.3083 | 164 | 0.458 | - | - | - | - | - | - |
+| 0.3102 | 165 | 0.3676 | - | - | - | - | - | - |
+| 0.3120 | 166 | 0.5165 | - | - | - | - | - | - |
+| 0.3139 | 167 | 0.2245 | - | - | - | - | - | - |
+| 0.3158 | 168 | 0.2643 | - | - | - | - | - | - |
+| 0.3177 | 169 | 0.4889 | - | - | - | - | - | - |
+| 0.3195 | 170 | 0.2034 | - | - | - | - | - | - |
+| 0.3214 | 171 | 0.4686 | - | - | - | - | - | - |
+| 0.3233 | 172 | 0.2751 | - | - | - | - | - | - |
+| 0.3252 | 173 | 0.3089 | - | - | - | - | - | - |
+| 0.3271 | 174 | 0.2034 | - | - | - | - | - | - |
+| 0.3289 | 175 | 0.4197 | - | - | - | - | - | - |
+| 0.3308 | 176 | 0.2756 | - | - | - | - | - | - |
+| 0.3327 | 177 | 0.2734 | - | - | - | - | - | - |
+| 0.3346 | 178 | 0.169 | - | - | - | - | - | - |
+| 0.3365 | 179 | 0.2378 | - | - | - | - | - | - |
+| 0.3383 | 180 | 0.207 | - | - | - | - | - | - |
+| 0.3402 | 181 | 0.1922 | - | - | - | - | - | - |
+| 0.3421 | 182 | 0.2401 | - | - | - | - | - | - |
+| 0.3440 | 183 | 0.2093 | - | - | - | - | - | - |
+| 0.3459 | 184 | 0.1656 | - | - | - | - | - | - |
+| 0.3477 | 185 | 0.3097 | - | - | - | - | - | - |
+| 0.3496 | 186 | 0.2157 | - | - | - | - | - | - |
+| 0.3515 | 187 | 0.2462 | - | - | - | - | - | - |
+| 0.3534 | 188 | 0.1129 | - | - | - | - | - | - |
+| 0.3553 | 189 | 0.2231 | - | - | - | - | - | - |
+| 0.3571 | 190 | 0.2683 | - | - | - | - | - | - |
+| 0.3590 | 191 | 0.0246 | - | - | - | - | - | - |
+| 0.3609 | 192 | 0.27 | - | - | - | - | - | - |
+| 0.3628 | 193 | 0.3308 | - | - | - | - | - | - |
+| 0.3647 | 194 | 0.28 | - | - | - | - | - | - |
+| 0.3665 | 195 | 0.3338 | - | - | - | - | - | - |
+| 0.3684 | 196 | 0.1966 | - | - | - | - | - | - |
+| 0.3703 | 197 | 0.1798 | - | - | - | - | - | - |
+| 0.3722 | 198 | 0.2979 | - | - | - | - | - | - |
+| 0.3741 | 199 | 0.3221 | - | - | - | - | - | - |
+| 0.3759 | 200 | 0.6034 | 1.0839 | 0.9159 | 0.9159 | 0.9159 | 0.9146 | 0.9138 |
+| 0.3778 | 201 | 0.2707 | - | - | - | - | - | - |
+| 0.3797 | 202 | 0.288 | - | - | - | - | - | - |
+| 0.3816 | 203 | 0.2101 | - | - | - | - | - | - |
+| 0.3835 | 204 | 0.4055 | - | - | - | - | - | - |
+| 0.3853 | 205 | 0.3662 | - | - | - | - | - | - |
+| 0.3872 | 206 | 0.2623 | - | - | - | - | - | - |
+| 0.3891 | 207 | 0.1804 | - | - | - | - | - | - |
+| 0.3910 | 208 | 0.21 | - | - | - | - | - | - |
+| 0.3929 | 209 | 0.5188 | - | - | - | - | - | - |
+| 0.3947 | 210 | 0.2961 | - | - | - | - | - | - |
+| 0.3966 | 211 | 0.212 | - | - | - | - | - | - |
+| 0.3985 | 212 | 0.2593 | - | - | - | - | - | - |
+| 0.4004 | 213 | 0.2851 | - | - | - | - | - | - |
+| 0.4023 | 214 | 0.21 | - | - | - | - | - | - |
+| 0.4041 | 215 | 0.206 | - | - | - | - | - | - |
+| 0.4060 | 216 | 0.4391 | - | - | - | - | - | - |
+| 0.4079 | 217 | 0.2652 | - | - | - | - | - | - |
+| 0.4098 | 218 | 0.073 | - | - | - | - | - | - |
+| 0.4117 | 219 | 0.4636 | - | - | - | - | - | - |
+| 0.4135 | 220 | 0.4002 | - | - | - | - | - | - |
+| 0.4154 | 221 | 0.3869 | - | - | - | - | - | - |
+| 0.4173 | 222 | 0.2313 | - | - | - | - | - | - |
+| 0.4192 | 223 | 0.177 | - | - | - | - | - | - |
+| 0.4211 | 224 | 0.2246 | - | - | - | - | - | - |
+| 0.4229 | 225 | 0.2082 | - | - | - | - | - | - |
+| 0.4248 | 226 | 0.3497 | - | - | - | - | - | - |
+| 0.4267 | 227 | 0.1367 | - | - | - | - | - | - |
+| 0.4286 | 228 | 0.2292 | - | - | - | - | - | - |
+| 0.4305 | 229 | 0.1934 | - | - | - | - | - | - |
+| 0.4323 | 230 | 0.1817 | - | - | - | - | - | - |
+| 0.4342 | 231 | 0.2364 | - | - | - | - | - | - |
+| 0.4361 | 232 | 0.1361 | - | - | - | - | - | - |
+| 0.4380 | 233 | 0.2478 | - | - | - | - | - | - |
+| 0.4398 | 234 | 0.3088 | - | - | - | - | - | - |
+| 0.4417 | 235 | 0.2762 | - | - | - | - | - | - |
+| 0.4436 | 236 | 0.1596 | - | - | - | - | - | - |
+| 0.4455 | 237 | 0.4028 | - | - | - | - | - | - |
+| 0.4474 | 238 | 0.2385 | - | - | - | - | - | - |
+| 0.4492 | 239 | 0.1096 | - | - | - | - | - | - |
+| 0.4511 | 240 | 0.2783 | - | - | - | - | - | - |
+| 0.4530 | 241 | 0.2536 | - | - | - | - | - | - |
+| 0.4549 | 242 | 0.132 | - | - | - | - | - | - |
+| 0.4568 | 243 | 0.1748 | - | - | - | - | - | - |
+| 0.4586 | 244 | 0.0997 | - | - | - | - | - | - |
+| 0.4605 | 245 | 0.2786 | - | - | - | - | - | - |
+| 0.4624 | 246 | 0.2071 | - | - | - | - | - | - |
+| 0.4643 | 247 | 0.1845 | - | - | - | - | - | - |
+| 0.4662 | 248 | 0.1302 | - | - | - | - | - | - |
+| 0.4680 | 249 | 0.3023 | - | - | - | - | - | - |
+| 0.4699 | 250 | 0.1952 | 1.0790 | 0.9154 | 0.9154 | 0.9154 | 0.9141 | 0.9136 |
+| 0.4718 | 251 | 0.2147 | - | - | - | - | - | - |
+| 0.4737 | 252 | 0.2907 | - | - | - | - | - | - |
+| 0.4756 | 253 | 0.204 | - | - | - | - | - | - |
+| 0.4774 | 254 | 0.2603 | - | - | - | - | - | - |
+| 0.4793 | 255 | 0.2308 | - | - | - | - | - | - |
+| 0.4812 | 256 | 0.173 | - | - | - | - | - | - |
+| 0.4831 | 257 | 0.2796 | - | - | - | - | - | - |
+| 0.4850 | 258 | 0.1085 | - | - | - | - | - | - |
+| 0.4868 | 259 | 0.2431 | - | - | - | - | - | - |
+| 0.4887 | 260 | 0.2521 | - | - | - | - | - | - |
+| 0.4906 | 261 | 0.3279 | - | - | - | - | - | - |
+| 0.4925 | 262 | 0.3679 | - | - | - | - | - | - |
+| 0.4944 | 263 | 0.1284 | - | - | - | - | - | - |
+| 0.4962 | 264 | 0.3286 | - | - | - | - | - | - |
+| 0.4981 | 265 | 0.3751 | - | - | - | - | - | - |
+| 0.5000 | 266 | 0.3392 | - | - | - | - | - | - |
+| 0.5019 | 267 | 0.1515 | - | - | - | - | - | - |
+| 0.5038 | 268 | 0.2974 | - | - | - | - | - | - |
+| 0.5056 | 269 | 0.2106 | - | - | - | - | - | - |
+| 0.5075 | 270 | 0.1307 | - | - | - | - | - | - |
+| 0.5094 | 271 | 0.3075 | - | - | - | - | - | - |
+| 0.5113 | 272 | 0.3512 | - | - | - | - | - | - |
+| 0.5132 | 273 | 0.1349 | - | - | - | - | - | - |
+| 0.5150 | 274 | 0.1833 | - | - | - | - | - | - |
+| 0.5169 | 275 | 0.2363 | - | - | - | - | - | - |
+| 0.5188 | 276 | 0.3437 | - | - | - | - | - | - |
+| 0.5207 | 277 | 0.2152 | - | - | - | - | - | - |
+| 0.5226 | 278 | 0.2306 | - | - | - | - | - | - |
+| 0.5244 | 279 | 0.1523 | - | - | - | - | - | - |
+| 0.5263 | 280 | 0.2025 | - | - | - | - | - | - |
+| 0.5282 | 281 | 0.2563 | - | - | - | - | - | - |
+| 0.5301 | 282 | 0.1861 | - | - | - | - | - | - |
+| 0.5320 | 283 | 0.1602 | - | - | - | - | - | - |
+| 0.5338 | 284 | 0.2251 | - | - | - | - | - | - |
+| 0.5357 | 285 | 0.2004 | - | - | - | - | - | - |
+| 0.5376 | 286 | 0.2024 | - | - | - | - | - | - |
+| 0.5395 | 287 | 0.1639 | - | - | - | - | - | - |
+| 0.5414 | 288 | 0.205 | - | - | - | - | - | - |
+| 0.5432 | 289 | 0.2216 | - | - | - | - | - | - |
+| 0.5451 | 290 | 0.2815 | - | - | - | - | - | - |
+| 0.5470 | 291 | 0.2416 | - | - | - | - | - | - |
+| 0.5489 | 292 | 0.3183 | - | - | - | - | - | - |
+| 0.5508 | 293 | 0.3881 | - | - | - | - | - | - |
+| 0.5526 | 294 | 0.1166 | - | - | - | - | - | - |
+| 0.5545 | 295 | 0.1939 | - | - | - | - | - | - |
+| 0.5564 | 296 | 0.1113 | - | - | - | - | - | - |
+| 0.5583 | 297 | 0.2423 | - | - | - | - | - | - |
+| 0.5602 | 298 | 0.2569 | - | - | - | - | - | - |
+| 0.5620 | 299 | 0.3817 | - | - | - | - | - | - |
+| 0.5639 | 300 | 0.1794 | 1.0347 | 0.9162 | 0.9162 | 0.9162 | 0.9149 | 0.9144 |
+| 0.5658 | 301 | 0.207 | - | - | - | - | - | - |
+| 0.5677 | 302 | 0.28 | - | - | - | - | - | - |
+| 0.5695 | 303 | 0.2256 | - | - | - | - | - | - |
+| 0.5714 | 304 | 0.1659 | - | - | - | - | - | - |
+| 0.5733 | 305 | 0.1587 | - | - | - | - | - | - |
+| 0.5752 | 306 | 0.4479 | - | - | - | - | - | - |
+| 0.5771 | 307 | 0.1649 | - | - | - | - | - | - |
+| 0.5789 | 308 | 0.402 | - | - | - | - | - | - |
+| 0.5808 | 309 | 0.3003 | - | - | - | - | - | - |
+| 0.5827 | 310 | 0.1697 | - | - | - | - | - | - |
+| 0.5846 | 311 | 0.1789 | - | - | - | - | - | - |
+| 0.5865 | 312 | 0.3012 | - | - | - | - | - | - |
+| 0.5883 | 313 | 0.1306 | - | - | - | - | - | - |
+| 0.5902 | 314 | 0.2429 | - | - | - | - | - | - |
+| 0.5921 | 315 | 0.2456 | - | - | - | - | - | - |
+| 0.5940 | 316 | 0.2612 | - | - | - | - | - | - |
+| 0.5959 | 317 | 0.071 | - | - | - | - | - | - |
+| 0.5977 | 318 | 0.1342 | - | - | - | - | - | - |
+| 0.5996 | 319 | 0.1107 | - | - | - | - | - | - |
+| 0.6015 | 320 | 0.1375 | - | - | - | - | - | - |
+| 0.6034 | 321 | 0.1394 | - | - | - | - | - | - |
+| 0.6053 | 322 | 0.2689 | - | - | - | - | - | - |
+| 0.6071 | 323 | 0.2019 | - | - | - | - | - | - |
+| 0.6090 | 324 | 0.247 | - | - | - | - | - | - |
+| 0.6109 | 325 | 0.0957 | - | - | - | - | - | - |
+| 0.6128 | 326 | 0.2257 | - | - | - | - | - | - |
+| 0.6147 | 327 | 0.2134 | - | - | - | - | - | - |
+| 0.6165 | 328 | 0.2157 | - | - | - | - | - | - |
+| 0.6184 | 329 | 0.2729 | - | - | - | - | - | - |
+| 0.6203 | 330 | 0.1582 | - | - | - | - | - | - |
+| 0.6222 | 331 | 0.1599 | - | - | - | - | - | - |
+| 0.6241 | 332 | 0.216 | - | - | - | - | - | - |
+| 0.6259 | 333 | 0.1367 | - | - | - | - | - | - |
+| 0.6278 | 334 | 0.2675 | - | - | - | - | - | - |
+| 0.6297 | 335 | 0.3074 | - | - | - | - | - | - |
+| 0.6316 | 336 | 0.1689 | - | - | - | - | - | - |
+| 0.6335 | 337 | 0.2549 | - | - | - | - | - | - |
+| 0.6353 | 338 | 0.1448 | - | - | - | - | - | - |
+| 0.6372 | 339 | 0.2533 | - | - | - | - | - | - |
+| 0.6391 | 340 | 0.3232 | - | - | - | - | - | - |
+| 0.6410 | 341 | 0.1825 | - | - | - | - | - | - |
+| 0.6429 | 342 | 0.2873 | - | - | - | - | - | - |
+| 0.6447 | 343 | 0.2546 | - | - | - | - | - | - |
+| 0.6466 | 344 | 0.2048 | - | - | - | - | - | - |
+| 0.6485 | 345 | 0.2674 | - | - | - | - | - | - |
+| 0.6504 | 346 | 0.1629 | - | - | - | - | - | - |
+| 0.6523 | 347 | 0.1747 | - | - | - | - | - | - |
+| 0.6541 | 348 | 0.1784 | - | - | - | - | - | - |
+| 0.6560 | 349 | 0.2269 | - | - | - | - | - | - |
+| 0.6579 | 350 | 0.4473 | 1.0552 | 0.9181 | 0.9181 | 0.9181 | 0.9173 | 0.9164 |
+| 0.6598 | 351 | 0.1349 | - | - | - | - | - | - |
+| 0.6617 | 352 | 0.2307 | - | - | - | - | - | - |
+| 0.6635 | 353 | 0.3436 | - | - | - | - | - | - |
+| 0.6654 | 354 | 0.4285 | - | - | - | - | - | - |
+| 0.6673 | 355 | 0.2067 | - | - | - | - | - | - |
+| 0.6692 | 356 | 0.3689 | - | - | - | - | - | - |
+| 0.6711 | 357 | 0.267 | - | - | - | - | - | - |
+| 0.6729 | 358 | 0.0947 | - | - | - | - | - | - |
+| 0.6748 | 359 | 0.1395 | - | - | - | - | - | - |
+| 0.6767 | 360 | 0.0728 | - | - | - | - | - | - |
+| 0.6786 | 361 | 0.3466 | - | - | - | - | - | - |
+| 0.6805 | 362 | 0.118 | - | - | - | - | - | - |
+| 0.6823 | 363 | 0.2302 | - | - | - | - | - | - |
+| 0.6842 | 364 | 0.1604 | - | - | - | - | - | - |
+| 0.6861 | 365 | 0.2416 | - | - | - | - | - | - |
+| 0.6880 | 366 | 0.3026 | - | - | - | - | - | - |
+| 0.6898 | 367 | 0.205 | - | - | - | - | - | - |
+| 0.6917 | 368 | 0.2291 | - | - | - | - | - | - |
+| 0.6936 | 369 | 0.3908 | - | - | - | - | - | - |
+| 0.6955 | 370 | 0.2343 | - | - | - | - | - | - |
+| 0.6974 | 371 | 0.2384 | - | - | - | - | - | - |
+| 0.6992 | 372 | 0.304 | - | - | - | - | - | - |
+| 0.7011 | 373 | 0.1508 | - | - | - | - | - | - |
+| 0.7030 | 374 | 0.1184 | - | - | - | - | - | - |
+| 0.7049 | 375 | 0.2863 | - | - | - | - | - | - |
+| 0.7068 | 376 | 0.243 | - | - | - | - | - | - |
+| 0.7086 | 377 | 0.2347 | - | - | - | - | - | - |
+| 0.7105 | 378 | 0.2225 | - | - | - | - | - | - |
+| 0.7124 | 379 | 0.1221 | - | - | - | - | - | - |
+| 0.7143 | 380 | 0.0915 | - | - | - | - | - | - |
+| 0.7162 | 381 | 0.2929 | - | - | - | - | - | - |
+| 0.7180 | 382 | 0.1045 | - | - | - | - | - | - |
+| 0.7199 | 383 | 0.2764 | - | - | - | - | - | - |
+| 0.7218 | 384 | 0.1787 | - | - | - | - | - | - |
+| 0.7237 | 385 | 0.3038 | - | - | - | - | - | - |
+| 0.7256 | 386 | 0.1276 | - | - | - | - | - | - |
+| 0.7274 | 387 | 0.318 | - | - | - | - | - | - |
+| 0.7293 | 388 | 0.1114 | - | - | - | - | - | - |
+| 0.7312 | 389 | 0.0779 | - | - | - | - | - | - |
+| 0.7331 | 390 | 0.1246 | - | - | - | - | - | - |
+| 0.7350 | 391 | 0.1865 | - | - | - | - | - | - |
+| 0.7368 | 392 | 0.1603 | - | - | - | - | - | - |
+| 0.7387 | 393 | 0.3891 | - | - | - | - | - | - |
+| 0.7406 | 394 | 0.0831 | - | - | - | - | - | - |
+| 0.7425 | 395 | 0.2145 | - | - | - | - | - | - |
+| 0.7444 | 396 | 0.1798 | - | - | - | - | - | - |
+| 0.7462 | 397 | 0.2372 | - | - | - | - | - | - |
+| 0.7481 | 398 | 0.2344 | - | - | - | - | - | - |
+| 0.7500 | 399 | 0.1169 | - | - | - | - | - | - |
+| 0.7519 | 400 | 0.1729 | 1.0392 | 0.9174 | 0.9175 | 0.9174 | 0.9163 | 0.9151 |
+| 0.7538 | 401 | 0.2767 | - | - | - | - | - | - |
+| 0.7556 | 402 | 0.0738 | - | - | - | - | - | - |
+| 0.7575 | 403 | 0.2413 | - | - | - | - | - | - |
+| 0.7594 | 404 | 0.2307 | - | - | - | - | - | - |
+| 0.7613 | 405 | 0.2238 | - | - | - | - | - | - |
+| 0.7632 | 406 | 0.264 | - | - | - | - | - | - |
+| 0.7650 | 407 | 0.2212 | - | - | - | - | - | - |
+| 0.7669 | 408 | 0.1936 | - | - | - | - | - | - |
+| 0.7688 | 409 | 0.0843 | - | - | - | - | - | - |
+| 0.7707 | 410 | 0.1398 | - | - | - | - | - | - |
+| 0.7726 | 411 | 0.2536 | - | - | - | - | - | - |
+| 0.7744 | 412 | 0.2524 | - | - | - | - | - | - |
+| 0.7763 | 413 | 0.0817 | - | - | - | - | - | - |
+| 0.7782 | 414 | 0.187 | - | - | - | - | - | - |
+| 0.7801 | 415 | 0.2202 | - | - | - | - | - | - |
+| 0.7820 | 416 | 0.4688 | - | - | - | - | - | - |
+| 0.7838 | 417 | 0.2748 | - | - | - | - | - | - |
+| 0.7857 | 418 | 0.1784 | - | - | - | - | - | - |
+| 0.7876 | 419 | 0.181 | - | - | - | - | - | - |
+| 0.7895 | 420 | 0.3211 | - | - | - | - | - | - |
+| 0.7914 | 421 | 0.1609 | - | - | - | - | - | - |
+| 0.7932 | 422 | 0.1783 | - | - | - | - | - | - |
+| 0.7951 | 423 | 0.2027 | - | - | - | - | - | - |
+| 0.7970 | 424 | 0.3005 | - | - | - | - | - | - |
+| 0.7989 | 425 | 0.0396 | - | - | - | - | - | - |
+| 0.8008 | 426 | 0.0633 | - | - | - | - | - | - |
+| 0.8026 | 427 | 0.2468 | - | - | - | - | - | - |
+| 0.8045 | 428 | 0.1822 | - | - | - | - | - | - |
+| 0.8064 | 429 | 0.4503 | - | - | - | - | - | - |
+| 0.8083 | 430 | 0.0755 | - | - | - | - | - | - |
+| 0.8102 | 431 | 0.1746 | - | - | - | - | - | - |
+| 0.8120 | 432 | 0.1353 | - | - | - | - | - | - |
+| 0.8139 | 433 | 0.0427 | - | - | - | - | - | - |
+| 0.8158 | 434 | 0.2745 | - | - | - | - | - | - |
+| 0.8177 | 435 | 0.1701 | - | - | - | - | - | - |
+| 0.8195 | 436 | 0.1108 | - | - | - | - | - | - |
+| 0.8214 | 437 | 0.1247 | - | - | - | - | - | - |
+| 0.8233 | 438 | 0.2483 | - | - | - | - | - | - |
+| 0.8252 | 439 | 0.2491 | - | - | - | - | - | - |
+| 0.8271 | 440 | 0.2228 | - | - | - | - | - | - |
+| 0.8289 | 441 | 0.339 | - | - | - | - | - | - |
+| 0.8308 | 442 | 0.2636 | - | - | - | - | - | - |
+| 0.8327 | 443 | 0.1255 | - | - | - | - | - | - |
+| 0.8346 | 444 | 0.2707 | - | - | - | - | - | - |
+| 0.8365 | 445 | 0.0358 | - | - | - | - | - | - |
+| 0.8383 | 446 | 0.1194 | - | - | - | - | - | - |
+| 0.8402 | 447 | 0.2849 | - | - | - | - | - | - |
+| 0.8421 | 448 | 0.1339 | - | - | - | - | - | - |
+| 0.8440 | 449 | 0.2603 | - | - | - | - | - | - |
+| 0.8459 | 450 | 0.108 | 1.0451 | 0.9167 | 0.9167 | 0.9166 | 0.9163 | 0.9174 |
+| 0.8477 | 451 | 0.1248 | - | - | - | - | - | - |
+| 0.8496 | 452 | 0.1983 | - | - | - | - | - | - |
+| 0.8515 | 453 | 0.2077 | - | - | - | - | - | - |
+| 0.8534 | 454 | 0.2199 | - | - | - | - | - | - |
+| 0.8553 | 455 | 0.0839 | - | - | - | - | - | - |
+| 0.8571 | 456 | 0.2924 | - | - | - | - | - | - |
+| 0.8590 | 457 | 0.1466 | - | - | - | - | - | - |
+| 0.8609 | 458 | 0.3597 | - | - | - | - | - | - |
+| 0.8628 | 459 | 0.1387 | - | - | - | - | - | - |
+| 0.8647 | 460 | 0.1788 | - | - | - | - | - | - |
+| 0.8665 | 461 | 0.2746 | - | - | - | - | - | - |
+| 0.8684 | 462 | 0.2969 | - | - | - | - | - | - |
+| 0.8703 | 463 | 0.2054 | - | - | - | - | - | - |
+| 0.8722 | 464 | 0.2496 | - | - | - | - | - | - |
+| 0.8741 | 465 | 0.2611 | - | - | - | - | - | - |
+| 0.8759 | 466 | 0.1439 | - | - | - | - | - | - |
+| 0.8778 | 467 | 0.1146 | - | - | - | - | - | - |
+| 0.8797 | 468 | 0.1646 | - | - | - | - | - | - |
+| 0.8816 | 469 | 0.1293 | - | - | - | - | - | - |
+| 0.8835 | 470 | 0.3097 | - | - | - | - | - | - |
+| 0.8853 | 471 | 0.2038 | - | - | - | - | - | - |
+| 0.8872 | 472 | 0.2284 | - | - | - | - | - | - |
+| 0.8891 | 473 | 0.3448 | - | - | - | - | - | - |
+| 0.8910 | 474 | 0.2148 | - | - | - | - | - | - |
+| 0.8929 | 475 | 0.2807 | - | - | - | - | - | - |
+| 0.8947 | 476 | 0.29 | - | - | - | - | - | - |
+| 0.8966 | 477 | 0.2555 | - | - | - | - | - | - |
+| 0.8985 | 478 | 0.2942 | - | - | - | - | - | - |
+| 0.9004 | 479 | 0.1309 | - | - | - | - | - | - |
+| 0.9023 | 480 | 0.1965 | - | - | - | - | - | - |
+| 0.9041 | 481 | 0.0971 | - | - | - | - | - | - |
+| 0.9060 | 482 | 0.2923 | - | - | - | - | - | - |
+| 0.9079 | 483 | 0.2019 | - | - | - | - | - | - |
+| 0.9098 | 484 | 0.1065 | - | - | - | - | - | - |
+| 0.9117 | 485 | 0.212 | - | - | - | - | - | - |
+| 0.9135 | 486 | 0.3035 | - | - | - | - | - | - |
+| 0.9154 | 487 | 0.2386 | - | - | - | - | - | - |
+| 0.9173 | 488 | 0.1342 | - | - | - | - | - | - |
+| 0.9192 | 489 | 0.1798 | - | - | - | - | - | - |
+| 0.9211 | 490 | 0.2655 | - | - | - | - | - | - |
+| 0.9229 | 491 | 0.155 | - | - | - | - | - | - |
+| 0.9248 | 492 | 0.2283 | - | - | - | - | - | - |
+| 0.9267 | 493 | 0.098 | - | - | - | - | - | - |
+| 0.9286 | 494 | 0.2384 | - | - | - | - | - | - |
+| 0.9305 | 495 | 0.0843 | - | - | - | - | - | - |
+| 0.9323 | 496 | 0.0996 | - | - | - | - | - | - |
+| 0.9342 | 497 | 0.2008 | - | - | - | - | - | - |
+| 0.9361 | 498 | 0.2017 | - | - | - | - | - | - |
+| 0.9380 | 499 | 0.215 | - | - | - | - | - | - |
+| 0.9398 | 500 | 0.2233 | 1.0156 | 0.9182 | 0.9182 | 0.9182 | 0.9173 | 0.9181 |
+| 0.9417 | 501 | 0.3899 | - | - | - | - | - | - |
+| 0.9436 | 502 | 0.1012 | - | - | - | - | - | - |
+| 0.9455 | 503 | 0.4322 | - | - | - | - | - | - |
+| 0.9474 | 504 | 0.2699 | - | - | - | - | - | - |
+| 0.9492 | 505 | 0.3275 | - | - | - | - | - | - |
+| 0.9511 | 506 | 0.2196 | - | - | - | - | - | - |
+| 0.9530 | 507 | 0.1193 | - | - | - | - | - | - |
+| 0.9549 | 508 | 0.0748 | - | - | - | - | - | - |
+| 0.9568 | 509 | 0.2532 | - | - | - | - | - | - |
+| 0.9586 | 510 | 0.2517 | - | - | - | - | - | - |
+| 0.9605 | 511 | 0.1423 | - | - | - | - | - | - |
+| 0.9624 | 512 | 0.2196 | - | - | - | - | - | - |
+| 0.9643 | 513 | 0.177 | - | - | - | - | - | - |
+| 0.9662 | 514 | 0.3111 | - | - | - | - | - | - |
+| 0.9680 | 515 | 0.1433 | - | - | - | - | - | - |
+| 0.9699 | 516 | 0.279 | - | - | - | - | - | - |
+| 0.9718 | 517 | 0.1455 | - | - | - | - | - | - |
+| 0.9737 | 518 | 0.135 | - | - | - | - | - | - |
+| 0.9756 | 519 | 0.2181 | - | - | - | - | - | - |
+| 0.9774 | 520 | 0.1378 | - | - | - | - | - | - |
+| 0.9793 | 521 | 0.207 | - | - | - | - | - | - |
+| 0.9812 | 522 | 0.1857 | - | - | - | - | - | - |
+| 0.9831 | 523 | 0.2228 | - | - | - | - | - | - |
+| 0.9850 | 524 | 0.0977 | - | - | - | - | - | - |
+| 0.9868 | 525 | 0.0472 | - | - | - | - | - | - |
+| 0.9887 | 526 | 0.4102 | - | - | - | - | - | - |
+| 0.9906 | 527 | 0.2662 | - | - | - | - | - | - |
+
+
+
+### Framework Versions
+- Python: 3.11.11
+- Sentence Transformers: 3.4.1
+- Transformers: 4.51.1
+- PyTorch: 2.5.1+cu124
+- Accelerate: 1.3.0
+- Datasets: 3.5.0
+- Tokenizers: 0.21.0
+
+## Citation
+
+### BibTeX
+
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+}
+```
+
+#### MatryoshkaLoss
+```bibtex
+@misc{kusupati2024matryoshka,
+ title={Matryoshka Representation Learning},
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+ year={2024},
+ eprint={2205.13147},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG}
+}
+```
+
+
+
+
+
+
\ No newline at end of file