---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:30000
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-base-en-v1.5
widget:
- source_sentence: I need to plan a vacation from June 1st to June 10th, 2023 in the
United States. Can you provide me with a list of non-working days during this
period? Additionally, analyze the availability of events and obtain the details
of the first event on June 3rd. Also, check the responses for this event.
sentences:
- "def diablo4_smartable_getsponsorships:\n\t\"\"\"\n\tDescription:\n\tGet Diablo\
\ 4 sponsorships.\n\n\tArguments:\n\t---------\n\t\"\"\""
- "def YTStream_-_Download_YouTube_Videos.Download/Stream:\n\t\"\"\"\n\tDescription:\n\
\tStream or download info.\n\n\tArguments:\n\t---------\n\t- id : STRING (required)\n\
\t Description: Youtube Video Id.\n\t Default: UxxajLWwzqY\n\t\"\"\""
- "def 31Events_-_Send_Native_Calendar_Invites.EventGet:\n\t\"\"\"\n\t\n\n\tArguments:\n\
\t---------\n\t- event_id : STRING (required)\n\t Description: Event ID\n\t \
\ Default: 1\n\t\"\"\""
- source_sentence: I'm a student working on a research project about climate change.
Help me find some scientific articles and discussions on Reddit related to climate
change. Provide me with the top comments so that I can understand different perspectives.
Additionally, suggest some popular posts and their details that I can reference
in my project.
sentences:
- "def reddit_posts_by_username:\n\t\"\"\"\n\tDescription:\n\tPosts By Username\n\
\n\tArguments:\n\t---------\n\t- username : STRING (required)\n\t Default: GoldenChrysus\n\
\t- sort : STRING (required)\n\t Description: you can just send `new `or `hot`\n\
\t Default: new\n\t\"\"\""
- "def microsoft_translator_text_languages:\n\t\"\"\"\n\tDescription:\n\tGets the\
\ set of languages currently supported by other operations of the Translator Text\
\ API.\n\n\tArguments:\n\t---------\n\t- api-version : STRING (required)\n\t \
\ Description: Version of the API requested by the client. Value must be **3.0**.\n\
\t Default: 3.0\n\t\"\"\""
- "def socialgrep_post_search:\n\t\"\"\"\n\tDescription:\n\tSearches Reddit posts.\n\
\n\tArguments:\n\t---------\n\t- query : STRING (required)\n\t Description: The\
\ comma-separated query for the search. Supports the following term types:\n\t\
\n\t`site:{site_name}` - search only posts where the domain matches {site_name}.\n\
\t\n\t`-site:{site_name}` - search only posts where the domain does not match\
\ {site_name}.\n\t\n\t`/r/{subreddit}` - search only posts from the subreddit\
\ {subreddit}.\n\t\n\t`-/r/{subreddit}` - search only posts not from the subreddit\
\ {subreddit}.\n\t\n\t`{term}` - search only posts with titles containing the\
\ term {term}.\n\t\n\t`-{term}` - search only posts with titles not containing\
\ the term {term}.\n\t\n\t`score:{score}` - search only posts with score at least\
\ {score}.\n\t\n\t`before:{YYYY-mm-dd}`, `after:{YYYY-mm-dd}` - search only posts\
\ within the date range. `before` is inclusive, `after` is not.\n\t Default:\
\ /r/funny,cat\n\t\"\"\""
- source_sentence: I'm planning a weekend getaway with my friends and we want to watch
a football match. Can you provide me with the list of fixture IDs for the matches
scheduled for next month? Also, show me the league table and stats for the home
team of the match with ID 81930. Additionally, scrape the contacts from the website
of the home team to get their email and social media profiles.
sentences:
- "def coinranking_get_coin_exchanges:\n\t\"\"\"\n\tDescription:\n\tFind exchanges\
\ where a specific coin can be traded.\n\tThis endpoint requires the **ultra**\
\ plan or higher.\n\n\tArguments:\n\t---------\n\t- uuid : string (required)\n\
\t Description: UUID of the coin you want to request exchanges for\n\t Default:\
\ Qwsogvtv82FCd\n\t\"\"\""
- "def football_prediction_get_list_of_fixture_ids:\n\t\"\"\"\n\tDescription:\n\t\
Returns a list of fixture IDs that can be used to make requests to endpoints expecting\
\ a ID url parameter.\n\tCan be filtered by:\n\t\n\t- iso_date\n\t- market\n\t\
- federation\n\n\tArguments:\n\t---------\n\t\"\"\""
- "def open_brewery_db_breweries:\n\t\"\"\"\n\tDescription:\n\tList of Breweries\n\
\n\tArguments:\n\t---------\n\t\"\"\""
- source_sentence: I'm organizing a movie marathon and I need a mix of genres. Can
you recommend highly rated movies from various genres available on streaming services
like Netflix, Prime Video, Hulu, and Peacock? Additionally, provide me with the
TV schedule for tonight's movies.
sentences:
- "def streaming_availability_genres_free:\n\t\"\"\"\n\tDescription:\n\tGet the\
\ id to name mapping of supported genres.\n\n\tArguments:\n\t---------\n\t\"\"\
\""
- "def solcast_simple_pv_power_forecast:\n\t\"\"\"\n\tDescription:\n\tThe simple\
\ PV power request returns a first-guess PV power output forecast, based on your\
\ specified latitude and longitude plus some basic PV system characteristics.\n\
\n\tArguments:\n\t---------\n\t- capacity : NUMBER (required)\n\t Description:\
\ The capacity of the system, in Watts.\n\t Default: 0\n\t- latitude : NUMBER\
\ (required)\n\t Description: Latitude\n\t- longitude : NUMBER (required)\n\t\
\ Description: Longitude\n\t\"\"\""
- "def kargom_nerede_companies:\n\t\"\"\"\n\tDescription:\n\tCompanies\n\n\tArguments:\n\
\t---------\n\t\"\"\""
- source_sentence: I'm a food blogger and I need some interesting facts for my next
article. Fetch a random fact about a specific number and provide a historical
fact about a famous year. Additionally, recommend a genre of music to set the
mood for writing.
sentences:
- "def dicolink_get_lexical_field:\n\t\"\"\"\n\tDescription:\n\tGet Lexical Field\
\ for a word\n\n\tArguments:\n\t---------\n\t- mot : string (required)\n\t Default:\
\ cheval\n\t\"\"\""
- "def geodb_cities_places_near_location:\n\t\"\"\"\n\tDescription:\n\tGet places\
\ near the given location, filtering by optional criteria.\n\n\tArguments:\n\t\
---------\n\t- radius : STRING (required)\n\t Description: The location radius\
\ within which to find places\n\t Default: 100\n\t- locationid : STRING (required)\n\
\t Description: Only cities near this location. Latitude/longitude in ISO-6709\
\ format: ±DD.DDDD±DDD.DDDD\n\t Default: 33.832213-118.387099\n\t\"\"\""
- "def deezer_genre:\n\t\"\"\"\n\tDescription:\n\tA genre object\n\n\tArguments:\n\
\t---------\n\t- id : STRING (required)\n\t Description: The editorial's Deezer\
\ id\n\t\"\"\""
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@1
- cosine_ndcg@3
- cosine_ndcg@5
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on BAAI/bge-base-en-v1.5
results:
- task:
type: device-aware-information-retrieval
name: Device Aware Information Retrieval
dataset:
name: dev
type: dev
metrics:
- type: cosine_accuracy@1
value: 0.66
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.82
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.88
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.95
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.66
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.4833333333333333
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.3600000000000001
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.21800000000000005
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.25066666666666665
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.5509999999999999
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.6613333333333332
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.7801666666666667
name: Cosine Recall@10
- type: cosine_ndcg@1
value: 0.66
name: Cosine Ndcg@1
- type: cosine_ndcg@3
value: 0.582592770063282
name: Cosine Ndcg@3
- type: cosine_ndcg@5
value: 0.633788337516139
name: Cosine Ndcg@5
- type: cosine_ndcg@10
value: 0.6889055410848939
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7467063492063493
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.629168458376448
name: Cosine Map@100
---
# SentenceTransformer based on BAAI/bge-base-en-v1.5
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
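The modules above amount to CLS-token pooling followed by L2 normalization. For reference, here is a minimal raw-Transformers sketch of the same forward pass, assuming the repository's weights load directly with `AutoModel` (typical for Sentence Transformers checkpoints):
```python
import torch
from transformers import AutoModel, AutoTokenizer

repo = "LorMolf/mnrl-toolbench-bge-base-en-v1.5"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

inputs = tokenizer(
    ["Search Reddit for climate change discussions."],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# CLS pooling: take the hidden state of the first ([CLS]) token
embeddings = outputs.last_hidden_state[:, 0]
# Normalize(): L2-normalize so dot products equal cosine similarities
embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```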
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("LorMolf/mnrl-toolbench-bge-base-en-v1.5")
# Run inference
sentences = [
"I'm a food blogger and I need some interesting facts for my next article. Fetch a random fact about a specific number and provide a historical fact about a famous year. Additionally, recommend a genre of music to set the mood for writing.",
'def deezer_genre:\n\t"""\n\tDescription:\n\tA genre object\n\n\tArguments:\n\t---------\n\t- id : STRING (required)\n\t Description: The editorial\'s Deezer id\n\t"""',
'def dicolink_get_lexical_field:\n\t"""\n\tDescription:\n\tGet Lexical Field for a word\n\n\tArguments:\n\t---------\n\t- mot : string (required)\n\t Default: cheval\n\t"""',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
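Because the model was trained to match user requests against tool docstrings (see the widget examples above), a natural application is tool retrieval: embed the request and every candidate docstring, then rank candidates by cosine similarity. A minimal sketch with an illustrative candidate list:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LorMolf/mnrl-toolbench-bge-base-en-v1.5")

query = "Search Reddit for climate change discussions and show me the top posts."
tools = [
    'def socialgrep_post_search:\n\t"""Searches Reddit posts."""',
    'def deezer_genre:\n\t"""A genre object"""',
    'def open_brewery_db_breweries:\n\t"""List of Breweries"""',
]

query_emb = model.encode([query])
tool_embs = model.encode(tools)

# model.similarity applies the model's similarity function (cosine here)
scores = model.similarity(query_emb, tool_embs)[0]
for tool, score in sorted(zip(tools, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {tool.splitlines()[0]}")
```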
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Device Aware Information Retrieval
* Dataset: `dev`
* Evaluated with <code>src.port.retrieval_evaluator.DeviceAwareInformationRetrievalEvaluator</code>
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.66 |
| cosine_accuracy@3 | 0.82 |
| cosine_accuracy@5 | 0.88 |
| cosine_accuracy@10 | 0.95 |
| cosine_precision@1 | 0.66 |
| cosine_precision@3 | 0.4833 |
| cosine_precision@5 | 0.36 |
| cosine_precision@10 | 0.218 |
| cosine_recall@1 | 0.2507 |
| cosine_recall@3 | 0.551 |
| cosine_recall@5 | 0.6613 |
| cosine_recall@10 | 0.7802 |
| cosine_ndcg@1 | 0.66 |
| cosine_ndcg@3 | 0.5826 |
| cosine_ndcg@5 | 0.6338 |
| **cosine_ndcg@10** | **0.6889** |
| cosine_mrr@10 | 0.7467 |
| cosine_map@100 | 0.6292 |
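The evaluator named above lives in project-specific code (`src.port.retrieval_evaluator`) that is not bundled with this repository, but the reported metrics match those of the stock `InformationRetrievalEvaluator`, so a comparable evaluation can be sketched as follows. The `queries`/`corpus`/`relevant_docs` dicts here are hypothetical placeholders for the dev split:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("LorMolf/mnrl-toolbench-bge-base-en-v1.5")

# Hypothetical dev data: query id -> text, tool id -> docstring, query id -> relevant tool ids
queries = {"q1": "Search Reddit for climate change discussions."}
corpus = {
    "t1": 'def socialgrep_post_search:\n\t"""Searches Reddit posts."""',
    "t2": 'def deezer_genre:\n\t"""A genre object"""',
}
relevant_docs = {"q1": {"t1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries, corpus=corpus, relevant_docs=relevant_docs, name="dev"
)
results = evaluator(model)
print(results["dev_cosine_ndcg@10"])
```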
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 30,000 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
* Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 | sentence_2 |
|:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| type | string | string | string |
| details | <ul><li>min: 29 tokens</li><li>mean: 61.36 tokens</li><li>max: 133 tokens</li></ul> | <ul><li>min: 28 tokens</li><li>mean: 82.89 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 26 tokens</li><li>mean: 85.81 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
| sentence_0 | sentence_1 | sentence_2 |
|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>My family and I are planning a beach vacation in Florida next month. Can you provide us with the current weather conditions, active alerts, and station details for Miami and Orlando? Additionally, we would like a radiation forecast to plan our outdoor activities.</code> | <code>def solcast_simple_radiation_forecast:<br> """<br> Description:<br> The simple radiation request returns detailed solar radiation data for the next week based only on your latitude and longitude.<br><br> Arguments:<br> ---------<br> - latitude : NUMBER (required)<br> Description: Latitude<br> - longitude : NUMBER (required)<br> Description: Longitude<br> """</code> | <code>def uk_boundaries_io_retrieve_uk_postal_district_outline_boundaries:<br> """<br> Description:<br> example: Query by "TW12" district<br><br> Arguments:<br> ---------<br> - postal-district : STRING (required)<br> Description: Query by postal district code.<br> Default: TW12<br> """</code> |
| <code>I am planning a vacation to a tropical destination and I need some information to make the most of my trip. Can you please provide me with the current weather data for a location with latitude 25.5 and longitude -80.5? Additionally, I would like a 16-day forecast for this location. Furthermore, I am interested in knowing the available cities in the country associated with this location. Lastly, please suggest some popular tourist attractions in this country.</code> | <code>def weather_forecast_14_days_list_of_cities_in_one_country:<br> """<br> Description:<br> List of cities in one Country<br><br> Arguments:<br> ---------<br> """</code> | <code>def Billboard-API.Brazil_Songs:<br> """<br> Description:<br> Provide the Brazil Songs chart information<br><br> Arguments:<br> ---------<br> - date : DATE (YYYY-MM-DD) (required)<br> Description: date format(YYYY-MM-DD)<br> Default: 2022-05-07<br> - range : STRING (required)<br> Default: 1-10<br> """</code> |
| <code>I want to surprise my family with a special dinner tonight. Can you suggest some quick and easy recipes for a main course? Also, provide me with the list of ingredients required for each recipe. Additionally, I would like to know the plant hardiness zone for our area, which is zip code 90210.</code> | <code>def yummly_feeds_list:<br> """<br> Description:<br> List feeds by category<br><br> Arguments:<br> ---------<br> - start : NUMBER (required)<br> Description: The offset of items to be ignored in response for paging<br> Default: 0<br> - limit : NUMBER (required)<br> Description: Number of items returned per response<br> Default: 24<br> """</code> | <code>def line_messaging_get_number_of_sent_reply_messages:<br> """<br> Description:<br> Gets the number of messages sent with the /bot/message/reply endpoint.<br><br> Arguments:<br> ---------<br> - date : STRING (required)<br> Description: Date the messages were sent. Format: yyyyMMdd (Example: 20191231) Timezone: UTC+9<br> """</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
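For each `(anchor, positive, negative)` triplet, this loss scores the anchor against its own positive, its hard negative, and every other passage in the batch, then applies a cross-entropy over the scaled cosine similarities. A minimal sketch of instantiating it with the parameters above:
```python
from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
# scale=20.0 multiplies the cosine similarities before the softmax cross-entropy
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)
```
Note that with the `per_device_train_batch_size` of 2 used here (see below), each anchor is contrasted against only four candidates per step, so the hard negative in `sentence_2` carries most of the training signal.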
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 2
- `per_device_eval_batch_size`: 2
- `num_train_epochs`: 1
- `fp16`: True
- `multi_dataset_batch_sampler`: round_robin
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 2
- `per_device_eval_batch_size`: 2
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
</details>
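Putting the non-default values together, the run can be reproduced in sketch form with the Sentence Transformers trainer. The tiny inline dataset below is a placeholder; the actual run used the 30,000 `(sentence_0, sentence_1, sentence_2)` triplets described above, plus a dev evaluator for the periodic evaluations (omitted here for brevity):
```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
loss = MultipleNegativesRankingLoss(model, scale=20.0)

# Placeholder triplets: (query, positive tool docstring, hard-negative tool docstring)
train_dataset = Dataset.from_dict({
    "sentence_0": ["Search Reddit for climate change discussions."],
    "sentence_1": ['def socialgrep_post_search:\n\t"""Searches Reddit posts."""'],
    "sentence_2": ['def deezer_genre:\n\t"""A genre object"""'],
})

args = SentenceTransformerTrainingArguments(
    output_dir="mnrl-toolbench-bge-base-en-v1.5",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    fp16=True,  # requires a CUDA device
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```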
### Training Logs
| Epoch | Step | Training Loss | dev_cosine_ndcg@10 |
|:------:|:-----:|:-------------:|:------------------:|
| -1 | -1 | - | 0.6031 |
| 0.0333 | 500 | 0.3188 | - |
| 0.0667 | 1000 | 0.2079 | - |
| 0.1 | 1500 | 0.2178 | - |
| 0.1333 | 2000 | 0.186 | - |
| 0.1667 | 2500 | 0.1665 | - |
| 0.2 | 3000 | 0.205 | 0.6953 |
| 0.2333 | 3500 | 0.149 | - |
| 0.2667 | 4000 | 0.1691 | - |
| 0.3 | 4500 | 0.1703 | - |
| 0.3333 | 5000 | 0.1588 | - |
| 0.3667 | 5500 | 0.1348 | - |
| 0.4 | 6000 | 0.1625 | 0.6639 |
| 0.4333 | 6500 | 0.1415 | - |
| 0.4667 | 7000 | 0.13 | - |
| 0.5 | 7500 | 0.1271 | - |
| 0.5333 | 8000 | 0.1058 | - |
| 0.5667 | 8500 | 0.1031 | - |
| 0.6 | 9000 | 0.1026 | 0.6860 |
| 0.6333 | 9500 | 0.1031 | - |
| 0.6667 | 10000 | 0.1248 | - |
| 0.7 | 10500 | 0.0909 | - |
| 0.7333 | 11000 | 0.1055 | - |
| 0.7667 | 11500 | 0.101 | - |
| 0.8 | 12000 | 0.0598 | 0.6778 |
| 0.8333 | 12500 | 0.0949 | - |
| 0.8667 | 13000 | 0.062 | - |
| 0.9 | 13500 | 0.1129 | - |
| 0.9333 | 14000 | 0.1106 | - |
| 0.9667 | 14500 | 0.0653 | - |
| 1.0 | 15000 | 0.0669 | 0.6889 |
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 4.0.2
- Transformers: 4.51.2
- PyTorch: 2.6.0+cu124
- Accelerate: 1.6.0
- Datasets: 3.5.0
- Tokenizers: 0.21.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->