BGE-base-en-v1.5-Hotpotqa
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the sentence-transformers/hotpotqa dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: sentence-transformers/hotpotqa
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
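The last two modules above turn per-token hidden states into one sentence vector: CLS pooling keeps the first token's state, and Normalize rescales it to unit length. The following is a minimal pure-Python sketch of that idea on toy 4-dimensional vectors; the helper names `cls_pool` and `l2_normalize` are illustrative and are not part of the Sentence Transformers API.

```python
import math
from typing import List


def cls_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Use the first token's hidden state ([CLS]) as the sentence vector."""
    return list(token_embeddings[0])


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length; zero vectors are returned as-is."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy input: 3 tokens with 4-dimensional hidden states (the real model uses 768).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 0.5, 0.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the cosine similarity of two such vectors equals their dot product.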
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("anindya-hf-2002/bge-base-finetuned-hotpotqa")
# Run inference
sentences = [
'James D. Farley, Jr. had an early interest in automobiles because of his grandfather who worked for what company?',
"Jim Farley (businessman) James D. Farley, Jr. (born June 1962) is an American automobile executive that currently serves as Ford Motor Company's Executive Vice President and president, Global Markets since June 2017. From 2015 to 2017, he was CEO and Chairman of Ford Europe. He had an early interest in automobiles, primarily spurred from his grandfather who worked at Henry Ford's River Rouge Plant starting in 1914.",
'Continental Motors Company Continental Motors Company was an American manufacturer of internal combustion engines. The company produced engines as a supplier to many independent manufacturers of automobiles, tractors, trucks, and stationary equipment (such as pumps, generators, and industrial machinery drives) from the 1900s through the 1960s. Continental Motors also produced Continental-branded automobiles in 1932–1933. The Continental Aircraft Engine Company was formed in 1929 to develop and produce its aircraft engines, and would become the core business of Continental Motors, Inc.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
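The model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128, and 64, so its embeddings can in principle be shortened to any of those sizes by keeping only the leading components. The sketch below shows that step in pure Python on toy vectors; `truncate_and_normalize` and `cosine_similarity` are illustrative helpers, not library functions. Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing a `SentenceTransformer`; check the documentation of your installed version before relying on it.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale the result to unit length."""
    if dim > len(embedding):
        raise ValueError("dim exceeds the embedding size")
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0.0 else head


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional "embeddings"; with the real model these would be rows of
# the array returned by model.encode(...), and dim would be 512, 256, 128, or 64.
query = [3.0, 4.0, 12.0, 0.0]
document = [3.0, 4.0, 0.0, 12.0]

short_query = truncate_and_normalize(query, 2)
short_document = truncate_and_normalize(document, 2)

print(short_query)  # [0.6, 0.8]
print(cosine_similarity(query, document))
print(cosine_similarity(short_query, short_document))
```

Smaller dimensions reduce storage and search cost; the evaluation tables in this card report how accuracy changes at each size.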
Evaluation
Metrics
Triplet
- Dataset: dim_768
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.9069 |
dot_accuracy | 0.0931 |
manhattan_accuracy | 0.9066 |
euclidean_accuracy | 0.9069 |
max_accuracy | 0.9069 |
Triplet
- Dataset: dim_512
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.9075 |
dot_accuracy | 0.0931 |
manhattan_accuracy | 0.9056 |
euclidean_accuracy | 0.9064 |
max_accuracy | 0.9075 |
Triplet
- Dataset: dim_256
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.9074 |
dot_accuracy | 0.0931 |
manhattan_accuracy | 0.9063 |
euclidean_accuracy | 0.9063 |
max_accuracy | 0.9074 |
Triplet
- Dataset: dim_128
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.9061 |
dot_accuracy | 0.0949 |
manhattan_accuracy | 0.9014 |
euclidean_accuracy | 0.9036 |
max_accuracy | 0.9061 |
Triplet
- Dataset: dim_64
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.9055 |
dot_accuracy | 0.0981 |
manhattan_accuracy | 0.8984 |
euclidean_accuracy | 0.9013 |
max_accuracy | 0.9055 |
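The accuracy figures above are triplet accuracies: the share of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under a given distance. The sketch below illustrates that definition in pure Python with toy 2-dimensional vectors. It is not the TripletEvaluator source, and all function names are illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Triplet = Tuple[Vector, Vector, Vector]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def triplet_accuracy(
    triplets: List[Triplet], distance: Callable[[Vector, Vector], float]
) -> float:
    """Fraction of triplets where the anchor is closer to the positive."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if distance(anchor, positive) < distance(anchor, negative)
    )
    return correct / len(triplets)


# Toy triplets (anchor, positive, negative); the third one is ranked wrongly.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
    ([1.0, 1.0], [-1.0, 0.0], [1.0, 0.9]),
    ([2.0, 1.0], [2.0, 1.5], [-1.0, 3.0]),
]

print(triplet_accuracy(triplets, cosine_distance))     # 0.75
print(triplet_accuracy(triplets, euclidean_distance))  # 0.75
print(triplet_accuracy(triplets, manhattan_distance))  # 0.75
```

In the dim_768 table, cosine_accuracy and dot_accuracy add up to exactly 1.0000 (0.9069 + 0.0931). One possible explanation, which this card does not confirm, is that the dot-product score was compared in the distance direction (smaller is better), so dot_accuracy is probably best not read as a standalone quality signal.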
Training Details
Training Dataset
sentence-transformers/hotpotqa
- Dataset: sentence-transformers/hotpotqa at f07d3cd
- Size: 76,064 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
min | 8 tokens | 21 tokens | 14 tokens |
mean | 24.49 tokens | 101.27 tokens | 87.44 tokens |
max | 108 tokens | 512 tokens | 407 tokens |
- Samples:
anchor | positive | negative |
---|---|---|
What historical geographic region in Central-Eastern Europe was the birthplace of a soldier of the Austro-Hungarian Army? | Bruno Olbrycht Bruno Olbrycht (nom de guerre: Olza; 6 October 1895 – 23 March 1951) was a soldier of the Austro-Hungarian Army and officer (later general) of the Polish Army both in the Second Polish Republic and postwar Poland. Born on 6 October 1895 in Sanok, Austrian Galicia, Olbrycht fought in Polish Legions in World War I, Polish–Ukrainian War, Polish–Soviet War and the Invasion of Poland. He died on 23 March 1951 in Kraków. | Padáň The village was first recorded in 1254 as "Padan", an old Pecheneg settlement. On the territory of the village, there used to be "Petény" village as well, which was mentioned in 1298 as the appurtenance of Pressburg Castle. Until the end of World War I, it was part of Hungary and fell within the Dunaszerdahely district of Pozsony County. After the Austro-Hungarian army disintegrated in November 1918, Czechoslovakian troops occupied the area. After the Treaty of Trianon of 1920, the village became officially part of Czechoslovakia. In November 1938, the First Vienna Award granted the area to Hungary and it was held by Hungary until 1945. After Soviet occupation in 1945, Czechoslovakian administration returned and the village became officially part of Czechoslovakia in 1947. |
Full Scale Assault is the fourth studio album by Dutch punk hardcore band Vitamin X, the album was recorded at Electrical Audio in Chicago by Steve Albini who previously recorded The Stooges, also known as Iggy and the Stooges, were an American rock band formed in Ann Arbor, Michigan in what year? | Full Scale Assault Full Scale Assault is the fourth studio album by Dutch punk hardcore band Vitamin X. Released through Tankcrimes on October 10, 2008 in the US, and Agipunk in Europe. The album was recorded at Electrical Audio in Chicago by Steve Albini who previously recorded Nirvana, Neurosis, PJ Harvey, High on Fire, Iggy Pop & The Stooges. It features guest vocals from Negative Approach's singer John Brannon. Art is by John Dyer Baizley. | The Dogs (US punk band) The Dogs are a three-piece proto-punk band formed in Lansing, Michigan, United States in 1969. They are noted for presaging the energy and sound of the later punk and hardcore genres. |
Which popular music style was a modification of the marches from "The March King" with heavy influences from African American communities? | Ragtime Ragtime – also spelled rag-time or rag time – is a musical style that enjoyed its peak popularity between 1895 and 1918. Its cardinal trait is its syncopated, or "ragged", rhythm. The style has its origins in African-American communities in cities such as St. Louis. Ernest Hogan (1865–1909) was a pioneer of ragtime and was the first composer to have his ragtime pieces (or "rags") published as sheet music, beginning with the song "LA Pas Ma LA," published in 1895. Hogan has also been credited for coining the term "ragtime". The term is actually derived from his hometown "Shake Rag" in Bowling Green, Kentucky. Ben Harney, another Kentucky native, has often been credited for introducing the music to the mainstream public. His first ragtime composition, "You've Been a Good Old Wagon But You Done Broke", helped popularize the style. The composition was published in 1895, a few months after Ernest Hogan's "LA Pas Ma LA." Ragtime was also a modification of the march style popularized by John Philip Sousa, with additional polyrhythms coming from African music. Ragtime composer Scott Joplin ("ca." 1868–1917) became famous through the publication of the "Maple Leaf Rag" (1899) and a string of ragtime hits such as "The Entertainer" (1902), although he was later forgotten by all but a small, dedicated community of ragtime aficionados until the major ragtime revival in the early 1970s. For at least 12 years after its publication, "Maple Leaf Rag" heavily influenced subsequent ragtime composers with its melody lines, harmonic progressions or metric patterns. | Joropo The Joropo is a musical style resembling the fandango, and an accompanying dance. It has African, Native South American and European influences and originated in the plains called "Los Llanos" of what is now Colombia and Venezuela. It is a fundamental genre of "música criolla" (creole music). It is also the most popular "folk rhythm": the well-known song "Alma Llanera" is a joropo, considered the unofficial national anthem of Venezuela. |
- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "TripletLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
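The parameters above describe a weighted sum: the inner triplet loss is evaluated once per dimension in `matryoshka_dims` on embeddings cut down to that size, each term is multiplied by its weight, and the terms are added. The pure-Python sketch below shows that structure for a single triplet. Several details are assumptions rather than facts from this card: the Euclidean distance, the margin of 5.0, and the re-normalization after truncation. The function names are illustrative.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(v: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit length (assumed step)."""
    head = list(v[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0.0 else head


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin: float) -> float:
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def matryoshka_triplet_loss(anchor, positive, negative, dims, weights, margin=5.0):
    """Weighted sum of triplet losses over truncated embeddings.

    margin=5.0 is an assumed value; the card does not list the margin used.
    """
    total = 0.0
    for dim, weight in zip(dims, weights):
        a = truncate_and_normalize(anchor, dim)
        p = truncate_and_normalize(positive, dim)
        n = truncate_and_normalize(negative, dim)
        total += weight * triplet_loss(a, p, n, margin)
    return total


# Toy 4-dimensional triplet with two nested sizes instead of the card's five.
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [1.0, 0.0, 0.0, 0.0]
negative = [0.0, 1.0, 0.0, 0.0]

loss = matryoshka_triplet_loss(anchor, positive, negative, dims=[4, 2], weights=[1, 1])
print(loss)
```

With equal weights, as in this card, every dimension contributes equally, which pushes the leading components of the embedding to carry most of the ranking signal.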
Evaluation Dataset
sentence-transformers/hotpotqa
- Dataset: sentence-transformers/hotpotqa at f07d3cd
- Size: 8,452 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
min | 8 tokens | 16 tokens | 12 tokens |
mean | 23.94 tokens | 101.15 tokens | 86.87 tokens |
max | 87 tokens | 447 tokens | 407 tokens |
- Samples:
anchor | positive | negative |
---|---|---|
What is the birthdate of this American dancer and choreographer of modern dance, who helped found the Joseph Campbell Foundation with Robert Walter? | Robert Walter (editor) Robert Walter is an editor and an executive with several not-for-profit organizations. Most notably, he is the executive director and board president of the Joseph Campbell Foundation (JCF), an organization that he helped found in 1990 with choreographer Jean Erdman, Joseph Campbell's widow. | Miguel Terekhov Miguel Terekhov (August 22, 1928 – January 3, 2012) was a Uruguayan-born American ballet dancer and ballet instructor. Terekhov and his wife, Yvonne Chouteau, one of the Five Moons, a group of Native American ballet dancers, founded the School of Dance at the University of Oklahoma in 1961. |
What is the difference between Konstantin Orbelyan and Haig P. Manoogian | Konstantin Orbelyan Konstantin Aghaparoni Orbelyan (Armenian: Կոնստանտին Աղապարոնի Օրբելյան ; Russian: Константин Агапаронович Орбелян , July 29, 1928 – April 24, 2014) was an Armenian pianist, composer, head of the State Estrada Orchestra of Armenia. | Mitrofan Lodyzhensky Mitrofan Vasilyevich Lodyzhensky (Russian: Митрофа́н Васи́льевич Лоды́женский , in some sources Лады́женский (Ladyzhensky ); February 27 [O.S. February 15] 1852 – May 31 [O.S. May 18] 1917 ) was a Russian religious philosopher, playwright, and statesman, best known for his "Mystical Trilogy" comprising "Super-consciousness and the Ways to Achieve It", "Light Invisible", and "Dark Force". |
Which movie has more producers, Laura's Star or 9? | Laura's Star Laura's Star (German: Lauras Stern ) is a 2004 German animated feature film produced and directed by Thilo Rothkirch. It is based on the children's book "Lauras Stern" by Klaus Baumgart. It was released by Warner Bros. Family Entertainment. | Laura Mañá Laura Mañá (born January 12, 1968 in Barcelona, Catalonia, Spain) is an actress, film director and screenwriter. |
- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "TripletLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 5
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- resume_from_checkpoint: bge-base-hotpotwa-matryoshka
- batch_sampler: no_duplicates
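The settings `lr_scheduler_type: cosine`, `warmup_ratio: 0.1`, and `learning_rate: 2e-05` describe a learning rate that rises linearly from zero during the first 10% of optimizer steps and then follows a half-cosine down to zero. The sketch below reproduces that shape in pure Python. The total of 1000 steps is a round hypothetical number rather than the actual step count of this run, and `cosine_lr_with_warmup` is an illustrative name.

```python
import math


def cosine_lr_with_warmup(
    step: int,
    total_steps: int,
    base_lr: float = 2e-5,
    warmup_ratio: float = 0.1,
) -> float:
    """Linear warmup to base_lr, then cosine decay to zero at total_steps."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the card does not state the total step count

for step in (0, 50, 100, 550, 1000):
    print(step, cosine_lr_with_warmup(step, TOTAL_STEPS))
```

The peak learning rate is reached exactly at the end of warmup (step 100 in this toy setting) and is halved at the midpoint of the decay phase.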
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: bge-base-hotpotwa-matryoshka
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | dim_128_cosine_accuracy | dim_256_cosine_accuracy | dim_512_cosine_accuracy | dim_64_cosine_accuracy | dim_768_cosine_accuracy |
---|---|---|---|---|---|---|---|---|
0.3366 | 50 | 23.6925 | 21.8521 | 0.9285 | 0.9288 | 0.9334 | 0.9226 | 0.9365 |
0.6731 | 100 | 22.4254 | 20.8726 | 0.9102 | 0.9110 | 0.9156 | 0.9063 | 0.9168 |
1.0097 | 150 | 22.046 | 20.7027 | 0.9142 | 0.9162 | 0.9188 | 0.9098 | 0.9200 |
1.3462 | 200 | 21.871 | 20.6600 | 0.9227 | 0.9198 | 0.9233 | 0.9159 | 0.9232 |
1.6828 | 250 | 21.7 | 20.6425 | 0.9193 | 0.9192 | 0.9203 | 0.9148 | 0.9217 |
2.0194 | 300 | 21.5785 | 20.6416 | 0.9113 | 0.9133 | 0.9149 | 0.9082 | 0.9142 |
2.3559 | 350 | 21.4963 | 20.5366 | 0.9141 | 0.9139 | 0.9162 | 0.9107 | 0.9177 |
2.6925 | 400 | 21.4012 | 20.5315 | 0.9103 | 0.9114 | 0.9135 | 0.9081 | 0.9136 |
3.0290 | 450 | 21.3447 | 20.5096 | 0.9093 | 0.9089 | 0.9102 | 0.9057 | 0.9106 |
3.3656 | 500 | 21.3029 | 20.5548 | 0.9061 | 0.9074 | 0.9075 | 0.9055 | 0.9069 |
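The Epoch column can be reproduced from figures stated elsewhere in this card: 76,064 training samples, a per-device batch size of 32, and 16 gradient accumulation steps. The short calculation below assumes training on a single device, which the card does not state explicitly but which matches the logged values; `epoch_at_step` is an illustrative helper.

```python
TRAIN_SAMPLES = 76_064      # "Size" of the training dataset in this card
PER_DEVICE_BATCH_SIZE = 32  # per_device_train_batch_size
GRAD_ACCUM_STEPS = 16       # gradient_accumulation_steps
NUM_DEVICES = 1             # assumption: single-device training

# Number of samples consumed per optimizer step.
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Optimizer steps needed to see the whole training set once.
steps_per_epoch = TRAIN_SAMPLES / effective_batch_size


def epoch_at_step(step: int) -> float:
    """Fraction of epochs completed after `step` optimizer steps."""
    return step / steps_per_epoch


print(effective_batch_size)  # 512
print(steps_per_epoch)       # 148.5625
for step in (50, 150, 500):
    print(step, round(epoch_at_step(step), 4))
# 50 0.3366
# 150 1.0097
# 500 3.3656
```

The computed values agree with the Epoch entries logged at steps 50, 150, and 500.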
Framework Versions
- Python: 3.10.10
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Model tree for anindya-hf-2002/bge-base-finetuned-hotpotqa
- Base model: BAAI/bge-base-en-v1.5
- Dataset used to train anindya-hf-2002/bge-base-finetuned-hotpotqa: sentence-transformers/hotpotqa