st-scale70 / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:18963
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/paraphrase-mpnet-base-v2
widget:
  - source_sentence: >-
      If the comatose man had previously expressed a desire to be euthanized in
      such a situation, respecting his autonomy would support euthanasia.
    sentences:
      - >-
        If the comatose man had previously expressed a desire for euthanasia in
        such circumstances, there may be a duty to respect his autonomy, which
        would support the action.
      - >-
        If the man is believed to be suffering in his comatose state or there is
        a significant burden on his family, there may be a duty to alleviate
        suffering that supports euthanasia.
      - >-
        As a living being, the rat may warrant a duty of care from humans, which
        may include providing it with appropriate medical treatment or humane
        euthanasia in case of suffering.
  - source_sentence: >-
      Resisting authoritarianism can defend individual freedom and undermine
      oppressive regimes.
    sentences:
      - >-
        Resisting authoritarianism can be a means of exercising the right to
        free speech and expression, which may be suppressed by the government.
      - >-
        If retreating serves to protect the lives of soldiers and civilians,
        then it upholds the value of the duty to protect.
      - >-
        Resisting authoritarianism could result in negative consequences for
        safety and security if violence is used to resist.
  - source_sentence: >-
      Saving someone upholds their fundamental right to life, as it prevents
      them from experiencing harm or death.
    sentences:
      - >-
        Donating the money to charity has the potential to benefit those in need
        and can be seen as fulfilling a duty to improve the well-being of
        others.
      - >-
        Saving someone may preserve their freedom and ability to make choices in
        their life.
      - >-
        If saving someone involves protecting their body from injury or harm,
        their right to bodily integrity is respected.
  - source_sentence: >-
      Helping those in need, such as a starving person, promotes a sense of
      community and responsibility towards fellow humans.
    sentences:
      - >-
        We have a moral responsibility to treat others with respect and dignity,
        regardless of their race. Hanging out with black people allows for the
        opportunity to demonstrate this respect.
      - >-
        A starving person's right to life is at stake, and providing them with
        food can help protect this fundamental right.
      - >-
        Providing aid and resources to someone in need is an expression of the
        duty to promote the well-being of others.
  - source_sentence: >-
      The marriage of Baptiste and Hannah demonstrates their commitment to
      sharing their lives and supporting one another.
    sentences:
      - >-
        Helping others may be a moral duty, but using unethical means like
        cheating goes against other moral principles.
      - >-
        If the marriage brings happiness to Baptiste and Hannah, then they are
        pursuing their right to happiness.
      - >-
        By getting married, Baptiste and Hannah take on a duty to care for each
        other, both emotionally and materially.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/paraphrase-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/paraphrase-mpnet-base-v2 on the train dataset (18,963 anchor/positive/negative samples). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-mpnet-base-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • train
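
The cosine similarity named above can be illustrated with a few lines of plain Python. This is only a sketch: `cosine_similarity` is a hypothetical helper, the 3-dimensional vectors are made-up stand-ins for the model's 768-dimensional embeddings, and returning 0.0 for zero vectors is a convention chosen here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat zero vectors as dissimilar to everything
    return dot / (norm_u * norm_v)

# Toy vectors (made up for illustration, not model output).
a = [1.0, 2.0, 0.0]
b = [2.0, 4.0, 0.0]  # same direction as a, different length
c = [0.0, 0.0, 5.0]  # orthogonal to a

print(cosine_similarity(a, b))  # close to 1: direction matters, length does not
print(cosine_similarity(a, c))  # 0.0: orthogonal vectors
```

Because the score depends only on direction, two embeddings of different magnitude can still be rated as near-identical.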

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
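
In the architecture above, only pooling_mode_mean_tokens is enabled, so each sentence embedding is the average of its token embeddings. A minimal sketch of that idea, assuming padding positions are excluded via the attention mask; `mean_pool` and the toy values are hypothetical, and this is not the library's own code:

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [s / max(count, 1) for s in summed]

# Toy input: three real tokens and one padding token, 2 dimensions instead of 768.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [3.0, 4.0]
```

The padding vector does not affect the result because its mask entry is 0.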

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'The marriage of Baptiste and Hannah demonstrates their commitment to sharing their lives and supporting one another.',
    'By getting married, Baptiste and Hannah take on a duty to care for each other, both emotionally and materially.',
    'If the marriage brings happiness to Baptiste and Hannah, then they are pursuing their right to happiness.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
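
One way to use such a similarity matrix is to look up, for each sentence, its closest other sentence. The sketch below assumes the matrix has been converted to nested lists (for a tensor, `similarities.tolist()` would do that); `nearest_neighbors` is a hypothetical helper and the scores are made up, not output from this model.

```python
def nearest_neighbors(similarities):
    """For each row, return the index of the most similar *other* row."""
    result = []
    for i, row in enumerate(similarities):
        # Skip the diagonal entry, which compares a sentence with itself.
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        best_score, best_index = max(candidates)
        result.append(best_index)
    return result

# Hypothetical 3x3 similarity matrix (made-up values).
toy_similarities = [
    [1.00, 0.72, 0.55],
    [0.72, 1.00, 0.61],
    [0.55, 0.61, 1.00],
]
print(nearest_neighbors(toy_similarities))  # [1, 0, 1]
```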

Training Details

Training Dataset

train

  • Dataset: train
  • Size: 18,963 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 10 tokens, mean 25.92 tokens, max 51 tokens
    • positive (string): min 9 tokens, mean 28.31 tokens, max 60 tokens
    • negative (string): min 11 tokens, mean 28.69 tokens, max 67 tokens
  • Samples:
    • anchor: Saving the group of people from harm by diverting the trolley supports the value of preserving life.
      positive: The group of people tied to the tracks have a right to life, which is protected when the trolley is diverted to save them.
      negative: Diverting the trolley reduces overall harm by preventing the deaths of many people at the cost of one person's life.
    • anchor: The bake sale could be seen as an expression of support for a particular cause, and the right to freely express oneself and associate with others who share the same views is important.
      positive: The bake sale might be seen as a form of protest or support for a specific cause, and individuals have the right to engage in peaceful protest or show support.
      negative: If the bake sale directly or indirectly promotes religious discrimination, this can infringe on the fundamental right of individuals to be free from discrimination or harm due to their religious beliefs.
    • anchor: Children have a right to life, and saving them from danger upholds this right.
      positive: Children should be protected from harm, abuse, and danger, and saving them ensures this right is respected.
      negative: Children have a right to grow up with access to healthcare, education, and a nurturing environment. Saving them may help secure these rights.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 40,
        "similarity_fct": "cos_sim"
    }
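
To make the role of the two parameters concrete, here is a rough sketch of the in-batch ranking objective: each anchor is scored against every positive and negative in the batch with cosine similarity, the scores are multiplied by the scale, and cross-entropy rewards picking the anchor's own positive. The helper names and the 2-dimensional toy vectors are hypothetical; this is an illustration of the idea, not the library's implementation.

```python
import math

def cos_sim(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def mnr_loss(anchors, positives, negatives, scale=40.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates = positives + negatives  # in-batch positives plus explicit negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]  # negative log-probability of the true positive
    return total / len(anchors)

# Toy batch of two triplets (made-up 2-dimensional "embeddings").
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[0.5, 0.5], [-1.0, 0.0]]

good = mnr_loss(anchors, positives, negatives)        # positives aligned with anchors
bad = mnr_loss(anchors, positives[::-1], negatives)   # positives deliberately swapped
print(good, bad)
```

A larger scale sharpens the softmax, so small differences in cosine similarity translate into larger differences in loss.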
    

Training Hyperparameters

Non-Default Hyperparameters

  • overwrite_output_dir: True
  • per_device_train_batch_size: 32
  • learning_rate: 2.1456771788455288e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.03254893834779507
  • fp16: True
  • dataloader_num_workers: 4
  • remove_unused_columns: False
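
The step counts implied by these settings can be worked out by hand. The sketch below assumes single-device training, no gradient accumulation, and the usual linear warmup followed by linear decay to zero; `linear_schedule_lr` is a hypothetical helper written for illustration.

```python
import math

# Values taken from this card; single-device training is an assumption.
num_samples = 18963
batch_size = 32
num_epochs = 2
base_lr = 2.1456771788455288e-05
warmup_ratio = 0.03254893834779507

steps_per_epoch = math.ceil(num_samples / batch_size)  # the last partial batch is kept
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def linear_schedule_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

print(steps_per_epoch, total_steps, warmup_steps)  # 593 1186 39
print(round(20 / steps_per_epoch, 4))              # 0.0337, the epoch logged at step 20
```

The 593 steps per epoch agree with the Epoch column of the training logs, where step 20 is reported as epoch 0.0337.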

All Hyperparameters

  • overwrite_output_dir: True
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2.1456771788455288e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.03254893834779507
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: False
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0337 20 0.2448
0.0675 40 0.1918
0.1012 60 0.14
0.1349 80 0.186
0.1686 100 0.1407
0.2024 120 0.1672
0.2361 140 0.1832
0.2698 160 0.116
0.3035 180 0.1341
0.3373 200 0.2118
0.3710 220 0.1274
0.4047 240 0.1993
0.4384 260 0.1561
0.4722 280 0.1517
0.5059 300 0.1635
0.5396 320 0.1646
0.5734 340 0.1337
0.6071 360 0.1406
0.6408 380 0.1114
0.6745 400 0.1314
0.7083 420 0.1481
0.7420 440 0.1932
0.7757 460 0.1568
0.8094 480 0.1319
0.8432 500 0.1536
0.8769 520 0.1462
0.9106 540 0.1336
0.9444 560 0.1453
0.9781 580 0.2005
1.0118 600 0.1265
1.0455 620 0.0702
1.0793 640 0.0739
1.1130 660 0.049
1.1467 680 0.0613
1.1804 700 0.0663
1.2142 720 0.0726
1.2479 740 0.0822
1.2816 760 0.0651
1.3153 780 0.0603
1.3491 800 0.0468
1.3828 820 0.061
1.4165 840 0.0891
1.4503 860 0.0607
1.4840 880 0.0673
1.5177 900 0.0728
1.5514 920 0.065
1.5852 940 0.0824
1.6189 960 0.0695
1.6526 980 0.0626
1.6863 1000 0.0525
1.7201 1020 0.0482
1.7538 1040 0.0968
1.7875 1060 0.0717
1.8212 1080 0.0704
1.8550 1100 0.0666
1.8887 1120 0.0841
1.9224 1140 0.0682
1.9562 1160 0.0584
1.9899 1180 0.0423

Framework Versions

  • Python: 3.9.21
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}