LexEmbed-Contracts / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:16129
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: Rofr/Rofo/Rofn
    sentences:
      - >-
        between the parties is not executed within thirty (30) days following
        delivery, of such notice to Snap, Snap shall be free thereafter to enter
        into an such an agreement with any third party.
      - >-
        This Agreement contains the entire agreement of the parties and SYNTEL
        shall not be bound by any other different, additional, or further
        agreements or understandings except as consented to in writing by the
        Chief Administrative Officer or Director, Human Resources of SYNTEL.
        This Agreement shall be binding upon and inure to the benefit of the
        parties hereto and their respective successors and assigns. No amendment
        hereof shall be effective unless contained in a written instrument
        signed by the parties hereto. No delay or omission by either party to
        exercise any right or power under this Agreement shall impair such right
        or power or be construed to be a waiver thereof. A waiver by either
        party of any of the covenants to be performed by the other party or of
        any breach shall not be construed to be a waiver of any succeeding
        breach or of any other covenant. If any portion of any provision of the
        Agreement is declared invalid, the offending portion of such provision
        shall be deemed severable from such provision and the remaining
        provisions of the Agreement, which shall remain in full force and
        effect. EMPLOYEE shall not assign or transfer this Agreement without the
        prior written consent of SYNTEL. EMPLOYEE’s employment with SYNTEL is at
        will and may be terminated by SYNTEL at any time with or without cause,
        and with or without notice. All rights and remedies provided for in this
        Agreement shall be cumulative and in addition to and not in lieu of any
        other rights or remedies available to either party at law, in equity, or
        otherwise. Paragraphs 2, 3, 6, 7, 8, 9, 10, 11, 12, and 13 of this
        Agreement shall survive termination of this Agreement and EMPLOYEE’s
        employment with SYNTEL. The parties submit to the jurisdiction and venue
        of the circuit court for the County of Oakland, State of Michigan or, if
        original jurisdiction can be established, the United States District
        Court for the Eastern District of Michigan with respect to: a) disputes,
        controversies, or claims arising out of EMPLOYEE’S failure to abide by
        Paragraphs 6, 7, and/or Exhibit A – “Confidential Information” of this
        Agreement, b) claims initiated by SYNTEL pursuant to Paragraph 10 of
        this Agreement, and c) the enforcement of any awards or relief granted
        pursuant to the dispute resolution procedures set forth in Paragraph 11
        of this Agreement. The parties stipulate that the venues referenced in
        this Agreement are convenient. This Agreement shall be construed under
        and in accordance with the laws of the State of Michigan.
      - "The existence and terms of this Term Sheet are “Confidential Information” under and subject to the terms of the Confidentiality Agreement, dated February 23, 2016 (as amended on August 16, 2016, the “ Confidentiality Agreement ”), between CHC Leasing (Ireland) Limited and The Milestone Aviation Group Limited. The parties confirm that the Confidentiality Agreement remains in full force and effect; provided , however, the parties (i) agree that each party may disclose Confidential Information to the professional advisers retained by the Committee and (ii) agree to work in good faith to amend the Confidentiality Agreement to permit certain participants in the Chapter 11 Case (as agreed to by the parties) to view a partially redacted version of this Term Sheet. In addition, as each of the parties hereto acknowledges that this Term Sheet is itself, and this Term Sheet contains, commercially sensitive and proprietary information, with respect to the Chapter\_11 Case, each of the parties agrees to maintain this Term Sheet and this information strictly confidential, and agrees to disclose it to no person other than: (i) the parties to the Plan Support Agreement (ii) any person that has executed an accession and joinder to the Confidentiality Agreement in the form appended thereto, (iii) the Bankruptcy Court during the course of the Chapter\_11 Case, provided , however, that no document relating to the proposed transactions (including this Term Sheet) shall be filed with the Bankruptcy Court (other than a motion, in form and substance acceptable to the CHC Parties and the Milestone Parties, seeking protective order authority to file this Term Sheet under seal, which motion shall not describe the specific economic elements of the transaction) unless either (x)\_there has been obtained prior to the filing thereof an order of the Bankruptcy Court acceptable to the Milestone Parties enabling the CHC Parties to file such document under seal or (y) portions of such filed documents mutually agreed upon by the CHC Parties and the Milestone Parties are redacted, and (iv) the professional advisors of the Committee on a confidential basis pursuant to a letter agreement entered into with the Committee acceptable to the CHC Parties and Milestone setting forth a protocol for disclosure including the information that can be disclosed generally to the Committee and the information that is subject to limited disclosure to only certain professional advisors to the Committee."
  - source_sentence: Anti-Assignment
    sentences:
      - Backhaul
      - >-
        This  agreement  may  not  be assigned or delegated by Affiliate
        without  prior  written  consent  from  Network  1.
      - >-
        HealthGate will liaise with the Publishers, making available
        for             such purposes such HealthGate liaison staff as the
        Publishers may             reasonably require, and acting in all good
        faith, to ensure a             mutually satisfactory license to the
        Publishers or, at the             Publishers' option, to a replacement
        contractor.
  - source_sentence: Notice Period To Terminate Renewal
    sentences:
      - >-
        After the initial period of two years, the maintenance and support      
        contract shall be automatically renewed for a period of one year on
        each       renewal date, unless one of the parties terminates the
        maintenance and       support contract through written notification to
        the other party in the       form of a registered letter with proof of
        receipt, at least six (6) weeks       prior to the renewal date.
      - >-
        Any Transfer  without such approval shall constitute a breach of this
        Agreement  and shall be void and of no  effect.
      - >-
        The Company shall do and perform, or cause to be done and performed, all
        such further acts and things, and shall execute and deliver all such
        other agreements, certificates, instruments and documents, as the MHR
        Funds may reasonably request in order to carry out the intent and
        accomplish the purposes of this Agreement and the consummation of the
        transactions contemplated hereby.
  - source_sentence: Governing Law
    sentences:
      - >-
        In addition, the limitations in Section 23.1(b) will not apply (1) to
        Company's indemnification obligations under Section 22.1(a) or (2)
        Allscripts indemnification obligations under Section 22.3(a), unless the
        Company's or Allscripts' indemnification obligation under Section
        22.1(a) or 22.3(a), as the case may be, relates to the losses and
        obligations described in subclauses (a) through (f) of the preceding
        sentence. [***].
      - >-
        THIS AGREEMENT SHALL BE GOVERNED BY AND CONSTRUED IN ACCORDANCE WITH THE
        INTERNAL LAWS OF THE STATE OF NEW YORK APPLICABLE TO AGREEMENTS MADE AND
        TO BE PERFORMED ENTIRELY WITHIN SUCH STATE, WITHOUT REGARD TO THE
        CONFLICTS OF LAW PRINCIPLES OF SUCH STATE OTHER THAN SECTIONS 5-1401 OF
        THE NEW YORK GENERAL OBLIGATIONS LAW.
      - >-
        All such records required to be created and maintained pursuant to
        Section 2.12(a) shall be kept available at the Operator's office and
        made available for the Owner's inspection upon request at all reasonable
        times.
  - source_sentence: License Grant
    sentences:
      - >-
        SIERRA  hereby  grants  ENVISION an          exclusive,  royalty-free 
        sub-license of  the Product's future patents,          and patent 
        applications  to distribute,  sell  and market the Finished         
        Product.
      - >-
        Aucta should continue to receive 15% of Net Sales Royalty for as long as
        ETON is selling the Product(s) in the Territory, unless otherwise agreed
        to under this Agreement.
      - >-
        In the event FCE notifies ExxonMobil that it has formally decided not to
        pursue Generation 2 Technology for Power Applications, then upon
        ExxonMobil's written request, FCE agrees to negotiate a grant to
        ExxonMobil and its Affiliates, under commercially reasonable terms to be
        determined in good faith, a worldwide, royalty-bearing (with the royalty
        to be negotiated), non-exclusive, sub-licensable right and license to
        practice FCE Background Information and FCE Background Patents for
        Generation 2 Technology in any application outside of Carbon Capture
        Applications and Hydrogen Applications.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer

This is a sentence-transformers model trained on 16,129 sentence pairs with MultipleNegativesRankingLoss. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
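
Cosine similarity, the similarity function listed above, compares two embeddings by the angle between them and ignores their lengths. The following is a minimal NumPy sketch; the helper name `cosine_similarity` and the toy 4-dimensional vectors are illustrative assumptions standing in for real 1024-dimensional embeddings.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional vectors standing in for 1024-dimensional embeddings.
u = np.array([1.0, 2.0, 0.0, 1.0])
v = np.array([2.0, 4.0, 0.0, 2.0])   # same direction as u, twice the length
w = np.array([0.0, 0.0, 3.0, 0.0])   # orthogonal to u

print(cosine_similarity(u, v))  # close to 1.0: direction matches, length is ignored
print(cosine_similarity(u, w))  # 0.0: no shared direction
```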

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
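
In the Pooling module above, only `pooling_mode_mean_tokens` is enabled, so a sentence embedding is the average of its token embeddings. A rough NumPy sketch of masked mean pooling follows; the function name `mean_pool` and the toy shapes are assumptions for illustration, not the library's own code.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim) float array
    attention_mask:   (batch, seq_len) array, 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts

# Toy batch: 2 sequences, 3 token positions, 2 dimensions (the model uses 1024).
tokens = np.array([
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],   # third position is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
])
mask = np.array([[1, 1, 0],
                 [1, 1, 1]])

pooled = mean_pool(tokens, mask)
print(pooled)  # rows: [2., 3.] and [4., 2.]
```

The padded position in the first sequence does not affect the result because its mask entry is 0.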

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (replace the placeholder with this model's Hub repository ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'License Grant',
    "In the event FCE notifies ExxonMobil that it has formally decided not to pursue Generation 2 Technology for Power Applications, then upon ExxonMobil's written request, FCE agrees to negotiate a grant to ExxonMobil and its Affiliates, under commercially reasonable terms to be determined in good faith, a worldwide, royalty-bearing (with the royalty to be negotiated), non-exclusive, sub-licensable right and license to practice FCE Background Information and FCE Background Patents for Generation 2 Technology in any application outside of Carbon Capture Applications and Hydrogen Applications.",
    'Aucta should continue to receive 15% of Net Sales Royalty for as long as ETON is selling the Product(s) in the Territory, unless otherwise agreed to under this Agreement.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7920, 0.3253],
#         [0.7920, 1.0000, 0.4614],
#         [0.3253, 0.4614, 1.0000]])
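
Similarity scores like these can drive a simple retrieval step: embed a clause label as the query, embed candidate clauses, and sort the candidates by cosine similarity. The sketch below uses NumPy only; `rank_by_similarity` is a hypothetical helper, and the toy vectors stand in for the output of `model.encode(...)`.

```python
import numpy as np

def rank_by_similarity(query_vec: np.ndarray, candidate_vecs: np.ndarray) -> list[tuple[int, float]]:
    """Return (candidate index, cosine similarity) pairs, best match first."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)
    return [(int(i), float(scores[i])) for i in order]

# Toy vectors; in practice these would come from model.encode(...).
query = np.array([1.0, 0.0, 1.0])
candidates = np.array([
    [0.0, 1.0, 0.0],   # unrelated direction
    [2.0, 0.0, 2.0],   # same direction as the query
    [1.0, 1.0, 0.0],   # partially aligned
])

ranking = rank_by_similarity(query, candidates)
print([idx for idx, _ in ranking])  # [1, 2, 0]
```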

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16,129 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | label |
    |---------|------------|------------|-------|
    | type    | string     | string     | float |
    | details | min: 3 tokens, mean: 54.18 tokens, max: 512 tokens | min: 3 tokens, mean: 95.75 tokens, max: 512 tokens | min: 1.0, mean: 1.0, max: 1.0 |
  • Samples:
    sentence_0 sentence_1 label
    Parties STARTEC GLOBAL COMMUNICATIONS CORPORATION 1.0
    The proceeds of the Revolving Loans and the Swingline Loans, and the Letters of Credit, shall be used for general corporate purposes, including, but not limited to, repayment of any Indebtedness and to backstop the issuance of commercial paper. Use the proceeds of the Loans and the Letters of Credit only as contemplated in Section  3.12 . The Borrower will not request any Borrowing, and the Borrower shall not use, and shall procure that its Subsidiaries and its or their respective directors, officers, employees and agents shall not use, the proceeds of any Borrowing (a) in furtherance of an offer, payment, promise to pay, or authorization of the payment or giving of money, or anything else of value, to any Person in violation of any Anti-Corruption Laws in any material respect, (b) for the purpose of funding, financing or facilitating any unauthorized activities, business or transaction of or with any Sanctioned Person, or in any Sanctioned Country, or (c) knowingly in any manner that would result in the violation of any Sanctions Laws applicable to any party hereto. 1.0
    Governing Law state. 1.0
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
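
With these parameters, the loss treats row i of `sentence_1` as the positive for row i of `sentence_0` and every other row in the batch as a negative: cosine similarities are multiplied by the scale (20.0) and passed to a cross-entropy whose target is the diagonal. The NumPy sketch below illustrates that computation under those assumptions; `mnrl_loss` is an illustrative name, not the library's implementation.

```python
import numpy as np

def mnrl_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives ranking loss: cross-entropy over scaled cosine similarities.

    Row i of `anchors` is paired with row i of `positives`; all other rows of
    `positives` act as negatives for anchor i.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                                  # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)         # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))                  # target is the diagonal

# Toy batch of 2 pairs, matching the per-device batch size of 2.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
aligned = np.array([[1.0, 0.1], [0.1, 1.0]])     # each positive close to its anchor
swapped = aligned[::-1]                          # positives paired with the wrong anchor

loss_aligned = mnrl_loss(anchors, aligned)
loss_swapped = mnrl_loss(anchors, swapped)
print(loss_aligned)   # near zero
print(loss_swapped)   # much larger
```

Because the negatives come from the same batch, a per-device batch size of 2 on a single device would give each anchor only one negative per step.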
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • num_train_epochs: 1
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
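
The settings `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` describe a learning rate that starts at 5e-05 and decays linearly to zero over training. A small sketch of that schedule follows; `linear_lr` is a hypothetical helper, and the total step count assumes a single device and one epoch over 16,129 samples at batch size 2.

```python
import math

def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Assumption: one device, so steps = ceil(16129 samples / batch size 2) for one epoch.
total_steps = math.ceil(16129 / 2)

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```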

Training Logs

| Epoch  | Step | Training Loss |
|--------|------|---------------|
| 0.0620 | 500  | 0.62          |
| 0.1240 | 1000 | 0.3153        |
| 0.1860 | 1500 | 0.2382        |
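
The Epoch values are consistent with step divided by steps per epoch for 16,129 samples at a batch size of 2. The quick check below assumes a single device; `gradient_accumulation_steps` is 1 and `dataloader_drop_last` is False, so the last partial batch counts as a step.

```python
import math

dataset_size = 16129   # training samples
batch_size = 2         # per_device_train_batch_size; single device is an assumption
steps_per_epoch = math.ceil(dataset_size / batch_size)  # last partial batch is kept

epoch_fractions = {step: round(step / steps_per_epoch, 4) for step in (500, 1000, 1500)}
print(steps_per_epoch)    # 8065
print(epoch_fractions)    # {500: 0.062, 1000: 0.124, 1500: 0.186}
```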

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.4
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}