language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:882
  - loss:MatryoshkaLoss
  - loss:TripletLoss
base_model: BAAI/bge-base-en-v1.5
widget:
  - source_sentence: >-
      dataset = Dataset(
          name="dataset_with_metadata",
          settings=Settings(
              fields=[TextField(name="text")],
              questions=[LabelQuestion(name="label", labels=["positive", "negative"])],
              vectors=[
                  VectorField(name="vector_name"),
              ],
          ),
      )

      dataset.create()

      ```


      Then, you can add records to the dataset with vectors that correspond to
      the vector field defined in the dataset settings:
    sentences:
      - >-
        The beautiful scenery of the Italian countryside inspired the artist to
        create a stunning watercolor painting of the Argilla region.
      - >-
        Can a RankingQuestion be used to determine the most relevant features
        for a machine learning model in Argilla?
      - >-
        How can I go about creating a custom vector field in Argilla to store
        vector data?
  - source_sentence: >-
      description: In this section, we will provide a step-by-step guide to show
      how to manage datasets and configure dataset settings.


      Dataset management


      This guide provides an overview of datasets, explaining the basics of how
      to set them up and manage them in Argilla.
    sentences:
      - >-
        The new restaurant in town offers a wide variety of dishes, but the menu
        settings need to be reconfigured for a better dining experience.
      - >-
        Can Argilla provide guidance on setting up dataset parameters to ensure
        efficient data preparation and annotation for machine learning models?
      - >-
        How can Argilla help in creating a vector field by initializing a
        VectorField object with a specific name and dimensionality?
  - source_sentence: >-
      Delete a user


      You can delete an existing user from Argilla by calling the delete method
      on the User class.


      ```python

      import argilla_sdk as rg


      client = rg.Argilla(api_url="", api_key="")


      user_to_delete = client.users('my_username')


      deleted_user = user_to_delete.delete()

      ```
    sentences:
      - The beach was empty, so I decided to delete my footprint from the sand.
      - >-
        How can I traverse the Argilla dataset to pull out specific information,
        like response values and user IDs, from the records and responses?
      - >-
        How do I go about deleting a user from Argilla using the delete method
        provided by the User class?
  - source_sentence: >-
      Iterating over records with suggestions


      Just like responses, suggestions can be accessed from a Record via their
      question name as an attribute of the record. So if a question is named
      label, the suggestion can be accessed as record.label. The following
      example demonstrates how to access suggestions from a record object:


      python

      for record in dataset.records(with_suggestions=True):
          print(record.suggestions.label)

      Class Reference
    sentences:
      - >-
        Can we iterate over records with suggestions to access their question
        names as attributes?
      - >-
        The new cafe in town is iterating over records of their customer
        feedback to improve their coffee flavors.
      - >-
        Is it possible for Argilla to optimize my workflow for monitoring LLM
        pipelines and A/B testing of classification models, specifically for
        spans and text?
  - source_sentence: >-
      Make changes and push them


      Make the changes you want in your local repository, and test that
      everything works and you are following the guidelines. Check the
      documentation for more information about the development.


      Once you have finished, you can check the status of your repository and
      synchronize with the upstreaming repo with the following command:


      ```sh


      Check the status of your repository


      git status


      Synchronize with the upstreaming repo
    sentences:
      - >-
        Are changes required to be made and then uploaded to the Argilla dataset
        repository?
      - >-
        Can I use Argilla's filtering feature to narrow down the search results
        based on specific conditions?
      - >-
        The beautiful scenery of the Italian town Argilla made me want to make
        changes to my travel plans.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: BGE base ArgillaSDK Matryoshka
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.061224489795918366
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.1836734693877551
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.2653061224489796
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.29591836734693877
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.061224489795918366
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.06122448979591836
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.05306122448979592
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.029591836734693875
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.061224489795918366
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.1836734693877551
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.2653061224489796
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.29591836734693877
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.17877743107362734
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.1409013605442177
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.15403248573620684
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.07142857142857142
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.1836734693877551
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.25510204081632654
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.30612244897959184
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07142857142857142
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.06122448979591836
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.051020408163265314
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03061224489795919
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07142857142857142
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.1836734693877551
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.25510204081632654
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.30612244897959184
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.17888633702518553
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.13887269193391644
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.14994011261768814
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.04081632653061224
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.20408163265306123
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.25510204081632654
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.29591836734693877
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.04081632653061224
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.06802721088435373
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.05102040816326531
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.029591836734693875
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.04081632653061224
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.20408163265306123
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.25510204081632654
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.29591836734693877
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.16359512996570535
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.12111273080660837
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.13486861555858168
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.030612244897959183
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.19387755102040816
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.24489795918367346
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.37755102040816324
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.030612244897959183
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.06462585034013606
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04897959183673469
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03775510204081633
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.030612244897959183
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.19387755102040816
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.24489795918367346
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.37755102040816324
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.18446318338912313
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.12522270813087136
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.13299376323790885
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.02040816326530612
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.08163265306122448
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21428571428571427
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.2755102040816326
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.02040816326530612
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.027210884353741496
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04285714285714286
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.027551020408163266
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.02040816326530612
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.08163265306122448
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21428571428571427
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.2755102040816326
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.13201826308152928
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.08740686750890833
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.10010997553556797
            name: Cosine Map@100

BGE base ArgillaSDK Matryoshka

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
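
The three modules listed above form a pipeline: the BERT encoder produces one vector per token, the Pooling module (with pooling_mode_cls_token set to True) keeps the vector at the first token position, and Normalize rescales it to unit L2 length. The sketch below illustrates the last two steps in plain Python on a made-up 3-token, 4-dimensional encoder output. The function names and numbers are illustrative assumptions, not library code.

```python
import math


def cls_pool(token_embeddings):
    """Return the vector of the first token, i.e. the [CLS] position."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy stand-in for the encoder output: 3 tokens, 4 dimensions each.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because every output vector has unit length, the cosine similarity of two embeddings reduces to their dot product.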

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sud-962081/bge-base-argilla-sdk-matryoshka")
# Run inference
sentences = [
    'Make changes and push them\n\nMake the changes you want in your local repository, and test that everything works and you are following the guidelines. Check the documentation for more information about the development.\n\nOnce you have finished, you can check the status of your repository and synchronize with the upstreaming repo with the following command:\n\n```sh\n\nCheck the status of your repository\n\ngit status\n\nSynchronize with the upstreaming repo',
    'Are changes required to be made and then uploaded to the Argilla dataset repository?',
    'The beautiful scenery of the Italian town Argilla made me want to make changes to my travel plans.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
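
Since the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64 (see Training Details), the leading components of an embedding can serve as a smaller embedding. The sketch below shows the idea on toy 4-dimensional vectors in plain Python: keep the first dim components, rescale to unit length, and compare with cosine similarity. The helper names and the toy vectors are illustrative assumptions, not part of the model or the library.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "full" embeddings; a real embedding from this model has 768 values.
full_a = [3.0, 4.0, 1.0, 2.0]
full_b = [4.0, 3.0, -2.0, 1.0]

short_a = truncate_and_normalize(full_a, 2)
short_b = truncate_and_normalize(full_b, 2)

print("full similarity:     ", cosine_similarity(full_a, full_b))
print("truncated similarity:", cosine_similarity(short_a, short_b))
```

With real embeddings the same slicing would apply to each row returned by model.encode, for example keeping the first 256 of the 768 values. Recent Sentence Transformers releases also accept a truncate_dim argument on the SentenceTransformer constructor for this purpose; check the documentation of the installed version before relying on it.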

Evaluation

Metrics

Information Retrieval

Metric               dim_768  dim_512  dim_256  dim_128  dim_64
cosine_accuracy@1    0.0612   0.0714   0.0408   0.0306   0.0204
cosine_accuracy@3    0.1837   0.1837   0.2041   0.1939   0.0816
cosine_accuracy@5    0.2653   0.2551   0.2551   0.2449   0.2143
cosine_accuracy@10   0.2959   0.3061   0.2959   0.3776   0.2755
cosine_precision@1   0.0612   0.0714   0.0408   0.0306   0.0204
cosine_precision@3   0.0612   0.0612   0.068    0.0646   0.0272
cosine_precision@5   0.0531   0.051    0.051    0.049    0.0429
cosine_precision@10  0.0296   0.0306   0.0296   0.0378   0.0276
cosine_recall@1      0.0612   0.0714   0.0408   0.0306   0.0204
cosine_recall@3      0.1837   0.1837   0.2041   0.1939   0.0816
cosine_recall@5      0.2653   0.2551   0.2551   0.2449   0.2143
cosine_recall@10     0.2959   0.3061   0.2959   0.3776   0.2755
cosine_ndcg@10       0.1788   0.1789   0.1636   0.1845   0.132
cosine_mrr@10        0.1409   0.1389   0.1211   0.1252   0.0874
cosine_map@100       0.154    0.1499   0.1349   0.133    0.1001
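
In the table above, accuracy@k equals recall@k in every column, and precision@k equals accuracy@k divided by k (for example 0.1837 / 3 ≈ 0.0612 for dim_768). That is the pattern expected when each query has exactly one relevant document. The sketch below computes these hit-based metrics under that assumption for a toy set of three queries. The function names and document ids are hypothetical, and this is not the evaluator that produced the reported numbers.

```python
def metrics_at_k(ranked_ids, relevant_id, k):
    """Hit-based metrics for one query that has a single relevant document."""
    top_k = ranked_ids[:k]
    hit = 1.0 if relevant_id in top_k else 0.0
    reciprocal_rank = 1.0 / (top_k.index(relevant_id) + 1) if hit else 0.0
    return {
        "accuracy": hit,        # was the relevant document retrieved at all?
        "precision": hit / k,   # at most one of the k results can be relevant
        "recall": hit,          # one relevant document, so recall is 0 or 1
        "mrr": reciprocal_rank,
    }


def average_metrics(queries, k):
    """Average the per-query metrics over (ranked_ids, relevant_id) pairs."""
    totals = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0, "mrr": 0.0}
    for ranked_ids, relevant_id in queries:
        for name, value in metrics_at_k(ranked_ids, relevant_id, k).items():
            totals[name] += value
    return {name: total / len(queries) for name, total in totals.items()}


# Toy retrieval results: ranked document ids and the single relevant id.
queries = [
    (["d3", "d1", "d7"], "d1"),  # relevant document at rank 2
    (["d2", "d5", "d9"], "d2"),  # relevant document at rank 1
    (["d4", "d6", "d8"], "d0"),  # relevant document not retrieved
]

result = average_metrics(queries, k=3)
print(result)
```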

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 882 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 882 samples:
    statistic  anchor        positive      negative
    type       string        string        string
    min        6 tokens      8 tokens      10 tokens
    mean       91.86 tokens  25.62 tokens  22.11 tokens
    max        198 tokens    91 tokens     61 tokens
  • Samples:
    Sample 1
    anchor:
        workspace = client.workspaces("my_workspace")

        Retrieve the dataset from the first workspace

        retrieved_dataset = client.datasets(name="my_dataset")

        Retrieve the dataset from the specified workspace

        retrieved_dataset = client.datasets(name="my_dataset", workspace=workspace)

        Check dataset existence

        You can check if a dataset exists by calling the exists method on the Dataset class. This method returns a boolean value.

        python
        import argilla_sdk as rg
    positive: Is there a way to download a dataset from a specific workspace using the Argilla client for my data annotation task?
    negative: The new coffee shop in town offers a variety of workspace options for remote workers.

    Sample 2
    anchor:
        === "As Record objects"
        You can also add suggestions to a record in an initialized `Record` object.

        === "From a generic data structure"
        You can add suggestions as a dictionary, where the keys correspond to the names of the labels that were configured for your dataset. Remember that you can also use the mapping parameter to specify the data structure.
    positive: Is it possible to associate multiple suggestions with a single record object in Argilla?
    negative: I love adding suggestions to my garden to make it look more beautiful.

    Sample 3
    anchor:
        hide: footer

        rg.Argilla

        To interact with the Argilla server from python you can use the Argilla class. The Argilla client is used to create, get, update, and delete all Argilla resources, such as workspaces, users, datasets, and records.

        Usage Examples

        Connecting to an Argilla server

        To connect to an Argilla server, instantiate the Argilla class and pass the api_url of the server and the api_key to authenticate.

        ```python
        import argilla_sdk as rg
    positive: Does the Argilla class provide a convenient way to handle dataset and record administration tasks on the Argilla server?
    negative: The tourists got lost in the Argilla desert because they forgot to bring a map.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "TripletLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
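
The parameters above say that a TripletLoss is evaluated once per Matryoshka dimension and the results are combined with the listed weights. The following plain-Python sketch illustrates that combination on toy 4-dimensional vectors with dims [4, 2]. It assumes a Euclidean triplet loss, re-normalization after truncation, and a margin of 2.0; the card does not state the distance metric or the margin, so these are illustrative assumptions, as are all function names.

```python
import math


def shrink(vector, dim):
    """Truncate to the first `dim` components and rescale to unit length."""
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin):
    """Zero once the positive is closer than the negative by at least `margin`."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def matryoshka_triplet_loss(anchor, positive, negative, dims, weights, margin):
    """Weighted sum of triplet losses computed on truncated embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        a, p, n = (shrink(v, dim) for v in (anchor, positive, negative))
        total += weight * triplet_loss(a, p, n, margin)
    return total


# Toy embeddings; the model itself uses dims [768, 512, 256, 128, 64].
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [1.0, 0.0, 0.0, 0.0]
negative = [0.0, 1.0, 0.0, 0.0]

loss = matryoshka_triplet_loss(
    anchor, positive, negative, dims=[4, 2], weights=[1, 1], margin=2.0
)
print(loss)
```

With all weights equal to 1, as in the parameters above, every dimension contributes equally to the total.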
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_eval_batch_size: 4
  • gradient_accumulation_steps: 4
  • learning_rate: 2e-05
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
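
The settings above combine a cosine scheduler with a warmup_ratio of 0.1 and a peak learning_rate of 2e-05. As a rough illustration of the resulting shape, the sketch below uses linear warmup followed by a half-cosine decay over 81 optimizer steps, which is the last step shown in the Training Logs table. The function name lr_at_step and the rounding of the warmup length are assumptions of this sketch, not a copy of the trainer's scheduler.

```python
import math


def lr_at_step(step, base_lr, total_steps, warmup_ratio):
    """Linear warmup to `base_lr`, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


base_lr = 2e-05
total_steps = 81      # taken from the last row of the Training Logs table
warmup_ratio = 0.1

schedule = [lr_at_step(s, base_lr, total_steps, warmup_ratio) for s in range(total_steps + 1)]

for step in (0, 5, 9, 45, 81):
    print(step, schedule[step])
```

Under these assumptions the warmup covers the first 9 steps, after which the rate decays smoothly towards zero at step 81.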

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  dim_768_cosine_ndcg@10  dim_512_cosine_ndcg@10  dim_256_cosine_ndcg@10  dim_128_cosine_ndcg@10  dim_64_cosine_ndcg@10
0       0     -              0.3815                  0.3810                  0.3717                  0.3897                  0.3153
0.1802  5     23.2127        -                       -                       -                       -                       -
0.3604  10    22.567         -                       -                       -                       -                       -
0.5405  15    21.0403        -                       -                       -                       -                       -
0.7207  20    19.6983        -                       -                       -                       -                       -
0.9009  25    18.4465        -                       -                       -                       -                       -
0.973   27    -              0.2707                  0.2832                  0.2721                  0.2576                  0.238
1.1081  30    19.4241        -                       -                       -                       -                       -
1.2883  35    17.3167        -                       -                       -                       -                       -
1.4685  40    17.0334        -                       -                       -                       -                       -
1.6486  45    16.9455        -                       -                       -                       -                       -
1.8288  50    16.8353        -                       -                       -                       -                       -
1.9730  54    -              0.1507                  0.1536                  0.1595                  0.1604                  0.1532
2.0360  55    18.4414        -                       -                       -                       -                       -
2.2162  60    16.7065        -                       -                       -                       -                       -
2.3964  65    16.6709        -                       -                       -                       -                       -
2.5766  70    16.6449        -                       -                       -                       -                       -
2.7568  75    16.6349        -                       -                       -                       -                       -
2.9369  80    16.633         -                       -                       -                       -                       -
2.9730  81    -              0.1788                  0.1789                  0.1636                  0.1845                  0.1320
  • The bold row denotes the saved checkpoint.
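
The Epoch and Step columns can be related to the hyperparameters listed earlier: 882 training samples, per_device_train_batch_size 8, gradient_accumulation_steps 4 and num_train_epochs 3. The arithmetic below is a reconstruction that reproduces the logged values under the assumption of single-device training. It is not the trainer's own bookkeeping, and the variable and function names are illustrative.

```python
import math

train_samples = 882                 # "Size: 882 training samples"
per_device_train_batch_size = 8
gradient_accumulation_steps = 4
num_train_epochs = 3
num_devices = 1                     # assumption: single-device training

# Mini-batches per epoch, counting the final partial batch.
batches_per_epoch = math.ceil(train_samples / (per_device_train_batch_size * num_devices))
# Whole optimizer updates per epoch after gradient accumulation.
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_updates = updates_per_epoch * num_train_epochs
effective_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)


def logged_epoch(step):
    """Fractional epoch for a 1-based optimizer step, as reconstructed from the log."""
    epoch_index = (step - 1) // updates_per_epoch
    step_in_epoch = step - epoch_index * updates_per_epoch
    return epoch_index + step_in_epoch * gradient_accumulation_steps / batches_per_epoch


print(batches_per_epoch, updates_per_epoch, total_updates, effective_batch_size)
for step in (5, 27, 30, 54, 81):
    print(step, round(logged_epoch(step), 4))
```

Under these assumptions each epoch contains 27 optimizer updates with an effective batch size of 32, which matches the evaluation rows at steps 27, 54 and 81.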

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}