SentenceTransformer based on Snowflake/snowflake-arctic-embed-m

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-m
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 109M parameters (F32 safetensors)

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
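
The three modules above are a plain BERT encoder, CLS-token pooling, and L2 normalization. As a rough sketch of what the pipeline does under the hood, using plain transformers and assuming the checkpoint loads with AutoModel (the SentenceTransformer loader shown below remains the supported path):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "mbudisic/snowflake-arctic-embed-m-ft-pstuts"
tokenizer = AutoTokenizer.from_pretrained(model_id)
bert = AutoModel.from_pretrained(model_id)

# (0) Transformer: tokenize and encode (max_seq_length=512, no lowercasing)
batch = tokenizer(["example sentence"], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**batch).last_hidden_state  # [batch, seq_len, 768]

# (1) Pooling: keep only the CLS token (pooling_mode_cls_token=True)
cls_embedding = token_embeddings[:, 0]

# (2) Normalize: unit L2 norm, so dot product equals cosine similarity
embedding = F.normalize(cls_embedding, p=2, dim=1)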

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mbudisic/snowflake-arctic-embed-m-ft-pstuts")
# Run inference
sentences = [
    'what option key do in photoshop zoom tool',
    "Zooming and panning are ways to navigate around an image that you'll use often as you work on images in Photoshop CC. To practice working with the zoom and pan controls, open this image from the tutorial practice files, or open a large image of your own. Zooming means changing the magnification of the image, as you might do if you were looking at the sky through a telescope. You may want to zoom in for a closer view of part of an image, or you may want to zoom out to see more of an image on your screen. The most straightforward way to zoom is to select the Zoom tool, toward the bottom of the Tools panel here. Then go up to the Options bar for the Zoom tool, where you'll find a plus icon for zooming in, and a minus icon for zooming out. Let's start with the plus icon activated which is the default. Then to zoom in, move into the image and click. And each time you click, you'll zoom in a little further. To zoom back out to see more of the image again, go back to the Options bar, and this time select the minus icon, and then click several times in the image to zoom back out. If you want to zoom in again, you have to go back to the Options bar, click the plus icon, and click in the image to zoom in again. Now you may get tired of going up to the Options bar every time you want to switch between zooming in and zooming out. So, here's a shortcut that will help you. When the zoom in option is active, as it is now, you can switch to zooming out by holding the Option key on your keyboard if you're on a Mac, or the ALT key on Windows. Hold down that key and then click in the image. And that will automatically switch you back to zooming out. Then release your finger from the Option or ALT key, and you're switched back to zooming in. And so, you can click in the image to zoom in again. The Zoom tool has a couple of options in its Options bar, that you can use to move quickly to zoom levels that you use often. The Fit Screen option, here in the Options bar, comes in handy when you're zoomed in like this and you want to get back to a view of the entire image. Just click the Fit Screen option, and the entire image fits itself into your document window. Another useful option is this 100% option. Clicking this, zooms you into 100% view of the image, which is the best way to view an image when you're checking it for sharpness. Now, I'm working on a small screen and this image is pretty large, so when I zoom in to 100%, I can't see the whole image on my screen. Although you may not experience the same thing if you're working on a large monitor. So, if I want to see a different part of this image at this zoom level, I'm going to need to move the image around in my document window. That's called panning. And it's done with another tool, the Hand tool. So, I'm going to go back to the Tools panel, and I'm going to select the Hand tool there, which is just above the Zoom tool. Then I'll move into the image, and notice that my cursor is now changed to a hand icon. I'll click, drag, and move the image in the document window, to a place that I want to see, and then I'll release my mouse. When I'm done checking the sharpness here and I want to go back to view the entire image on screen, I'll go up to the Options bar for the Hand tool, and there I'll see the same Fit Screen option that we had for the Zoom tool. So, I can just click Fit Screen in the Hand tool Options bar, and that takes me back to see the entire image in my document window. Let me show you another way to zoom. 
Instead of clicking, you can do continuous zoom by holding your mouse down on the image. I'll go back and get the Zoom tool in the Tools panel. And then I'm going to click and hold in the image. And the image zooms in continuously. If you zoom in really far like this, you can see the pixels, that are the building blocks of an image in Photoshop CC. By the way, the size of these pixels can affect the image quality of a print, which is why image resolution is an important topic, especially for printing. Something we'll talk more about when we cover resizing an image later in this series. I'm going to go up to the Options bar and click Fit Screen, so I can see the entire image on my screen again. One more thing, let's say that you're working with another tool, maybe the Brush tool, and you're painting in a small area and you don't want to switch out of the Brush tool over to the Zoom tool just to zoom. Well, there's a shortcut that you can use instead of the Zoom tool. And that is to hold the Command key on a Mac, or the Ctrl key on a PC, as you press the plus key on your keyboard. And every time you do that, that will zoom you in. If you want to zoom back out, hold the Command key on a Mac or the Ctrl key on a PC, and press the minus key on your keyboard. And that will zoom you back out. So, that's an introduction to zooming and panning, that I hope will help you to navigate your images as you're working on them in Photoshop CC. To finish up with this lesson, you can",
    '<2-hop>\n\nwasn\'t perfectly vertical, it is now, but the other part is in Perspective, watch this. If I come down here and move this point left or right, it locks the top one in and allows me to literally change the perspective where the photograph was taken. And if you look closely, it\'s not just stretching the stuff out. it\'s working with complex algorithms to decide what it would have looked like if the photographer had changed positions. That\'s why I like this thing so much. It gives you total control over what you\'re doing. Now you can even change it by foreshortening it, whatever you want to do to change what you think this image will be and understand we\'re not talking about a formula where it\'s got to be 27 degrees because of this, that, or the other. We\'re talking about you. You are the photographer. You are the designer. What looks good to you? Now let\'s say you like that. You can set it by clicking right here or you can say, "I wish I\'d never done anything," and click here to get out of it. Well, there you go. If I turn it on and off, you can see the before and the after. Perspective Warp is an amazing tool, brand new to Photoshop CC, literally allowing you to get back into the third dimension and change the position of where the photograph was taken. Well, that\'s about it. This is Andy Anderson saying, "Keep learning, and don\'t forget, guys, to make sure you check out the other videos on our Creative Cloud learning site."',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
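
Because the final Normalize module scales every embedding to unit length, cosine similarity reduces to a dot product, so the scores can be ranked directly. A small retrieval sketch reusing the objects above (the query string is made up for illustration):

# Hypothetical query; `model`, `sentences`, and `embeddings` come from above.
query_embedding = model.encode(["how do I zoom out in Photoshop"])
scores = model.similarity(query_embedding, embeddings)  # shape [1, 3]
best = scores.argmax().item()
print(sentences[best][:80], scores[0, best])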

Evaluation

Metrics

Information Retrieval

Metric                Value
cosine_accuracy@1     0.05
cosine_accuracy@3     0.55
cosine_accuracy@5     0.75
cosine_accuracy@10    1.0
cosine_precision@1    0.05
cosine_precision@3    0.1833
cosine_precision@5    0.2
cosine_precision@10   0.15
cosine_recall@1       0.05
cosine_recall@3       0.325
cosine_recall@5       0.625
cosine_recall@10      1.0
cosine_ndcg@10        0.4975
cosine_mrr@10         0.3171
cosine_map@100        0.3247
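
These numbers follow the standard sentence-transformers information-retrieval evaluation. A sketch of how such metrics are typically produced with InformationRetrievalEvaluator (the query/corpus/relevance mappings here are placeholders, since the card does not ship the evaluation split):

from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Placeholder data: ids -> text, and each query id -> its relevant doc ids.
queries = {"q1": "what option key do in photoshop zoom tool"}
corpus = {"d1": "Zooming and panning are ways to navigate around an image..."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs,
                                          name="pstuts")
results = evaluator(model)  # dict of metrics, e.g. "pstuts_cosine_ndcg@10"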

Training Details

Training Dataset

Unnamed Dataset

  • Size: 90 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 90 samples:
                 sentence_0          sentence_1
    type         string              string
    details      min: 5 tokens       min: 49 tokens
                 mean: 34.76 tokens  mean: 374.26 tokens
                 max: 56 tokens      max: 512 tokens
  • Samples:
    sentence_0: who andy anderson
    sentence_1: wasn't perfectly vertical, it is now, but the other part is in Perspective, watch this. If I come down here and move this point left or right, it locks the top one in and allows me to literally change the perspective where the photograph was taken. And if you look closely, it's not just stretching the stuff out. it's working with complex algorithms to decide what it would have looked like if the photographer had changed positions. That's why I like this thing so much. It gives you total control over what you're doing. Now you can even change it by foreshortening it, whatever you want to do to change what you think this image will be and understand we're not talking about a formula where it's got to be 27 degrees because of this, that, or the other. We're talking about you. You are the photographer. You are the designer. What looks good to you? Now let's say you like that. You can set it by clicking right here or you can say, "I wish I'd never done anything," and click here to get out o...

    sentence_0: wut is Adobee Photoshoop Cretive Cloud?
    sentence_1: >> What I want to show you in this video is something that is absolutely amazing. It's a brand new feature in Adobe Photoshop Creative Cloud called Perspective Warp. Now I have a photograph open. I didn't take this photo. It was taken by a company called PhotoSpin. And don't forget if you want to follow along, you can download the assets for this video. What I want to do first though is make a copy of it. I'm going to drag it down- this is one way to do it-make a copy. That is not necessary, but this way we get to see kind of a before and an after. Now it will work with just about any image, but your first test is to go up to the word Edit on the pull-down menu and go down, and you better see Perspective Warp. If you don't, no big deal. Just go out to the cloud, and download the latest version of Photoshop. Now what does it do? What does Perspective Warp do? It literally allows me to re-enter a three-dimensional world to change the perspective of the image as if, as the photographer, I...

    sentence_0: How can the ALT key be used as a shortcut when zooming with the Zoom tool in Photoshop CC?
    sentence_1: Zooming and panning are ways to navigate around an image that you'll use often as you work on images in Photoshop CC. To practice working with the zoom and pan controls, open this image from the tutorial practice files, or open a large image of your own. Zooming means changing the magnification of the image, as you might do if you were looking at the sky through a telescope. You may want to zoom in for a closer view of part of an image, or you may want to zoom out to see more of an image on your screen. The most straightforward way to zoom is to select the Zoom tool, toward the bottom of the Tools panel here. Then go up to the Options bar for the Zoom tool, where you'll find a plus icon for zooming in, and a minus icon for zooming out. Let's start with the plus icon activated which is the default. Then to zoom in, move into the image and click. And each time you click, you'll zoom in a little further. To zoom back out to see more of the image again, go back to the Options bar, and this...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            384,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
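
In sentence-transformers, this configuration corresponds to wrapping MultipleNegativesRankingLoss in MatryoshkaLoss, so the in-batch contrastive loss is applied to the embedding truncated to each listed dimensionality, all with equal weight. A sketch of the equivalent setup (`model` is the loaded SentenceTransformer):

from sentence_transformers import losses

inner_loss = losses.MultipleNegativesRankingLoss(model)
train_loss = losses.MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[384, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1],
    n_dims_per_step=-1,  # train on every listed dimension at each step
)

A side benefit of Matryoshka training is that the model can be loaded with a truncated output, e.g. SentenceTransformer("mbudisic/snowflake-arctic-embed-m-ft-pstuts", truncate_dim=384), trading vector size against a modest quality loss.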
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 1
  • per_device_eval_batch_size: 1
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
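
A sketch of how these non-default values plug into a SentenceTransformerTrainer run (output_dir is hypothetical; train_dataset, train_loss, and evaluator are the objects assumed in the sections above):

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

args = SentenceTransformerTrainingArguments(
    output_dir="arctic-embed-m-ft-pstuts",  # hypothetical output path
    eval_strategy="steps",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    num_train_epochs=10,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,                  # the SentenceTransformer loaded earlier
    args=args,
    train_dataset=train_dataset,  # the 90-pair dataset described above
    loss=train_loss,              # the MatryoshkaLoss configured above
    evaluator=evaluator,          # the IR evaluator sketched earlier
)
trainer.train()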

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 1
  • per_device_eval_batch_size: 1
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0 0 - 0.4975
0.1111 10 - 0.4975
0.2222 20 - 0.4975
0.3333 30 - 0.4975
0.4444 40 - 0.4975
0.5556 50 - 0.4975
0.6667 60 - 0.4975
0.7778 70 - 0.4975
0.8889 80 - 0.4975
1.0 90 - 0.4975
1.1111 100 - 0.4975
1.2222 110 - 0.4975
1.3333 120 - 0.4975
1.4444 130 - 0.4975
1.5556 140 - 0.4975
1.6667 150 - 0.4975
1.7778 160 - 0.4975
1.8889 170 - 0.4975
2.0 180 - 0.4975
2.1111 190 - 0.4975
2.2222 200 - 0.4975
2.3333 210 - 0.4975
2.4444 220 - 0.4975
2.5556 230 - 0.4975
2.6667 240 - 0.4975
2.7778 250 - 0.4975
2.8889 260 - 0.4975
3.0 270 - 0.4975
3.1111 280 - 0.4975
3.2222 290 - 0.4975
3.3333 300 - 0.4975
3.4444 310 - 0.4975
3.5556 320 - 0.4975
3.6667 330 - 0.4975
3.7778 340 - 0.4975
3.8889 350 - 0.4975
4.0 360 - 0.4975
4.1111 370 - 0.4975
4.2222 380 - 0.4975
4.3333 390 - 0.4975
4.4444 400 - 0.4975
4.5556 410 - 0.4975
4.6667 420 - 0.4975
4.7778 430 - 0.4975
4.8889 440 - 0.4975
5.0 450 - 0.4975
5.1111 460 - 0.4975
5.2222 470 - 0.4975
5.3333 480 - 0.4975
5.4444 490 - 0.4975
5.5556 500 0.0 0.4975
5.6667 510 - 0.4975
5.7778 520 - 0.4975
5.8889 530 - 0.4975
6.0 540 - 0.4975
6.1111 550 - 0.4975
6.2222 560 - 0.4975
6.3333 570 - 0.4975
6.4444 580 - 0.4975
6.5556 590 - 0.4975
6.6667 600 - 0.4975
6.7778 610 - 0.4975
6.8889 620 - 0.4975
7.0 630 - 0.4975
7.1111 640 - 0.4975
7.2222 650 - 0.4975
7.3333 660 - 0.4975
7.4444 670 - 0.4975
7.5556 680 - 0.4975
7.6667 690 - 0.4975
7.7778 700 - 0.4975
7.8889 710 - 0.4975
8.0 720 - 0.4975
8.1111 730 - 0.4975
8.2222 740 - 0.4975
8.3333 750 - 0.4975
8.4444 760 - 0.4975
8.5556 770 - 0.4975
8.6667 780 - 0.4975
8.7778 790 - 0.4975
8.8889 800 - 0.4975
9.0 810 - 0.4975
9.1111 820 - 0.4975
9.2222 830 - 0.4975
9.3333 840 - 0.4975
9.4444 850 - 0.4975
9.5556 860 - 0.4975
9.6667 870 - 0.4975
9.7778 880 - 0.4975
9.8889 890 - 0.4975
10.0 900 - 0.4975

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu126
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}