SentenceTransformer based on Snowflake/snowflake-arctic-embed-s

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-s. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-s
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
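
The pooling and normalization stages listed above can be illustrated with a small, dependency-free sketch. It assumes the Transformer stage has already produced one vector per token, with the [CLS] token first; the toy vectors have 4 dimensions rather than 384, and the helper names `cls_pool` and `l2_normalize` are made up for illustration and are not part of the library.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit length, as the Normalize() stage does."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Hypothetical token vectors for a 3-token input (4 dims instead of 384).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output has unit length, the cosine similarity of two embeddings equals their dot product.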

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mbudisic/snowflake-arctic-embed-s-ft-pstuts")
# Run inference
sentences = [
    'How I use Brush tool and zoom in same time, I dont wanna keep switchin tools?',
    # Adjacent string literals below are concatenated into a single passage.
    "Zooming and panning are ways to navigate around an image that you'll use often as you work on images in Photoshop CC. To practice working with the zoom and pan controls, open this image from the tutorial practice files, or open a large image of your own. Zooming means changing the magnification of the image, as you might do if you were looking at the sky through a telescope. You may want to zoom in for a closer view of part of an image, or you may want to zoom out to see more of an image on your screen. The most straightforward way to zoom is to select the Zoom tool, toward the bottom of the Tools panel here. Then go up to the Options bar for the Zoom tool, where you'll find a plus icon for zooming in, and a minus icon for zooming out. Let's start with the plus icon activated which is the default. Then to zoom in, move into the image and click. And each time you click, you'll zoom in a little further. To zoom back out to see more of the image again, go back to the Options bar, and this time select the minus icon, and then click several times in the image to zoom back out. If you want to zoom in again, you have to go back to the Options bar, click the plus icon, and click in the image to zoom in again. Now you may get tired of going up to the Options bar every time you want to switch between zooming in and zooming out. So, here's a shortcut that will help you. When the zoom in option is active, as it is now, you can switch to zooming out by holding the Option key on your keyboard if you're on a Mac, or the ALT key on Windows. Hold down that key and then click in the image. And that will automatically switch you back to zooming out. Then release your finger from the Option or ALT key, and you're switched back to zooming in. And so, you can click in the image to zoom in again. The Zoom tool has a couple of options in its Options bar, that you can use to move quickly to zoom levels that you use often. "
    "The Fit Screen option, here in the Options bar, comes in handy when you're zoomed in like this and you want to get back to a view of the entire image. Just click the Fit Screen option, and the entire image fits itself into your document window. Another useful option is this 100% option. Clicking this, zooms you into 100% view of the image, which is the best way to view an image when you're checking it for sharpness. Now, I'm working on a small screen and this image is pretty large, so when I zoom in to 100%, I can't see the whole image on my screen. Although you may not experience the same thing if you're working on a large monitor. So, if I want to see a different part of this image at this zoom level, I'm going to need to move the image around in my document window. That's called panning. And it's done with another tool, the Hand tool. So, I'm going to go back to the Tools panel, and I'm going to select the Hand tool there, which is just above the Zoom tool. Then I'll move into the image, and notice that my cursor is now changed to a hand icon. I'll click, drag, and move the image in the document window, to a place that I want to see, and then I'll release my mouse. When I'm done checking the sharpness here and I want to go back to view the entire image on screen, I'll go up to the Options bar for the Hand tool, and there I'll see the same Fit Screen option that we had for the Zoom tool. So, I can just click Fit Screen in the Hand tool Options bar, and that takes me back to see the entire image in my document window. Let me show you another way to zoom. Instead of clicking, you can do continuous zoom by holding your mouse down on the image. I'll go back and get the Zoom tool in the Tools panel. And then I'm going to click and hold in the image. And the image zooms in continuously. If you zoom in really far like this, you can see the pixels, that are the building blocks of an image in Photoshop CC. "
    "By the way, the size of these pixels can affect the image quality of a print, which is why image resolution is an important topic, especially for printing. Something we'll talk more about when we cover resizing an image later in this series. I'm going to go up to the Options bar and click Fit Screen, so I can see the entire image on my screen again. One more thing, let's say that you're working with another tool, maybe the Brush tool, and you're painting in a small area and you don't want to switch out of the Brush tool over to the Zoom tool just to zoom. Well, there's a shortcut that you can use instead of the Zoom tool. And that is to hold the Command key on a Mac, or the Ctrl key on a PC, as you press the plus key on your keyboard. And every time you do that, that will zoom you in. If you want to zoom back out, hold the Command key on a Mac or the Ctrl key on a PC, and press the minus key on your keyboard. And that will zoom you back out. So, that's an introduction to zooming and panning, that I hope will help you to navigate your images as you're working on them in Photoshop CC. To finish up with this lesson, you can",
    '<2-hop>\n\nbit and make that bird look like it was in the original image. So as you can see, refining images is easier than ever using the Content Aware Fill, the Patch Tool, and the Content Aware Move Tool in Photoshop CC.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
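
In the example above, the first sentence is a user question and the other two are tutorial passages, so the first row of the similarity matrix can be read as retrieval scores. The following dependency-free sketch shows how such a row could be turned into a ranking using cosine similarity, the similarity function listed for this model. The 2-dimensional vectors and the helper names `cosine_similarity` and `rank_passages` are illustrative assumptions, not output of the model or part of the library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for real 384-dimensional embeddings.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

ranking = rank_passages(query, passages)
print(ranking)  # [2, 1, 0]
```

With real embeddings, the same ranking step would be applied to the query row of `similarities`.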

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.65
cosine_accuracy@3 0.75
cosine_accuracy@5 0.8
cosine_accuracy@10 1.0
cosine_precision@1 0.65
cosine_precision@3 0.3833
cosine_precision@5 0.25
cosine_precision@10 0.15
cosine_recall@1 0.45
cosine_recall@3 0.725
cosine_recall@5 0.775
cosine_recall@10 1.0
cosine_ndcg@10 0.7904
cosine_mrr@10 0.7435
cosine_map@100 0.711
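
As a reading aid for the table above, here is a minimal sketch of how these metrics are commonly defined for a single query with binary relevance; the reported values would be averages over all evaluation queries. The function name `ir_metrics` and the document IDs are hypothetical, and the evaluator that produced the table may differ in details such as tie handling.

```python
import math


def ir_metrics(ranked_ids, relevant_ids, k):
    """Accuracy, precision, recall, MRR and NDCG at cutoff k for one query."""
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in ranked_ids[:k]]

    # Reciprocal rank of the first relevant document within the top k.
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / rank
            break

    # Discounted cumulative gain against the best achievable ordering.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "accuracy": 1.0 if any(hits) else 0.0,
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "mrr": reciprocal_rank,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }


# Hypothetical query with two relevant passages, one of them in the top 3.
metrics = ir_metrics(["d3", "d1", "d7", "d2"], {"d1", "d2"}, k=3)
print(metrics)
```

Under these definitions, a recall@10 of 1.0 together with a precision@10 of 0.15 would correspond to an average of 1.5 relevant passages per query, which also explains why precision@10 cannot approach 1.0 on this evaluation set.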

Training Details

Training Dataset

Unnamed Dataset

  • Size: 90 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 90 samples:
    • sentence_0 (string): min 5 tokens, mean 34.76 tokens, max 56 tokens
    • sentence_1 (string): min 49 tokens, mean 374.26 tokens, max 512 tokens
    • label (float): min 0.5, mean 0.83, max 1.0
  • Samples:
    • sentence_0: How can a beginner Photoshop user utilize the Lasso Tool to remove distracting elements from an image?
      sentence_1: >> Removing distracting elements is easier than ever with the 2014 release of Photoshop CC. Not only has the Content Aware technology improved in terms of quality, but the speed has also been improved significantly. Let's go ahead take a look at a few examples. Now in this image of the cactus, I'm going to select the Lasso Tool and I simply want to get rid of this cactus here in the background. The easiest way to do this would be to select Edit and then Fill. I'll use Content Aware, but when I click OK, you'll notice there's a little bit of a seam here where the colors aren't blended very well. And that's because, in the past, the Content Aware technology really focused on the texture in the image. So let's undo that Command + Z or Ctrl + Z. And this time when I select Edit, Fill, I'm going to use the new Color Adaptation option. This time when I click OK, you can see that Photoshop does a much better job in blending those colors. Now let's go ahead and try to remove the other two cact...
      label: 1.0
    • sentence_0: How can a beginner use the Perspective Warp feature in Adobe Photoshop to manipulate perspective, such as straightening buildings and changing the viewpoint, and what steps and controls are involved in this process according to the provided guide?
      sentence_1: <1-hop> >> What I want to show you in this video is something that is absolutely amazing. It's a brand new feature in Adobe Photoshop Creative Cloud called Perspective Warp. Now I have a photograph open. I didn't take this photo. It was taken by a company called PhotoSpin. And don't forget if you want to follow along, you can download the assets for this video. What I want to do first though is make a copy of it. I'm going to drag it down- this is one way to do it-make a copy. That is not necessary, but this way we get to see kind of a before and an after. Now it will work with just about any image, but your first test is to go up to the word Edit on the pull-down menu and go down, and you better see Perspective Warp. If you don't, no big deal. Just go out to the cloud, and download the latest version of Photoshop. Now what does it do? What does Perspective Warp do? It literally allows me to re-enter a three-dimensional world to change the perspective of the image as if, as the photog...
      label: 1.0
    • sentence_0: Wht is the funtion of Ctrl + Z in Photoshp when you make a mistake while using Content Aware or Patch Tool, and how can beginners use it to undo their last action step by step?
      sentence_1: >> Removing distracting elements is easier than ever with the 2014 release of Photoshop CC. Not only has the Content Aware technology improved in terms of quality, but the speed has also been improved significantly. Let's go ahead take a look at a few examples. Now in this image of the cactus, I'm going to select the Lasso Tool and I simply want to get rid of this cactus here in the background. The easiest way to do this would be to select Edit and then Fill. I'll use Content Aware, but when I click OK, you'll notice there's a little bit of a seam here where the colors aren't blended very well. And that's because, in the past, the Content Aware technology really focused on the texture in the image. So let's undo that Command + Z or Ctrl + Z. And this time when I select Edit, Fill, I'm going to use the new Color Adaptation option. This time when I click OK, you can see that Photoshop does a much better job in blending those colors. Now let's go ahead and try to remove the other two cact...
      label: 1.0
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            384,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
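
To make the loss configuration above more concrete, the following is a simplified, dependency-free sketch of the idea: an in-batch ranking loss is computed on embeddings truncated to each Matryoshka dimension, and the per-dimension losses are combined with the listed weights. It is not the library implementation. The function names, the toy 4-dimensional vectors, and the similarity scale of 20.0 are assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def ranking_loss(anchors, positives, scale=20.0):
    """In-batch softmax cross-entropy: positive i is the target for anchor i,
    and every other positive in the batch serves as a negative."""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * ranking_loss(
            [v[:dim] for v in anchors], [v[:dim] for v in positives]
        )
    return total


# Toy batch of two (question, passage) pairs; dims [4, 2] stand in for
# [384, 256, 128, 64].
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

Training this way is intended to keep the leading dimensions of each embedding useful on their own, so that vectors can be shortened to 256, 128, or 64 dimensions with a smaller quality loss than plain truncation would cause. In Sentence Transformers, the `truncate_dim` argument of `SentenceTransformer` can be used to request such shortened embeddings at inference time.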
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • num_train_epochs: 50
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 50
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
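
The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` describes a learning rate that decays linearly from its initial value to zero over the run. A small sketch of that schedule follows; it assumes 600 total optimizer steps, matching the last row of the Training Logs table, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=5e-05, total_steps=600, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 150, 300, 450, 600):
    print(step, linear_lr(step))
```

Under these assumptions the rate is half its initial value at step 300 (epoch 25) and reaches zero at step 600.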

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.8333 10 - 0.7208
1.0 12 - 0.7208
1.6667 20 - 0.7311
2.0 24 - 0.7399
2.5 30 - 0.7447
3.0 36 - 0.7322
3.3333 40 - 0.7450
4.0 48 - 0.7416
4.1667 50 - 0.7262
5.0 60 - 0.7581
5.8333 70 - 0.7359
6.0 72 - 0.7455
6.6667 80 - 0.6918
7.0 84 - 0.6784
7.5 90 - 0.7130
8.0 96 - 0.7744
8.3333 100 - 0.7709
9.0 108 - 0.7617
9.1667 110 - 0.7657
10.0 120 - 0.7382
10.8333 130 - 0.7143
11.0 132 - 0.7143
11.6667 140 - 0.7433
12.0 144 - 0.7433
12.5 150 - 0.7433
13.0 156 - 0.7497
13.3333 160 - 0.7680
14.0 168 - 0.7270
14.1667 170 - 0.7276
15.0 180 - 0.7402
15.8333 190 - 0.7212
16.0 192 - 0.7212
16.6667 200 - 0.7296
17.0 204 - 0.6978
17.5 210 - 0.7193
18.0 216 - 0.7271
18.3333 220 - 0.7206
19.0 228 - 0.7306
19.1667 230 - 0.7306
20.0 240 - 0.7459
20.8333 250 - 0.7515
21.0 252 - 0.7494
21.6667 260 - 0.7833
22.0 264 - 0.7793
22.5 270 - 0.8032
23.0 276 - 0.7782
23.3333 280 - 0.7782
24.0 288 - 0.7782
24.1667 290 - 0.7668
25.0 300 - 0.7782
25.8333 310 - 0.7795
26.0 312 - 0.7795
26.6667 320 - 0.7820
27.0 324 - 0.7820
27.5 330 - 0.7841
28.0 336 - 0.7915
28.3333 340 - 0.7860
29.0 348 - 0.7832
29.1667 350 - 0.7897
30.0 360 - 0.7984
30.8333 370 - 0.7854
31.0 372 - 0.7839
31.6667 380 - 0.7709
32.0 384 - 0.7688
32.5 390 - 0.7681
33.0 396 - 0.7673
33.3333 400 - 0.7453
34.0 408 - 0.7638
34.1667 410 - 0.7638
35.0 420 - 0.7751
35.8333 430 - 0.7617
36.0 432 - 0.7617
36.6667 440 - 0.7652
37.0 444 - 0.7643
37.5 450 - 0.7658
38.0 456 - 0.7708
38.3333 460 - 0.7893
39.0 468 - 0.7706
39.1667 470 - 0.7706
40.0 480 - 0.7706
40.8333 490 - 0.7654
41.0 492 - 0.7654
41.6667 500 3.9103 0.7654
42.0 504 - 0.7654
42.5 510 - 0.7720
43.0 516 - 0.7904
43.3333 520 - 0.7904
44.0 528 - 0.7904
44.1667 530 - 0.7904
45.0 540 - 0.7904
45.8333 550 - 0.7904
46.0 552 - 0.7904
46.6667 560 - 0.7904
47.0 564 - 0.7904
47.5 570 - 0.7904
48.0 576 - 0.7904
48.3333 580 - 0.7904
49.0 588 - 0.7904
49.1667 590 - 0.7904
50.0 600 - 0.7904
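
The Epoch and Step columns above are related by simple arithmetic, sketched below from values stated elsewhere in this card: 90 training samples, a per-device batch size of 8, and 50 epochs. The sketch assumes training on a single device with no gradient accumulation (`gradient_accumulation_steps: 1`) and keeps the last partial batch (`dataloader_drop_last: False`).

```python
import math

num_samples = 90
batch_size = 8
num_epochs = 50

# The final, smaller batch is kept, so the division is rounded up.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch for a given optimizer step, rounded as in the log."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps)  # 12 600
print(epoch_at(10), epoch_at(500))   # 0.8333 41.6667
```

This reproduces the table's pattern of one evaluation every 10 steps plus one at the end of each 12-step epoch.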

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu126
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}