SentenceTransformer based on Snowflake/snowflake-arctic-embed-s

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-s. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: Snowflake/snowflake-arctic-embed-s
Maximum Sequence Length: 512 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
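The architecture above chains three steps: a BERT encoder, CLS-token pooling, and L2 normalization. The following is a minimal pure-Python sketch of the last two steps on made-up 4-dimensional token vectors (the real model produces 384 dimensions). The function names and the toy values are illustrative assumptions, not part of the library.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Take the first ([CLS]) token vector and scale it to unit L2 norm."""
    cls_vector = token_embeddings[0]  # Pooling with pooling_mode_cls_token=True
    norm = math.sqrt(sum(x * x for x in cls_vector))
    return [x / norm for x in cls_vector]  # Normalize()

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical token embeddings: 3 tokens x 4 dimensions per sentence.
tokens_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0], [1.0, 0.0, 0.0, 0.0]]
tokens_b = [[4.0, 3.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0], [5.0, 5.0, 5.0, 5.0]]

emb_a = cls_pool_and_normalize(tokens_a)
emb_b = cls_pool_and_normalize(tokens_b)

# For unit-length vectors the dot product equals the cosine similarity.
cosine = dot(emb_a, emb_b)
print(emb_a, emb_b, cosine)
```

Because the output is already normalized, cosine similarity and dot product rank results identically for this model.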

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mbudisic/snowflake-arctic-embed-s-ft-pstuts")
# Run inference
sentences = [
    'How I use Brush tool and zoom in same time, I dont wanna keep switchin tools?',
    "Zooming and panning are ways to navigate around an image that you'll use often as you work on images in Photoshop CC. To practice working with the zoom and pan controls, open this image from the tutorial practice files, or open a large image of your own. Zooming means changing the magnification of the image, as you might do if you were looking at the sky through a telescope. You may want to zoom in for a closer view of part of an image, or you may want to zoom out to see more of an image on your screen. The most straightforward way to zoom is to select the Zoom tool, toward the bottom of the Tools panel here. Then go up to the Options bar for the Zoom tool, where you'll find a plus icon for zooming in, and a minus icon for zooming out. Let's start with the plus icon activated which is the default. Then to zoom in, move into the image and click. And each time you click, you'll zoom in a little further. To zoom back out to see more of the image again, go back to the Options bar, and this time select the minus icon, and then click several times in the image to zoom back out. If you want to zoom in again, you have to go back to the Options bar, click the plus icon, and click in the image to zoom in again. Now you may get tired of going up to the Options bar every time you want to switch between zooming in and zooming out. So, here's a shortcut that will help you. When the zoom in option is active, as it is now, you can switch to zooming out by holding the Option key on your keyboard if you're on a Mac, or the ALT key on Windows. Hold down that key and then click in the image. And that will automatically switch you back to zooming out. Then release your finger from the Option or ALT key, and you're switched back to zooming in. And so, you can click in the image to zoom in again. The Zoom tool has a couple of options in its Options bar, that you can use to move quickly to zoom levels that you use often. "
    "The Fit Screen option, here in the Options bar, comes in handy when you're zoomed in like this and you want to get back to a view of the entire image. Just click the Fit Screen option, and the entire image fits itself into your document window. Another useful option is this 100% option. Clicking this, zooms you into 100% view of the image, which is the best way to view an image when you're checking it for sharpness. Now, I'm working on a small screen and this image is pretty large, so when I zoom in to 100%, I can't see the whole image on my screen. Although you may not experience the same thing if you're working on a large monitor. So, if I want to see a different part of this image at this zoom level, I'm going to need to move the image around in my document window. That's called panning. And it's done with another tool, the Hand tool. So, I'm going to go back to the Tools panel, and I'm going to select the Hand tool there, which is just above the Zoom tool. Then I'll move into the image, and notice that my cursor is now changed to a hand icon. I'll click, drag, and move the image in the document window, to a place that I want to see, and then I'll release my mouse. When I'm done checking the sharpness here and I want to go back to view the entire image on screen, I'll go up to the Options bar for the Hand tool, and there I'll see the same Fit Screen option that we had for the Zoom tool. So, I can just click Fit Screen in the Hand tool Options bar, and that takes me back to see the entire image in my document window. Let me show you another way to zoom. Instead of clicking, you can do continuous zoom by holding your mouse down on the image. I'll go back and get the Zoom tool in the Tools panel. And then I'm going to click and hold in the image. And the image zooms in continuously. If you zoom in really far like this, you can see the pixels, that are the building blocks of an image in Photoshop CC. "
    "By the way, the size of these pixels can affect the image quality of a print, which is why image resolution is an important topic, especially for printing. Something we'll talk more about when we cover resizing an image later in this series. I'm going to go up to the Options bar and click Fit Screen, so I can see the entire image on my screen again. One more thing, let's say that you're working with another tool, maybe the Brush tool, and you're painting in a small area and you don't want to switch out of the Brush tool over to the Zoom tool just to zoom. Well, there's a shortcut that you can use instead of the Zoom tool. And that is to hold the Command key on a Mac, or the Ctrl key on a PC, as you press the plus key on your keyboard. And every time you do that, that will zoom you in. If you want to zoom back out, hold the Command key on a Mac or the Ctrl key on a PC, and press the minus key on your keyboard. And that will zoom you back out. So, that's an introduction to zooming and panning, that I hope will help you to navigate your images as you're working on them in Photoshop CC. To finish up with this lesson, you can",
    '<2-hop>\n\nbit and make that bird look like it was in the original image. So as you can see, refining images is easier than ever using the Content Aware Fill, the Patch Tool, and the Content Aware Move Tool in Photoshop CC.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
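The training section of this card lists Matryoshka dimensions of 384, 256, 128, and 64, which suggests the leading part of each embedding can be used on its own. The sketch below shows the usual recipe, truncate then re-normalize, on a toy 4-dimensional vector; the helper name and the values are hypothetical, and this card reports no evaluation at the truncated sizes.

```python
import math

def truncate_and_renormalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit L2 norm."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy unit-length vector standing in for one 384-dimensional embedding.
full = [0.5, 0.5, 0.5, 0.5]

# Analogous to keeping e.g. the first 128 of 384 dimensions.
short = truncate_and_renormalize(full, 2)
print(len(short), short)
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when loading a model; check the documentation of the installed version before relying on it.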

Evaluation

Metrics

Information Retrieval

Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.65
cosine_accuracy@3	0.75
cosine_accuracy@5	0.8
cosine_accuracy@10	1.0
cosine_precision@1	0.65
cosine_precision@3	0.3833
cosine_precision@5	0.25
cosine_precision@10	0.15
cosine_recall@1	0.45
cosine_recall@3	0.725
cosine_recall@5	0.775
cosine_recall@10	1.0
cosine_ndcg@10	0.7904
cosine_mrr@10	0.7435
cosine_map@100	0.711
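To make the metric names concrete, here is a small sketch of how accuracy@k, precision@k, recall@k, and the reciprocal rank are commonly defined for a single query. The document IDs and the ranking are invented for illustration; the table above reports averages over all evaluation queries.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant document within the top k, else 0.0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical ranking for one query with two relevant documents.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d1"}

for k in (1, 3, 5):
    print(k, accuracy_at_k(ranked, relevant, k),
          precision_at_k(ranked, relevant, k),
          recall_at_k(ranked, relevant, k))
print(reciprocal_rank_at_k(ranked, relevant, 10))
```

Under these definitions, a query with several relevant documents can score a hit at rank 1 while still having recall below 1, which is consistent with cosine_recall@1 (0.45) being lower than cosine_accuracy@1 (0.65) in the table.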

Training Details

Training Dataset

Unnamed Dataset

Size: 90 training samples
Columns: sentence_0, sentence_1, and label
Approximate statistics based on the first 90 samples:

	sentence_0	sentence_1	label
type	string	string	float
details	min: 5 tokens mean: 34.76 tokens max: 56 tokens	min: 49 tokens mean: 374.26 tokens max: 512 tokens	min: 0.5 mean: 0.83 max: 1.0

Samples:

sentence_0	sentence_1	label
`How can a beginner Photoshop user utilize the Lasso Tool to remove distracting elements from an image?`	>> Removing distracting elements is easier than ever with the 2014 release of Photoshop CC. Not only has the Content Aware technology improved in terms of quality, but the speed has also been improved significantly. Let's go ahead take a look at a few examples. Now in this image of the cactus, I'm going to select the Lasso Tool and I simply want to get rid of this cactus here in the background. The easiest way to do this would be to select Edit and then Fill. I'll use Content Aware, but when I click OK, you'll notice there's a little bit of a seam here where the colors aren't blended very well. And that's because, in the past, the Content Aware technology really focused on the texture in the image. So let's undo that Command + Z or Ctrl + Z. And this time when I select Edit, Fill, I'm going to use the new Color Adaptation option. This time when I click OK, you can see that Photoshop does a much better job in blending those colors. Now let's go ahead and try to remove the other two cact...	`1.0`
`How can a beginner use the Perspective Warp feature in Adobe Photoshop to manipulate perspective, such as straightening buildings and changing the viewpoint, and what steps and controls are involved in this process according to the provided guide?`	<1-hop> >> What I want to show you in this video is something that is absolutely amazing. It's a brand new feature in Adobe Photoshop Creative Cloud called Perspective Warp. Now I have a photograph open. I didn't take this photo. It was taken by a company called PhotoSpin. And don't forget if you want to follow along, you can download the assets for this video. What I want to do first though is make a copy of it. I'm going to drag it down- this is one way to do it-make a copy. That is not necessary, but this way we get to see kind of a before and an after. Now it will work with just about any image, but your first test is to go up to the word Edit on the pull-down menu and go down, and you better see Perspective Warp. If you don't, no big deal. Just go out to the cloud, and download the latest version of Photoshop. Now what does it do? What does Perspective Warp do? It literally allows me to re-enter a three-dimensional world to change the perspective of the image as if, as the photog...	`1.0`
`Wht is the funtion of Ctrl + Z in Photoshp when you make a mistake while using Content Aware or Patch Tool, and how can beginners use it to undo their last action step by step?`	>> Removing distracting elements is easier than ever with the 2014 release of Photoshop CC. Not only has the Content Aware technology improved in terms of quality, but the speed has also been improved significantly. Let's go ahead take a look at a few examples. Now in this image of the cactus, I'm going to select the Lasso Tool and I simply want to get rid of this cactus here in the background. The easiest way to do this would be to select Edit and then Fill. I'll use Content Aware, but when I click OK, you'll notice there's a little bit of a seam here where the colors aren't blended very well. And that's because, in the past, the Content Aware technology really focused on the texture in the image. So let's undo that Command + Z or Ctrl + Z. And this time when I select Edit, Fill, I'm going to use the new Color Adaptation option. This time when I click OK, you can see that Photoshop does a much better job in blending those colors. Now let's go ahead and try to remove the other two cact...	`1.0`

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        384,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
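As a rough illustration of the configuration above, the sketch below computes an in-batch-negatives ranking loss at several truncated dimensions and sums the results with the given weights. It is a simplified pure-Python approximation, not the library's implementation: the function names, the toy 4-dimensional vectors, and the similarity scale of 20 are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the target for anchors[i] and
    every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated embedding prefixes."""
    return sum(
        w * mnr_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )

# Hypothetical batch of two (query, passage) embedding pairs.
anchors = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
positives = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.6]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

The loss is small when each query is closest to its own passage at every prefix length and large when the pairs are mismatched, which is what pushes useful information into the leading dimensions.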

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
num_train_epochs: 50
multi_dataset_batch_sampler: round_robin

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 50
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
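With lr_scheduler_type set to linear, learning_rate 5e-05, and no warmup, the learning rate decays in a straight line from its initial value to zero. The sketch below reproduces that shape under the assumption of 600 total optimizer steps, taken from the last row of the training log; the function itself is illustrative and not the library's scheduler.

```python
def linear_lr(step, base_lr=5e-05, total_steps=600, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, the midpoint, and the end of training.
for step in (0, 300, 600):
    print(step, linear_lr(step))
```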

Training Logs

Epoch	Step	Training Loss	cosine_ndcg@10
0.8333	10	-	0.7208
1.0	12	-	0.7208
1.6667	20	-	0.7311
2.0	24	-	0.7399
2.5	30	-	0.7447
3.0	36	-	0.7322
3.3333	40	-	0.7450
4.0	48	-	0.7416
4.1667	50	-	0.7262
5.0	60	-	0.7581
5.8333	70	-	0.7359
6.0	72	-	0.7455
6.6667	80	-	0.6918
7.0	84	-	0.6784
7.5	90	-	0.7130
8.0	96	-	0.7744
8.3333	100	-	0.7709
9.0	108	-	0.7617
9.1667	110	-	0.7657
10.0	120	-	0.7382
10.8333	130	-	0.7143
11.0	132	-	0.7143
11.6667	140	-	0.7433
12.0	144	-	0.7433
12.5	150	-	0.7433
13.0	156	-	0.7497
13.3333	160	-	0.7680
14.0	168	-	0.7270
14.1667	170	-	0.7276
15.0	180	-	0.7402
15.8333	190	-	0.7212
16.0	192	-	0.7212
16.6667	200	-	0.7296
17.0	204	-	0.6978
17.5	210	-	0.7193
18.0	216	-	0.7271
18.3333	220	-	0.7206
19.0	228	-	0.7306
19.1667	230	-	0.7306
20.0	240	-	0.7459
20.8333	250	-	0.7515
21.0	252	-	0.7494
21.6667	260	-	0.7833
22.0	264	-	0.7793
22.5	270	-	0.8032
23.0	276	-	0.7782
23.3333	280	-	0.7782
24.0	288	-	0.7782
24.1667	290	-	0.7668
25.0	300	-	0.7782
25.8333	310	-	0.7795
26.0	312	-	0.7795
26.6667	320	-	0.7820
27.0	324	-	0.7820
27.5	330	-	0.7841
28.0	336	-	0.7915
28.3333	340	-	0.7860
29.0	348	-	0.7832
29.1667	350	-	0.7897
30.0	360	-	0.7984
30.8333	370	-	0.7854
31.0	372	-	0.7839
31.6667	380	-	0.7709
32.0	384	-	0.7688
32.5	390	-	0.7681
33.0	396	-	0.7673
33.3333	400	-	0.7453
34.0	408	-	0.7638
34.1667	410	-	0.7638
35.0	420	-	0.7751
35.8333	430	-	0.7617
36.0	432	-	0.7617
36.6667	440	-	0.7652
37.0	444	-	0.7643
37.5	450	-	0.7658
38.0	456	-	0.7708
38.3333	460	-	0.7893
39.0	468	-	0.7706
39.1667	470	-	0.7706
40.0	480	-	0.7706
40.8333	490	-	0.7654
41.0	492	-	0.7654
41.6667	500	3.9103	0.7654
42.0	504	-	0.7654
42.5	510	-	0.7720
43.0	516	-	0.7904
43.3333	520	-	0.7904
44.0	528	-	0.7904
44.1667	530	-	0.7904
45.0	540	-	0.7904
45.8333	550	-	0.7904
46.0	552	-	0.7904
46.6667	560	-	0.7904
47.0	564	-	0.7904
47.5	570	-	0.7904
48.0	576	-	0.7904
48.3333	580	-	0.7904
49.0	588	-	0.7904
49.1667	590	-	0.7904
50.0	600	-	0.7904
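The Epoch and Step columns follow from values stated elsewhere in this card: 90 training samples, a per-device batch size of 8, no gradient accumulation, and 50 epochs. The short calculation below reproduces them, assuming training on a single device.

```python
import math

train_samples = 90
batch_size = 8
epochs = 50

# dataloader_drop_last is False, so the final partial batch still counts as a step.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, epoch_at(10), epoch_at(500))
```

This gives 12 steps per epoch and 600 steps in total, matching the log. The highest logged cosine_ndcg@10 is 0.8032 at step 270, while the final value is 0.7904; since load_best_model_at_end is False, the final value is the one that matches the evaluation table.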

Framework Versions

Python: 3.11.11
Sentence Transformers: 4.1.0
Transformers: 4.51.3
PyTorch: 2.7.0+cu126
Accelerate: 1.7.0
Datasets: 3.6.0
Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
