SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
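Step 2 above can be sketched in isolation: once the fine-tuned Sentence Transformer produces embeddings, the head is plain logistic regression on those vectors. A minimal illustration using randomly generated 768-dimensional stand-ins for the real embeddings (the toy data and class means are assumptions, not taken from this model's training set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for sentence embeddings: in SetFit, step 1 yields
# 768-dim vectors from the fine-tuned Sentence Transformer; here we
# simulate two well-separated classes instead.
rng = np.random.default_rng(42)
class0 = rng.normal(loc=-1.0, scale=0.5, size=(16, 768))
class1 = rng.normal(loc=+1.0, scale=0.5, size=(16, 768))
X = np.vstack([class0, class1])
y = np.array([0] * 16 + [1] * 16)

# Step 2: fit the LogisticRegression head on the embedding features.
head = LogisticRegression(max_iter=1000).fit(X, y)

probe = np.full((1, 768), 1.0)  # a vector near the class-1 mean
print(head.predict(probe))       # [1]
```

The real pipeline differs only in where X comes from: SetFit feeds the contrastively fine-tuned encoder's outputs into the head instead of synthetic vectors.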

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
  • Classification head: LogisticRegression
  • Number of Classes: 2

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label Examples
Label 1:
  • "I'd say I eat about half of what I used to. Small, frequent meals work better for me. [SEP] that's good to hear. How often does this loss of appetite happen ?"
  • 'Maybe a couple times a day. No specific times. [SEP] Have you noticed any impact on your work or academic performance?'
  • "I wake up early, as always. Haven't needed more sleep lately. [SEP] that's good. it's important to get enough rest when you're injured. Are there specific factors or worries waking you up early?"
Label 0:
  • "Thanks, I will. I'm going to keep an eye on my weight and appetite in the meantime. [SEP] Have you been experiencing any nausea along with your anxiety symptoms?"
  • 'Fine, nothing really to complain about today. [SEP] How is your digestion going? Do you have any problems?'
  • "I'm really not comfortable discussing this with you anymore. This feels impersonal and dismissive. [SEP] How is your bowel regularity? Do you have a bowel movement every day?"

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("AliaeAI/setfit_nli_v2")
# Run inference
preds = model("I stressed a lot [SEP] What kind of work do you do?")

Training Details

Training Set Metrics

Training set  Min  Median   Max
Word count    6    33.4038  95

Label  Training Sample Count
0      1856
1      1856
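The word-count metrics above are simple whitespace-token statistics; they can be reproduced for any dataset with a few lines (the sample texts below are made up for illustration):

```python
from statistics import median

def word_count_stats(texts):
    """Min / median / max whitespace-token counts, as reported in the
    training-set metrics of SetFit model cards."""
    counts = [len(t.split()) for t in texts]
    return min(counts), median(counts), max(counts)

sample = ["short one", "a slightly longer example sentence here", "mid size text"]
print(word_count_stats(sample))  # (2, 3, 6)
```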

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 10
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
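The hyperparameters above map almost one-to-one onto SetFit's TrainingArguments. A minimal sketch, assuming the SetFit ≥ 1.0 API (not executed here since it requires the setfit package; distance_metric and eval_max_steps are left at their defaults):

```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

# Tuple-valued fields give (embedding phase, classifier phase) settings.
args = TrainingArguments(
    batch_size=(32, 32),
    num_epochs=(3, 3),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=10,
    body_learning_rate=(2e-5, 2e-5),
    head_learning_rate=2e-5,
    loss=CosineSimilarityLoss,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    load_best_model_at_end=False,
)
```

These arguments would then be passed to a setfit Trainer along with the model and the train/eval datasets.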

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.2774 -
0.0216 50 0.2696 -
0.0431 100 0.2591 -
0.0647 150 0.2611 -
0.0862 200 0.2555 -
0.1078 250 0.2551 -
0.1293 300 0.2579 -
0.1509 350 0.2557 -
0.1724 400 0.2503 -
0.1940 450 0.2553 -
0.2155 500 0.2508 -
0.2371 550 0.2463 -
0.2586 600 0.2405 -
0.2802 650 0.2268 -
0.3017 700 0.2245 -
0.3233 750 0.2057 -
0.3448 800 0.2019 -
0.3664 850 0.2028 -
0.3879 900 0.1716 -
0.4095 950 0.1675 -
0.4310 1000 0.1463 -
0.4526 1050 0.1417 -
0.4741 1100 0.1259 -
0.4957 1150 0.1102 -
0.5172 1200 0.1008 -
0.5388 1250 0.0958 -
0.5603 1300 0.0947 -
0.5819 1350 0.0906 -
0.6034 1400 0.0785 -
0.625 1450 0.0757 -
0.6466 1500 0.0654 -
0.6681 1550 0.0588 -
0.6897 1600 0.0666 -
0.7112 1650 0.0536 -
0.7328 1700 0.0587 -
0.7543 1750 0.0552 -
0.7759 1800 0.0475 -
0.7974 1850 0.0406 -
0.8190 1900 0.0386 -
0.8405 1950 0.0334 -
0.8621 2000 0.0362 -
0.8836 2050 0.0279 -
0.9052 2100 0.0271 -
0.9267 2150 0.0325 -
0.9483 2200 0.0281 -
0.9698 2250 0.0365 -
0.9914 2300 0.0316 -
1.0129 2350 0.024 -
1.0345 2400 0.0237 -
1.0560 2450 0.0244 -
1.0776 2500 0.0217 -
1.0991 2550 0.0183 -
1.1207 2600 0.0175 -
1.1422 2650 0.0169 -
1.1638 2700 0.0233 -
1.1853 2750 0.019 -
1.2069 2800 0.023 -
1.2284 2850 0.0177 -
1.25 2900 0.0158 -
1.2716 2950 0.0195 -
1.2931 3000 0.0098 -
1.3147 3050 0.0202 -
1.3362 3100 0.0094 -
1.3578 3150 0.0178 -
1.3793 3200 0.0168 -
1.4009 3250 0.0184 -
1.4224 3300 0.0132 -
1.4440 3350 0.0139 -
1.4655 3400 0.0132 -
1.4871 3450 0.0131 -
1.5086 3500 0.0147 -
1.5302 3550 0.012 -
1.5517 3600 0.0134 -
1.5733 3650 0.011 -
1.5948 3700 0.0141 -
1.6164 3750 0.0078 -
1.6379 3800 0.0115 -
1.6595 3850 0.0123 -
1.6810 3900 0.0119 -
1.7026 3950 0.0143 -
1.7241 4000 0.0112 -
1.7457 4050 0.01 -
1.7672 4100 0.0139 -
1.7888 4150 0.0113 -
1.8103 4200 0.0093 -
1.8319 4250 0.0091 -
1.8534 4300 0.0124 -
1.875 4350 0.0085 -
1.8966 4400 0.009 -
1.9181 4450 0.0103 -
1.9397 4500 0.008 -
1.9612 4550 0.008 -
1.9828 4600 0.0108 -
2.0043 4650 0.0096 -
2.0259 4700 0.0086 -
2.0474 4750 0.0062 -
2.0690 4800 0.0048 -
2.0905 4850 0.006 -
2.1121 4900 0.0052 -
2.1336 4950 0.0062 -
2.1552 5000 0.0076 -
2.1767 5050 0.0084 -
2.1983 5100 0.0051 -
2.2198 5150 0.0063 -
2.2414 5200 0.0067 -
2.2629 5250 0.0058 -
2.2845 5300 0.0058 -
2.3060 5350 0.0079 -
2.3276 5400 0.0076 -
2.3491 5450 0.0101 -
2.3707 5500 0.0044 -
2.3922 5550 0.0051 -
2.4138 5600 0.0044 -
2.4353 5650 0.0043 -
2.4569 5700 0.0066 -
2.4784 5750 0.0059 -
2.5 5800 0.0097 -
2.5216 5850 0.0054 -
2.5431 5900 0.0057 -
2.5647 5950 0.0033 -
2.5862 6000 0.0049 -
2.6078 6050 0.0038 -
2.6293 6100 0.0056 -
2.6509 6150 0.006 -
2.6724 6200 0.0061 -
2.6940 6250 0.0031 -
2.7155 6300 0.0059 -
2.7371 6350 0.004 -
2.7586 6400 0.0033 -
2.7802 6450 0.0031 -
2.8017 6500 0.0062 -
2.8233 6550 0.0063 -
2.8448 6600 0.0055 -
2.8664 6650 0.0026 -
2.8879 6700 0.004 -
2.9095 6750 0.0039 -
2.9310 6800 0.005 -
2.9526 6850 0.0064 -
2.9741 6900 0.0058 -
2.9957 6950 0.0057 -

Framework Versions

  • Python: 3.11.13
  • SetFit: 1.2.0.dev0
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model Size

  • 109M parameters (F32, Safetensors)