SetFit

This is a SetFit model for text classification. It uses a LogisticRegression instance as its classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
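
The first step above depends on turning labeled examples into sentence pairs for contrastive learning. The following is a minimal sketch of that idea, not the SetFit library's actual sampler: the function name `make_contrastive_pairs` and the toy texts are hypothetical, and it enumerates every pair, whereas the library samples pairs according to `sampling_strategy` and `num_iterations`.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples from labeled examples.

    target is 1.0 when both texts share a label (positive pair)
    and 0.0 when their labels differ (negative pair).
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs


# Hypothetical toy data, for illustration only.
examples = ["text a", "text b", "text c", "text d"]
example_labels = [1, 1, 0, 0]
pairs = make_contrastive_pairs(examples, example_labels)
```

Even four labeled examples yield six training pairs, which is why this approach suits few-shot settings.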

Model Details

Model Description

Model Type: SetFit
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1.0	'#ukraine Be careful social media and Google are censoring non-propaganda news story like how Ukrainian defense minister is using video games to give the impression they are defeating the Russians to keep the conflict going ! #biden is warmongering Ceasefire , peace & neutrality NOW HTTPURL' 'https://t.co/CjSFJmng7Z — Sen. Patrick Leahy (@SenatorLeahy) August 1, 2018\n' 'On Monday afternoon, Homeland Security Secretary Kirstjen Nielsen tweeted out photos of CBP officers in riot gear as well as the barbed wire and barriers citing the reports about plans to “rush” the border.\n'
0.0	'President Trump noted that President Obama and his advisers had information that the Russians had been working to interfere in the election and they ignored it, because they thought Hillary Clinton was going to win.\n' 'Once the truth is accepted that jihadis are inspired and sanctioned by their Islamic texts, it must logically become required that mosques, Islamic schools and groups have to immediately curtail any teaching that motivates sedition, violence, and hatred of unbelievers (i.e.\n' '“However, no nation has a more talented, more dedicated group of law enforcement investigators and prosecutors than the United States.”\n'

Evaluation

Metrics

Label	F1
all	0.6720
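
For reference, F1 is the harmonic mean of precision and recall. The sketch below computes a binary F1 score in plain Python. The card does not state which averaging mode produced the reported value, so treating label 1 as the positive class is an assumption, and the toy predictions are hypothetical.

```python
def binary_f1(y_true, y_pred, positive=1):
    """Compute F1 for one positive class from parallel label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not the model's real predictions.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = binary_f1(y_true, y_pred)
```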

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("anismahmahi/G3-setfit-model")
# Run inference
preds = model("Are you people serious?")

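To classify several texts at once, the model can also be called with a list of strings. The helpers below are a sketch under assumptions: `load_model` and `classify` are hypothetical names, and the conversion to `float` assumes the predictions are the numeric labels 1.0 and 0.0 listed under Model Labels.

```python
def load_model(repo_id="anismahmahi/G3-setfit-model"):
    """Download the model from the Hub (requires `pip install setfit`)."""
    # Imported lazily so classify() can be used without setfit installed.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(repo_id)


def classify(model, texts):
    """Run the model on a list of texts and return plain Python floats."""
    preds = model(texts)
    # Assumes numeric labels (1.0 / 0.0), as in this card's label table.
    return [float(p) for p in preds]


# Usage (downloads the model, so it is left commented out here):
# model = load_model()
# print(classify(model, ["Are you people serious?", "Another sentence."]))
```
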
Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	28.3246	129

Label	Training Sample Count
0	2362
1	2518

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 5
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
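
These settings can be checked against the step counts in the Training Results table. The arithmetic below assumes that each training sample contributes one positive and one negative pair per iteration. That assumption is not stated in the card, but it reproduces the reported 3050 steps per epoch.

```python
import math

# Values taken from this card.
train_counts = {0: 2362, 1: 2518}  # Training Sample Count per label
num_iterations = 5
batch_size = 16
num_epochs = 2

n_samples = sum(train_counts.values())  # 4880

# Assumption: 2 pairs (one positive, one negative) per sample per iteration.
pairs_per_epoch = 2 * num_iterations * n_samples  # 48800

steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)  # 3050
total_steps = steps_per_epoch * num_epochs  # 6100
```

Under this reading, epoch 1.0 falls at step 3050 and epoch 2.0 at step 6100, matching the two rows that report a validation loss.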

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.3302	-
0.0164	50	0.2709	-
0.0328	100	0.2545	-
0.0492	150	0.229	-
0.0656	200	0.2463	-
0.0820	250	0.2934	-
0.0984	300	0.2735	-
0.1148	350	0.2837	-
0.1311	400	0.2364	-
0.1475	450	0.2379	-
0.1639	500	0.188	-
0.1803	550	0.2443	-
0.1967	600	0.1274	-
0.2131	650	0.2106	-
0.2295	700	0.3211	-
0.2459	750	0.2443	-
0.2623	800	0.1979	-
0.2787	850	0.1679	-
0.2951	900	0.1208	-
0.3115	950	0.0594	-
0.3279	1000	0.11	-
0.3443	1050	0.0951	-
0.3607	1100	0.1059	-
0.3770	1150	0.1027	-
0.3934	1200	0.0771	-
0.4098	1250	0.0295	-
0.4262	1300	0.0696	-
0.4426	1350	0.104	-
0.4590	1400	0.13	-
0.4754	1450	0.1287	-
0.4918	1500	0.0264	-
0.5082	1550	0.0651	-
0.5246	1600	0.113	-
0.5410	1650	0.07	-
0.5574	1700	0.0016	-
0.5738	1750	0.1001	-
0.5902	1800	0.0116	-
0.6066	1850	0.01	-
0.6230	1900	0.0115	-
0.6393	1950	0.0053	-
0.6557	2000	0.0585	-
0.6721	2050	0.0034	-
0.6885	2100	0.0171	-
0.7049	2150	0.0141	-
0.7213	2200	0.0549	-
0.7377	2250	0.0026	-
0.7541	2300	0.1239	-
0.7705	2350	0.0121	-
0.7869	2400	0.0589	-
0.8033	2450	0.0042	-
0.8197	2500	0.0026	-
0.8361	2550	0.003	-
0.8525	2600	0.0004	-
0.8689	2650	0.0003	-
0.8852	2700	0.1	-
0.9016	2750	0.0567	-
0.9180	2800	0.0311	-
0.9344	2850	0.0404	-
0.9508	2900	0.0002	-
0.9672	2950	0.0008	-
0.9836	3000	0.0006	-
1.0	3050	0.0003	0.3187
1.0164	3100	0.0003	-
1.0328	3150	0.0002	-
1.0492	3200	0.0002	-
1.0656	3250	0.002	-
1.0820	3300	0.0002	-
1.0984	3350	0.0003	-
1.1148	3400	0.005	-
1.1311	3450	0.0613	-
1.1475	3500	0.0002	-
1.1639	3550	0.0002	-
1.1803	3600	0.0005	-
1.1967	3650	0.0001	-
1.2131	3700	0.0609	-
1.2295	3750	0.0003	-
1.2459	3800	0.0005	-
1.2623	3850	0.0006	-
1.2787	3900	0.0003	-
1.2951	3950	0.0014	-
1.3115	4000	0.0002	-
1.3279	4050	0.0001	-
1.3443	4100	0.0002	-
1.3607	4150	0.001	-
1.3770	4200	0.0004	-
1.3934	4250	0.0004	-
1.4098	4300	0.0002	-
1.4262	4350	0.0612	-
1.4426	4400	0.0613	-
1.4590	4450	0.0002	-
1.4754	4500	0.0603	-
1.4918	4550	0.0001	-
1.5082	4600	0.0011	-
1.5246	4650	0.0576	-
1.5410	4700	0.0001	-
1.5574	4750	0.0002	-
1.5738	4800	0.0002	-
1.5902	4850	0.0012	-
1.6066	4900	0.0003	-
1.6230	4950	0.0001	-
1.6393	5000	0.0001	-
1.6557	5050	0.0001	-
1.6721	5100	0.0001	-
1.6885	5150	0.0001	-
1.7049	5200	0.0002	-
1.7213	5250	0.0001	-
1.7377	5300	0.0002	-
1.7541	5350	0.0001	-
1.7705	5400	0.0001	-
1.7869	5450	0.0001	-
1.8033	5500	0.0001	-
1.8197	5550	0.0003	-
1.8361	5600	0.0001	-
1.8525	5650	0.0001	-
1.8689	5700	0.0001	-
1.8852	5750	0.0001	-
1.9016	5800	0.0002	-
1.9180	5850	0.0	-
1.9344	5900	0.0001	-
1.9508	5950	0.0	-
1.9672	6000	0.0	-
1.9836	6050	0.0001	-
2.0	6100	0.0001	0.3313

The bold row denotes the saved checkpoint.
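
The loss values in the table come from the CosineSimilarityLoss listed under Training Hyperparameters, which is the mean squared error between the cosine similarity of two sentence embeddings and the pair's target. A plain-Python sketch follows. The two-dimensional vectors are hypothetical stand-ins for real embeddings, and the function names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(batch):
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in batch]
    return sum(errors) / len(errors)


# Hypothetical embeddings: an identical pair with target 1.0 and an
# orthogonal pair with target 0.0, so both pairs are fit exactly.
batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),
    ([1.0, 0.0], [0.0, 1.0], 0.0),
]
loss = cosine_similarity_loss(batch)
```

A loss near zero therefore means that same-label pairs are embedded almost identically and different-label pairs almost orthogonally.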

Framework Versions

Python: 3.10.12
SetFit: 1.0.1
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu121
Datasets: 2.16.1
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model Size: 109M params (Safetensors)
Tensor Type: F32
