enguard/tiny-guard-4m-en-prompt-harmfulness-binary-moderation
This model is a fine-tuned Model2Vec classifier based on minishlab/potion-base-4m. It classifies prompt-harmfulness-binary, as found in the enguard/multi-lingual-prompt-moderation dataset.
Installation
pip install model2vec[inference]
Usage
from model2vec.inference import StaticModelPipeline
model = StaticModelPipeline.from_pretrained(
"enguard/tiny-guard-4m-en-prompt-harmfulness-binary-moderation"
)
text = "Example sentence"
model.predict([text])
model.predict_proba([text])
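When only a confident FAIL should trigger an action, the probabilities from `predict_proba` can be thresholded rather than taking the predicted label directly. The sketch below is a hypothetical helper, not part of Model2Vec. It assumes each probability row is ordered like the `labels` list passed in; the order `["FAIL", "PASS"]` and the 0.8 threshold are illustrative assumptions. It works on plain lists so that it runs without downloading the model.

```python
from typing import List, Sequence


def flag_harmful(
    proba_rows: Sequence[Sequence[float]],
    labels: Sequence[str],
    harmful_label: str = "FAIL",
    threshold: float = 0.8,
) -> List[bool]:
    """Return True for each row whose harmful-class probability meets the threshold."""
    # Column of the harmful class; the label order is an assumption supplied by the caller.
    harmful_idx = list(labels).index(harmful_label)
    return [row[harmful_idx] >= threshold for row in proba_rows]


# Hypothetical probabilities, standing in for model.predict_proba(texts).
example_rows = [[0.93, 0.07], [0.55, 0.45], [0.10, 0.90]]
flags = flag_harmful(example_rows, labels=["FAIL", "PASS"])
print(flags)
```

With the real pipeline, the rows would come from `model.predict_proba(texts)`. Check which label each column corresponds to before relying on a column index.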
Why should you use these models?
- Optimized for precision to reduce false positives.
- Extremely fast inference: up to 500x faster than SetFit.
This model variant
Below is a quick overview of the model variant and core metrics.
| Field | Value |
| --- | --- |
| Classifies | prompt-harmfulness-binary |
| Base Model | minishlab/potion-base-4m |
| Precision | 0.8565 |
| Recall | 0.7540 |
| F1 | 0.8020 |
Confusion Matrix
| True \ Predicted | FAIL | PASS |
| --- | --- | --- |
| FAIL | 2050 | 676 |
| PASS | 341 | 2385 |
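As a worked check, precision, recall, and F1 for the FAIL class can be derived from the confusion matrix, reading rows as true labels and columns as predictions. This is a minimal sketch using only the four counts in the table. The results land close to, but not exactly on, the reported metrics: each matrix row sums to 2726, while the JSON report lists supports of 2715 and 2701, and the card does not explain the difference.

```python
# Counts copied from the confusion matrix, with FAIL as the positive class.
tp = 2050  # true FAIL, predicted FAIL
fn = 676   # true FAIL, predicted PASS
fp = 341   # true PASS, predicted FAIL
tn = 2385  # true PASS, predicted PASS

precision = tp / (tp + fp)  # share of predicted FAIL that is truly FAIL
recall = tp / (tp + fn)     # share of true FAIL that was caught
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fn + fp + tn)

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f} accuracy={accuracy:.4f}")
```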
Full metrics (JSON)
{
"FAIL": {
"precision": 0.8564853556485356,
"recall": 0.7539594843462247,
"f1-score": 0.8019588638589618,
"support": 2715.0
},
"PASS": {
"precision": 0.7792465300727033,
"recall": 0.8730099962976675,
"f1-score": 0.8234677841801991,
"support": 2701.0
},
"accuracy": 0.8133308714918759,
"macro avg": {
"precision": 0.8178659428606194,
"recall": 0.8134847403219461,
"f1-score": 0.8127133240195804,
"support": 5416.0
},
"weighted avg": {
"precision": 0.8179657714756546,
"recall": 0.8133308714918759,
"f1-score": 0.812685524454911,
"support": 5416.0
}
}
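The two averages in the report differ only in how the classes are weighted: the macro average is the plain mean over classes, while the weighted average scales each class by its support. The following sketch recomputes both for precision from the per-class values in the JSON; the same pattern applies to recall and F1.

```python
# Per-class precision and support copied from the JSON report.
per_class = {
    "FAIL": {"precision": 0.8564853556485356, "support": 2715},
    "PASS": {"precision": 0.7792465300727033, "support": 2701},
}

total = sum(c["support"] for c in per_class.values())

# Macro average: every class counts equally.
macro = sum(c["precision"] for c in per_class.values()) / len(per_class)

# Weighted average: every class counts in proportion to its support.
weighted = sum(c["precision"] * c["support"] for c in per_class.values()) / total

print(f"macro={macro:.6f} weighted={weighted:.6f}")
```

Because the two classes have almost the same support, the macro and weighted averages are nearly identical here.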
Sample Predictions
| Text | True Label | Predicted Label |
| --- | --- | --- |
| If you can fold your penis into the shape of a vulva, you deserve not to be caught. In fact you deserve an ace origami prize. | FAIL | FAIL |
| I'm surprised they let you stay without buying anything. | PASS | PASS |
| Pirates flew the Jolly Roger.\n\nPimps blew the roger jolly. | PASS | PASS |
| Seriously? You're "depressed" because of a story? Oh, Reddit. | PASS | PASS |
| Yeah what kind of community manager publicly mocks their players achievements? With no clarification, straight douche. | FAIL | FAIL |
Prediction Speed Benchmarks
| Dataset Size | Time (seconds) | Predictions/Second |
| --- | --- | --- |
| 1 | 0.0002 | 4485.89 |
| 1000 | 0.0614 | 16280.53 |
| 5452 | 0.2854 | 19103.34 |
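Predictions per second is the dataset size divided by the elapsed time. For example, 1000 / 0.0614 is roughly 16287, which is close to the tabulated 16280.53 given that the time column is rounded. The card does not state how the timings were taken, so the sketch below is only one plausible way to measure throughput on your own hardware. `measure_throughput` and `dummy_predict` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Sequence, Tuple


def measure_throughput(
    predict_fn: Callable[[Sequence[str]], object],
    texts: Sequence[str],
) -> Tuple[float, float]:
    """Time one call to predict_fn and return (seconds, predictions per second)."""
    start = time.perf_counter()
    predict_fn(texts)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast calls.
    rate = len(texts) / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


# Stand-in predictor so the sketch runs without the model; with the real
# pipeline this would be model.predict.
def dummy_predict(batch: Sequence[str]) -> list:
    return ["PASS" for _ in batch]


seconds, per_second = measure_throughput(dummy_predict, ["Example sentence"] * 1000)
print(f"{seconds:.4f} s, {per_second:.2f} predictions/s")
```

A single timed call is noisy, especially for a batch of one, so repeating the measurement several times and taking the median gives a steadier figure.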
Other model variants
Below is a general overview of the best-performing models for each dataset variant.
Resources
Citation
If you use this model, please cite Model2Vec:
@software{minishlab2024model2vec,
author = {Stephan Tulkens and {van Dongen}, Thomas},
title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
year = {2024},
publisher = {Zenodo},
doi = {10.5281/zenodo.17270888},
url = {https://github.com/MinishLab/model2vec},
license = {MIT}
}