NSFK Detection (yasserrmd/nsfk-detection)
NSFK Detection is a robust transformer-based text classification model designed to identify content that is Not Suitable for Kids (NSFK), built with a three-category system:
- suitable_for_kids
- not_suitable_for_kids
- uncertain (confidence-based)
Fine-tuned on 60K examples and evaluated on a 1,000-sample test set, the model reaches 92.91% accuracy on the predictions it commits to (confidence of at least 0.7) and defers the rest as uncertain. It is intended for content moderation on educational platforms, video platforms, and chatbot systems.
Usage Example
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import json

model_name = "yasserrmd/nsfk-detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Load label map (this path points at the training checkpoint;
# adjust it to wherever label_map.json lives on your machine)
with open('./results/checkpoint-4467/label_map.json', 'r') as f:
    label_map = json.load(f)
# Invert {label: id} into {id: label} for decoding predictions
id_to_label = {i: label for label, i in label_map.items()}

threshold = 0.7  # Confidence threshold for classification

def classify(text):
    # Truncate inputs that exceed the model's maximum sequence length
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)[0]
    pred_id = torch.argmax(probs).item()
    confidence = probs[pred_id].item()
    # Fall back to "uncertain" when the top class is below the threshold
    return (id_to_label[pred_id] if confidence >= threshold else "uncertain", confidence)

text = "The movie contained graphic violence."
label, confidence = classify(text)
print(f"Label: {label}, Confidence: {confidence:.2f}")
Performance Summary
Evaluation Dataset: 1,000 samples (500 per class)
Confidence Threshold: 0.7
Metric | Value |
---|---|
Accuracy (excluding uncertain) | 92.91% |
Precision (NSFK) | 99.00% |
Recall (NSFK) | 85.00% |
F1 Score (NSFK) | 92.00% |
Uncertain Predictions | 11.20% |
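The metrics above treat below-threshold predictions as abstentions. The following is a minimal sketch of how numbers of this kind could be computed from gold labels and predicted labels. The function name `summarize` and the toy data are hypothetical, and the choice to leave uncertain samples out of precision and recall is an assumption, not a description of the author's evaluation script.

```python
NSFK = "not_suitable_for_kids"
SAFE = "suitable_for_kids"
UNCERTAIN = "uncertain"

def summarize(gold, predicted):
    """Compute abstention-aware metrics for a three-label classifier.

    gold: list of true labels (SAFE or NSFK).
    predicted: list of predicted labels (SAFE, NSFK, or UNCERTAIN).
    Assumption: uncertain predictions are excluded from every metric
    except the uncertain rate.
    """
    pairs = list(zip(gold, predicted))
    decided = [(g, p) for g, p in pairs if p != UNCERTAIN]

    tp = sum(1 for g, p in decided if g == NSFK and p == NSFK)
    fp = sum(1 for g, p in decided if g == SAFE and p == NSFK)
    fn = sum(1 for g, p in decided if g == NSFK and p == SAFE)
    correct = sum(1 for g, p in decided if g == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy_excluding_uncertain": correct / len(decided) if decided else 0.0,
        "precision_nsfk": precision,
        "recall_nsfk": recall,
        "f1_nsfk": f1,
        "uncertain_rate": 1 - len(decided) / len(pairs) if pairs else 0.0,
    }

# Toy data for illustration only (not the model's real predictions)
gold = [NSFK, NSFK, NSFK, SAFE, SAFE, SAFE, SAFE, NSFK]
predicted = [NSFK, NSFK, SAFE, SAFE, SAFE, UNCERTAIN, SAFE, UNCERTAIN]
metrics = summarize(gold, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Whether uncertain NSFK samples should count as misses for recall depends on how the downstream pipeline handles them; if they go to human review, excluding them as done here is one reasonable convention.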
Uncertainty Distribution
Among 112 uncertain cases:
- Conflict/War: 36%
- Legal/Crime: 11%
- Political: 6%
- Educational (Borderline): 6%
- Other Sensitive/Controversial Topics: 38%
These cases are ideal for manual review pipelines.
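One way to feed uncertain cases into a manual review pipeline is sketched below. It assumes a classifier callable that returns a `(label, confidence)` pair, like the `classify` function in the usage example. The names `route`, `review_queue`, and `stub_classify`, along with the sample texts and scores, are hypothetical; the stub only stands in for the real model so the sketch runs offline.

```python
from collections import deque

def route(texts, classify_fn):
    """Split texts into automatic decisions and a manual-review queue."""
    decisions = {}
    review_queue = deque()
    for text in texts:
        label, confidence = classify_fn(text)
        if label == "uncertain":
            # Keep the confidence so reviewers can prioritize borderline items
            review_queue.append((text, confidence))
        else:
            decisions[text] = label
    return decisions, review_queue

# Hypothetical scores standing in for real model output
_fake_outputs = {
    "A story about a friendly dragon.": ("suitable_for_kids", 0.97),
    "A report on the border conflict.": ("uncertain", 0.58),
}

def stub_classify(text):
    return _fake_outputs[text]

decisions, review_queue = route(list(_fake_outputs), stub_classify)
print("auto-decided:", decisions)
print("needs review:", list(review_queue))
```

In a real deployment, `stub_classify` would be replaced by the model-backed `classify`, and the queue by whatever ticketing or moderation tool the platform already uses.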
Key Benefits
- Three-label output prevents overconfident mistakes
- High precision (99%) and recall (85%) on unsafe content
- Safe defaults: safe content is rarely flagged as unsafe
- Adaptable threshold based on domain risk (e.g., 0.75 for children-only platforms)
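When choosing a threshold for a given domain, it can help to see how coverage (the share of inputs that get a definite label) trades off against accuracy on those decided inputs. The sketch below assumes you have collected, on a held-out set, the top-class confidence and whether that prediction was correct. The function name `sweep` and the sample records are hypothetical and do not come from the model's evaluation.

```python
def sweep(records, thresholds):
    """Report coverage and accuracy on decided samples for each threshold.

    records: non-empty list of (confidence, is_correct) for the top class.
    Returns a list of (threshold, coverage, accuracy) tuples; accuracy is
    None when no sample clears the threshold.
    """
    rows = []
    for t in thresholds:
        decided = [ok for conf, ok in records if conf >= t]
        coverage = len(decided) / len(records)
        accuracy = sum(decided) / len(decided) if decided else None
        rows.append((t, coverage, accuracy))
    return rows

# Made-up held-out results for illustration only
records = [
    (0.98, True), (0.91, True), (0.74, False),
    (0.72, True), (0.66, False), (0.55, True),
]
rows = sweep(records, [0.5, 0.7, 0.75])
for t, coverage, accuracy in rows:
    acc_text = "n/a" if accuracy is None else f"{accuracy:.2f}"
    print(f"threshold={t:.2f} coverage={coverage:.2f} accuracy={acc_text}")
```

Raising the threshold generally sends more inputs to the uncertain bucket, so a stricter setting is only practical if the review capacity exists to absorb the extra cases.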
Learn More
See the Large-Scale Analysis Report (PDF) for detailed metrics, sample predictions, and category-wise breakdowns.
Author
Mohamed Yasser
- LinkedIn
- WhatsApp Channel
Model tree for yasserrmd/nsfk-detection
Base model
answerdotai/ModernBERT-base