# Advanced Cyberbullying & Offensive Language Detector

This is a multilingual, multi-label text classification model fine-tuned on a combined dataset of cyberbullying_tweets and OLID (the Offensive Language Identification Dataset). It is designed to detect both general offensive language and specific, targeted types of cyberbullying. The model was trained on the xlm-roberta-base architecture.
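For reference, a multi-label classification head on xlm-roberta-base is typically configured as in the sketch below. This is an illustration of the likely training setup (the label names come from this card), not the exact script used for this model:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

labels = [
    "is_offensive",
    "is_gender_harassment",
    "is_religious_harassment",
    "is_ethnic_harassment",
    "is_age_harassment",
    "is_other_cyberbullying",
]

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# problem_type="multi_label_classification" makes the model use
# BCEWithLogitsLoss, so each of the six labels is predicted independently.
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base",
    num_labels=len(labels),
    problem_type="multi_label_classification",
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
```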
## Model Labels

The model predicts six independent labels for any given text. A `1` indicates the presence of the category, and a `0` indicates its absence.

- `is_offensive`: The text contains generally offensive, toxic, or profane language.
- `is_gender_harassment`: The text contains attacks based on gender, sexism, or sexual orientation.
- `is_religious_harassment`: The text contains attacks targeting religious beliefs.
- `is_ethnic_harassment`: The text contains attacks based on race or ethnicity.
- `is_age_harassment`: The text contains attacks targeting a person's age (ageism).
- `is_other_cyberbullying`: The text contains general insults or bullying that doesn't fit the other specific categories.

A text is considered "Not Cyberbullying" when all labels are predicted as `0`.
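To illustrate how the 0/1 labels are produced, the sketch below runs the model directly and thresholds the per-label sigmoid probabilities. The repo id is a placeholder and the 0.5 cutoff is an assumption you may want to tune:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id; replace with the actual model name on the Hub.
model_id = "your-username/your-repo-name"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "You are such an idiot, I can't believe you said that."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label heads apply a sigmoid per label, not a softmax across labels.
probs = torch.sigmoid(logits)[0]
preds = (probs > 0.5).int()  # 0.5 threshold is an assumption; tune as needed

for i, pred in enumerate(preds.tolist()):
    print(model.config.id2label[i], pred)
```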
## Performance
The model achieved the following performance on its validation set after two epochs of training:
- F1 Score (Weighted): 0.908
- ROC AUC (Weighted): 0.961
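For context, weighted F1 and weighted ROC AUC for a multi-label setup are typically computed as in this sketch. The toy arrays are purely hypothetical (one row per validation example, one column per label); the actual evaluation script is not part of this card:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

# Hypothetical ground-truth labels and predicted probabilities.
y_true = np.array([[1, 0, 1, 0, 0, 1],
                   [0, 1, 0, 1, 1, 0]])
y_prob = np.array([[0.92, 0.10, 0.80, 0.08, 0.02, 0.71],
                   [0.11, 0.84, 0.03, 0.66, 0.59, 0.09]])
y_pred = (y_prob > 0.5).astype(int)

print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print("ROC AUC (weighted):", roc_auc_score(y_true, y_prob, average="weighted"))
```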
## How to Use

Once you upload this model to your Hugging Face account, you can use it directly with a `pipeline`:
```python
from transformers import pipeline

# Replace "your-username/your-repo-name" with your actual model name on the Hub.
# top_k=None returns the score for every label (it replaces the deprecated
# return_all_scores=True argument).
pipe = pipeline("text-classification", model="your-username/your-repo-name", top_k=None)

# Example usage
text = "You are such an idiot, I can't believe you said that."
results = pipe(text)
print(results)
```
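The pipeline returns one `{"label": ..., "score": ...}` dict per label. To recover the 0/1 predictions described above, you can threshold the scores, as in this short sketch (the 0.5 cutoff is again an assumption):

```python
# results[0] is the list of per-label score dicts for the single input text.
binary = {r["label"]: int(r["score"] > 0.5) for r in results[0]}
print(binary)  # e.g. {"is_offensive": 1, "is_other_cyberbullying": 1, ...}
```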