Advanced Cyberbullying & Offensive Language Detector

This is a multilingual, multi-label text classification model fine-tuned on a combination of the cyberbullying_tweets dataset and OLID (the Offensive Language Identification Dataset). It is designed to detect both general offensive language and specific, targeted types of cyberbullying.

This model was trained using the xlm-roberta-base architecture.

Model Labels

The model predicts six independent labels for any given text. A 1 indicates the presence of the category, and a 0 indicates its absence.

  1. is_offensive: The text contains generally offensive, toxic, or profane language.
  2. is_gender_harassment: The text contains attacks based on gender, sexism, or sexual orientation.
  3. is_religious_harassment: The text contains attacks targeting religious beliefs.
  4. is_ethnic_harassment: The text contains attacks based on race or ethnicity.
  5. is_age_harassment: The text contains attacks targeting a person's age (ageism).
  6. is_other_cyberbullying: The text contains general insults or bullying that doesn't fit the other specific categories.

A text is considered "Not Cyberbullying" when all labels are predicted as 0.
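Because the labels are independent, decoding is a per-label sigmoid followed by a threshold rather than a softmax over the six classes. Below is a minimal sketch of that decoding, assuming the checkpoint's `id2label` mapping matches the list above and using a placeholder repo name:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo name -- swap in the actual model ID on the Hub.
model_id = "your-username/your-repo-name"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("example text to score", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 6): one logit per label

# Sigmoid gives an independent probability per label; threshold at 0.5.
probs = torch.sigmoid(logits)[0]
predicted = {model.config.id2label[i]: int(p > 0.5) for i, p in enumerate(probs)}
print(predicted)  # all zeros -> treated as "Not Cyberbullying"
```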

Performance

The model achieved the following performance on its validation set after two epochs of training:

  • F1 Score (Weighted): 0.908
  • ROC AUC (Weighted): 0.961
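Both metrics can be computed with scikit-learn in the multi-label setting. The sketch below uses toy stand-in arrays rather than the actual validation data: `y_true` holds the gold 0/1 labels, `y_prob` the per-label sigmoid probabilities, and `y_pred` the thresholded predictions.

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

# Toy stand-ins shaped (num_samples, 6); real values come from the validation set.
y_true = np.array([[1, 0, 1, 0, 1, 0],
                   [0, 1, 0, 1, 0, 1]])
y_prob = np.array([[0.9, 0.2, 0.8, 0.1, 0.7, 0.3],
                   [0.2, 0.8, 0.1, 0.9, 0.3, 0.6]])
y_pred = (y_prob > 0.5).astype(int)

print("F1 (weighted):     ", f1_score(y_true, y_pred, average="weighted"))
print("ROC AUC (weighted):", roc_auc_score(y_true, y_prob, average="weighted"))
```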

How to Use

Once you upload this model to your Hugging Face account, you can use it directly with a pipeline:

```python
from transformers import pipeline

# Replace "your-username/your-repo-name" with your actual model name on the Hub
pipe = pipeline("text-classification", model="your-username/your-repo-name", return_all_scores=True)

# Example usage
text = "You are such an idiot, I can't believe you said that."
results = pipe(text)
print(results)
```
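With return_all_scores=True, a single input comes back as a nested list: results[0] is a list of {label, score} dictionaries, one per label. (Newer transformers releases deprecate return_all_scores in favor of top_k=None.) Assuming the checkpoint's config sets problem_type to "multi_label_classification", the pipeline applies a sigmoid rather than a softmax, so these scores are independent probabilities that can be thresholded directly:

```python
# Keep every label whose probability clears the 0.5 threshold.
flagged = [entry["label"] for entry in results[0] if entry["score"] > 0.5]
print(flagged or ["Not Cyberbullying"])
```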