TiRoBERTa Fine-tuned for Multi-task Abusiveness, Sentiment, and Topic Classification
This model is a fine-tuned version of TiRoBERTa on the TiALD dataset.
The Tigrinya Abusive Language Detection (TiALD) dataset is a large-scale, multi-task benchmark for abusive language detection in Tigrinya. It consists of 13,717 YouTube comments annotated for three tasks: abusiveness, sentiment, and topic. The comments are written in both the Ge’ez script and prevalent non-standard Latin transliterations, mirroring real-world usage.
⚠️ The dataset contains explicit, obscene, and potentially hateful language. It should be used for research purposes only. ⚠️
This work accompanies the paper "A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings".
Model Usage
```python
from transformers import pipeline

tiald_multitask = pipeline("text-classification", model="fgaim/tiroberta-tiald-all-tasks", top_k=11)
tiald_multitask("<text-to-classify>")
```
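With `top_k=11`, the pipeline returns a score for every label across the three tasks as a list of `{label, score}` dictionaries. Continuing the snippet above, a minimal sketch of inspecting that output (the exact label names come from the model's `config.id2label` and are not assumed here):

```python
# Scores for all labels, printed in descending order of confidence.
# Group or filter them per task (abusiveness, sentiment, topic) as needed.
predictions = tiald_multitask("<text-to-classify>")
for pred in sorted(predictions, key=lambda p: p["score"], reverse=True):
    print(f"{pred['label']}: {pred['score']:.3f}")
```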
Performance Metrics
This model achieves the following results on the TiALD test set:
"abusiveness_metrics": {
"accuracy": 0.8611111111111112,
"macro_f1": 0.8611109396431353,
"macro_recall": 0.8611111111111112,
"macro_precision": 0.8611128943846637,
"weighted_f1": 0.8611109396431355,
"weighted_recall": 0.8611111111111112,
"weighted_precision": 0.8611128943846637
},
"topic_metrics": {
"accuracy": 0.6155555555555555,
"macro_f1": 0.5491185274678864,
"macro_recall": 0.5143416011263588,
"macro_precision": 0.7341640739780486,
"weighted_f1": 0.5944096153417657,
"weighted_recall": 0.6155555555555555,
"weighted_precision": 0.6870800624645906
},
"sentiment_metrics": {
"accuracy": 0.6533333333333333,
"macro_f1": 0.5340845253007789,
"macro_recall": 0.5410170159158625,
"macro_precision": 0.534652401599494,
"weighted_f1": 0.6620101614004723,
"weighted_recall": 0.6533333333333333,
"weighted_precision": 0.6750245466592532
}
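A minimal sketch of how per-task scores like these can be computed with scikit-learn, assuming lists of gold and predicted labels for a single task (the function and variable names below are illustrative, not part of the released code):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def task_metrics(y_true, y_pred):
    """Accuracy plus macro- and weighted-averaged precision, recall, and F1."""
    metrics = {"accuracy": accuracy_score(y_true, y_pred)}
    for avg in ("macro", "weighted"):
        p, r, f1, _ = precision_recall_fscore_support(
            y_true, y_pred, average=avg, zero_division=0
        )
        metrics[f"{avg}_f1"] = f1
        metrics[f"{avg}_recall"] = r
        metrics[f"{avg}_precision"] = p
    return metrics

# Toy example with placeholder labels for the abusiveness task:
print(task_metrics(["abusive", "not_abusive"], ["abusive", "abusive"]))
```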
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7.0
- seed: 42
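A hedged sketch of how these settings map onto Hugging Face `TrainingArguments` (the multi-task heads, data collator, and Trainer setup are omitted, and the output directory is illustrative):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tiroberta-tiald-all-tasks",  # illustrative path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    num_train_epochs=7.0,
    lr_scheduler_type="linear",
    seed=42,
    # Reported Adam settings (these are also the library defaults):
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```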
Intended Usage
The TiALD dataset and models are designed to support:
- Research in abusive language detection in low-resource languages
- Context-aware abuse, sentiment, and topic modeling
- Multi-task and transfer learning with digraphic scripts
- Evaluation of multilingual and fine-tuned language models
Researchers and developers should avoid using this dataset for direct moderation or enforcement tasks without human oversight.
Ethical Considerations
- Sensitive content: Contains toxic and offensive language. Use for research purposes only.
- Cultural sensitivity: Abuse is context-dependent; annotations were made by native speakers to account for cultural nuance.
- Bias mitigation: Data sampling and annotation were carefully designed to minimize reinforcement of stereotypes.
- Privacy: All the source content for the dataset is publicly available on YouTube.
- Respect for expression: The dataset should not be used for automated censorship without human review.
This research received IRB approval (Ref: KH2022-133) and followed ethical data collection and annotation practices, including informed consent of annotators.
Citation
If you use this model or the TiALD dataset in your work, please cite:
```bibtex
@misc{gaim-etal-2025-tiald-benchmark,
  title         = {A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings},
  author        = {Fitsum Gaim and Hoyun Song and Huije Lee and Changgeon Ko and Eui Jun Hwang and Jong C. Park},
  year          = {2025},
  eprint        = {2505.12116},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CL},
  url           = {https://arxiv.org/abs/2505.12116}
}
```
License
This dataset is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0).