---
license: cc-by-4.0
language: ti
widget:
- text: "<text-to-classify>"
datasets:
- fgaim/tigrinya-abusive-language-detection
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: tiroberta-tiald-all-tasks
results:
- task:
name: Text Classification
type: text-classification
metrics:
- name: Abu Accuracy
type: accuracy
value: 0.8611111111111112
- name: F1
type: f1
value: 0.8611109396431353
- name: Precision
type: precision
value: 0.8611128943846637
- name: Recall
type: recall
value: 0.8611111111111112
---
# TiRoBERTa Fine-tuned for Multi-task Abusiveness, Sentiment, and Topic Classification
This model is a fine-tuned version of [TiRoBERTa](https://huggingface.co/fgaim/tiroberta-base) on the [TiALD](https://huggingface.co/datasets/fgaim/tigrinya-abusive-language-detection) dataset.
**Tigrinya Abusive Language Detection (TiALD) Dataset** is a large-scale, multi-task benchmark dataset for abusive language detection in the Tigrinya language. It consists of **13,717 YouTube comments** annotated for **abusiveness**, **sentiment**, and **topic** tasks. The dataset includes comments written in both the **Ge’ez script** and prevalent non-standard Latin **transliterations** to mirror real-world usage.
> ⚠️ The dataset contains explicit, obscene, and potentially hateful language. It should be used for research purposes only. ⚠️
This work accompanies the paper ["A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings"](https://arxiv.org/abs/2505.12116).
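The underlying TiALD data can be pulled straight from the Hugging Face Hub with the `datasets` library; a minimal sketch (the split names and feature columns are whatever the dataset repository defines, so the print call is there to inspect them):

```python
from datasets import load_dataset

# Download the TiALD benchmark from the Hugging Face Hub
tiald = load_dataset("fgaim/tigrinya-abusive-language-detection")
print(tiald)  # inspect the available splits and their feature columns
```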
## Model Usage
```python
from transformers import pipeline
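# top_k=11 returns a score for every label in the model's output space
# (label names are defined in the model config's id2label mapping)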
tiald_multitask = pipeline("text-classification", model="fgaim/tiroberta-tiald-all-tasks", top_k=11)
tiald_multitask("<text-to-classify>")
```
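For lower-level access to the raw scores, the same checkpoint can also be loaded with the `Auto*` classes; a minimal sketch in which the label names are read from the config's `id2label` mapping rather than hard-coded, and the single softmax mirrors the default post-processing of the `text-classification` pipeline:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "fgaim/tiroberta-tiald-all-tasks"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("<text-to-classify>", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over all label scores, as the text-classification pipeline does by default
probs = logits.softmax(dim=-1).squeeze(0)
for idx, score in enumerate(probs.tolist()):
    print(model.config.id2label[idx], round(score, 4))
```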
### Performance Metrics
This model achieves the following results on the TiALD test set:
```json
"abusiveness_metrics": {
"accuracy": 0.8611111111111112,
"macro_f1": 0.8611109396431353,
"macro_recall": 0.8611111111111112,
"macro_precision": 0.8611128943846637,
"weighted_f1": 0.8611109396431355,
"weighted_recall": 0.8611111111111112,
"weighted_precision": 0.8611128943846637
},
"topic_metrics": {
"accuracy": 0.6155555555555555,
"macro_f1": 0.5491185274678864,
"macro_recall": 0.5143416011263588,
"macro_precision": 0.7341640739780486,
"weighted_f1": 0.5944096153417657,
"weighted_recall": 0.6155555555555555,
"weighted_precision": 0.6870800624645906
},
"sentiment_metrics": {
"accuracy": 0.6533333333333333,
"macro_f1": 0.5340845253007789,
"macro_recall": 0.5410170159158625,
"macro_precision": 0.534652401599494,
"weighted_f1": 0.6620101614004723,
"weighted_recall": 0.6533333333333333,
"weighted_precision": 0.6750245466592532
}
```
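These per-task scores follow standard scikit-learn definitions, so they can be recomputed once predictions are collected on the TiALD test split; a minimal sketch with placeholder label lists (`gold` and `pred` stand in for one task's gold labels and model predictions):

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Placeholder labels for a single task; replace with real test-set labels and predictions
gold = ["abusive", "not_abusive", "abusive"]
pred = ["abusive", "not_abusive", "not_abusive"]

metrics = {
    "accuracy": accuracy_score(gold, pred),
    "macro_f1": f1_score(gold, pred, average="macro"),
    "macro_precision": precision_score(gold, pred, average="macro"),
    "macro_recall": recall_score(gold, pred, average="macro"),
    "weighted_f1": f1_score(gold, pred, average="weighted"),
}
print(metrics)
```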
## Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7.0
- seed: 42
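The original training script is not part of this card, but the values above map directly onto `transformers.TrainingArguments`; a hedged sketch of an equivalent configuration (the output directory is an assumption, and all unspecified arguments keep their library defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tiroberta-tiald-all-tasks",  # assumed output path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    num_train_epochs=7.0,
    lr_scheduler_type="linear",
    seed=42,
    # Adam settings as listed above (these are also the library defaults)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```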
## Intended Usage
The TiALD dataset and models are designed to support:
- Research in abusive language detection in low-resource languages
- Context-aware abuse, sentiment, and topic modeling
- Multi-task and transfer learning with digraphic scripts
- Evaluation of multilingual and fine-tuned language models
Researchers and developers should avoid using this dataset for direct moderation or enforcement tasks without human oversight.
## Ethical Considerations
- **Sensitive content**: Contains toxic and offensive language. Use for research purposes only.
- **Cultural sensitivity**: Abuse is context-dependent; annotations were made by native speakers to account for cultural nuance.
- **Bias mitigation**: Data sampling and annotation were carefully designed to minimize reinforcement of stereotypes.
- **Privacy**: All the source content for the dataset is publicly available on YouTube.
- **Respect for expression**: The dataset should not be used for automated censorship without human review.
This research received IRB approval (Ref: KH2022-133) and followed ethical data collection and annotation practices, including informed consent of annotators.
## Citation
If you use this model or the `TiALD` dataset in your work, please cite:
```bibtex
@misc{gaim-etal-2025-tiald-benchmark,
title = {A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings},
author = {Fitsum Gaim and Hoyun Song and Huije Lee and Changgeon Ko and Eui Jun Hwang and Jong C. Park},
year = {2025},
eprint = {2505.12116},
archiveprefix = {arXiv},
primaryclass = {cs.CL},
url = {https://arxiv.org/abs/2505.12116}
}
```
## License
This model and the TiALD dataset are released under the [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).