Safetensors · Tigrinya · roberta · Eval Results

fgaim committed · Commit 3422a3b · 1 Parent(s): 57a5444

Add a multitask trained model and model card

.gitignore ADDED
@@ -0,0 +1,2 @@
+ .ipynb_checkpoints/
+
README.md CHANGED
@@ -1,3 +1,138 @@
  ---
  license: cc-by-4.0
+ language: ti
+ widget:
+ - text: "<text-to-classify>"
+ datasets:
+ - fgaim/tigrinya-abusive-language-detection
+ metrics:
+ - accuracy
+ - f1
+ - precision
+ - recall
+ model-index:
+ - name: tiroberta-tiald-all-tasks
+   results:
+   - task:
+       name: Text Classification
+       type: text-classification
+     metrics:
+     - name: Abu Accuracy
+       type: accuracy
+       value: 0.8611111111111112
+     - name: F1
+       type: f1
+       value: 0.8611109396431353
+     - name: Precision
+       type: precision
+       value: 0.8611128943846637
+     - name: Recall
+       type: recall
+       value: 0.8611111111111112
  ---
+
+
+ # TiRoBERTa Fine-tuned for Multi-task Abusiveness, Sentiment, and Topic Classification
+
+ This model is a fine-tuned version of [TiRoBERTa](https://huggingface.co/fgaim/tiroberta-base) on the [TiALD](https://huggingface.co/datasets/fgaim/tigrinya-abusive-language-detection) dataset.
+
+ **Tigrinya Abusive Language Detection (TiALD) Dataset** is a large-scale, multi-task benchmark dataset for abusive language detection in the Tigrinya language. It consists of **13,717 YouTube comments** annotated for **abusiveness**, **sentiment**, and **topic** tasks. The dataset includes comments written in both the **Ge’ez script** and prevalent non-standard Latin **transliterations** to mirror real-world usage.
+
+ > ⚠️ The dataset contains explicit, obscene, and potentially hateful language. It should be used for research purposes only. ⚠️
+
+ This work accompanies the paper ["A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings"](https://arxiv.org/abs/2505.12116).
+
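+ To explore the benchmark itself, the dataset can be pulled from the Hub with the `datasets` library. The snippet below is a minimal sketch; the split and field names it prints are assumptions to be checked against the dataset card:
+
+ ```python
+ from datasets import load_dataset
+
+ # Dataset ID taken from this model card's metadata.
+ tiald = load_dataset("fgaim/tigrinya-abusive-language-detection")
+
+ # Split and field names are assumptions; inspect the DatasetDict to confirm them.
+ print(tiald)
+ print(tiald["train"][0])
+ ```
+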
+ ## Model Usage
+
+ ```python
+ from transformers import pipeline
+
+ # top_k=11 returns a score for every label, covering the abusiveness, topic, and sentiment tasks at once.
+ tiald_multitask = pipeline("text-classification", model="fgaim/tiroberta-tiald-all-tasks", top_k=11)
+ tiald_multitask("<text-to-classify>")
+ ```
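+
+ The pipeline returns one score per label, with the labels of all three tasks mixed together in a single list. The sketch below regroups them by task; the grouping is inferred from the label names in `config.json` and is an assumption to verify against the TiALD task definitions:
+
+ ```python
+ # Grouping of the 11 labels by task, inferred from id2label in config.json.
+ TASK_LABELS = {
+     "abusiveness": {"Abusive", "Not Abusive"},
+     "topic": {"Political", "Racial", "Religious", "Sexist", "Other Topic"},
+     "sentiment": {"Positive", "Neutral", "Negative", "Mixed Sentiment"},
+ }
+
+ def group_by_task(predictions):
+     """Return the highest-scoring label of each task from a flat list of {label, score} dicts."""
+     grouped = {}
+     for task, labels in TASK_LABELS.items():
+         task_preds = [p for p in predictions if p["label"] in labels]
+         grouped[task] = max(task_preds, key=lambda p: p["score"])
+     return grouped
+
+ preds = tiald_multitask("<text-to-classify>")
+ if preds and isinstance(preds[0], list):  # some transformers versions nest single-input results
+     preds = preds[0]
+ print(group_by_task(preds))
+ ```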
+
+ ### Performance Metrics
+
+ This model achieves the following results on the TiALD test set:
+
+ ```json
+ {
+   "abusiveness_metrics": {
+     "accuracy": 0.8611111111111112,
+     "macro_f1": 0.8611109396431353,
+     "macro_recall": 0.8611111111111112,
+     "macro_precision": 0.8611128943846637,
+     "weighted_f1": 0.8611109396431355,
+     "weighted_recall": 0.8611111111111112,
+     "weighted_precision": 0.8611128943846637
+   },
+   "topic_metrics": {
+     "accuracy": 0.6155555555555555,
+     "macro_f1": 0.5491185274678864,
+     "macro_recall": 0.5143416011263588,
+     "macro_precision": 0.7341640739780486,
+     "weighted_f1": 0.5944096153417657,
+     "weighted_recall": 0.6155555555555555,
+     "weighted_precision": 0.6870800624645906
+   },
+   "sentiment_metrics": {
+     "accuracy": 0.6533333333333333,
+     "macro_f1": 0.5340845253007789,
+     "macro_recall": 0.5410170159158625,
+     "macro_precision": 0.534652401599494,
+     "weighted_f1": 0.6620101614004723,
+     "weighted_recall": 0.6533333333333333,
+     "weighted_precision": 0.6750245466592532
+   }
+ }
+ ```
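+
+ The macro scores above average over classes with equal weight, while the weighted scores account for class frequency. As a small illustration (with made-up labels, not TiALD data), both can be reproduced with scikit-learn:
+
+ ```python
+ from sklearn.metrics import accuracy_score, precision_recall_fscore_support
+
+ # Illustrative gold and predicted abusiveness labels; not taken from the TiALD test set.
+ y_true = ["Abusive", "Not Abusive", "Abusive", "Not Abusive", "Not Abusive"]
+ y_pred = ["Abusive", "Not Abusive", "Not Abusive", "Not Abusive", "Not Abusive"]
+
+ print("accuracy:", accuracy_score(y_true, y_pred))
+ for average in ("macro", "weighted"):
+     p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=average, zero_division=0)
+     print(f"{average}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
+ ```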
+
+ ## Training Hyperparameters
+
+ The following hyperparameters were used during training; a reproduction sketch follows the list:
+
+ - learning_rate: 3e-05
+ - train_batch_size: 8
+ - optimizer: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
+ - lr_scheduler_type: linear
+ - num_epochs: 7.0
+ - seed: 42
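+
+ The `model_args.json` shipped with this repository indicates the model was trained as a Simple Transformers `MultiLabelClassificationModel` over a joint 11-label space. A minimal reproduction sketch under that assumption is shown below; preparing `train_df` (a `text` column plus an 11-dimensional binary `labels` vector per row, aligned with `id2label` in `config.json`) is not shown and must be adapted to the TiALD fields:
+
+ ```python
+ from simpletransformers.classification import MultiLabelClassificationModel
+
+ # Hyperparameters mirror the list above and model_args.json; everything else uses library defaults.
+ model_args = {
+     "learning_rate": 3e-5,
+     "train_batch_size": 8,
+     "num_train_epochs": 7,
+     "max_seq_length": 256,
+     "optimizer": "AdamW",
+     "scheduler": "linear_schedule_with_warmup",
+ }
+
+ model = MultiLabelClassificationModel(
+     "roberta",
+     "fgaim/tiroberta-base",
+     num_labels=11,  # 2 abusiveness + 5 topic + 4 sentiment labels
+     args=model_args,
+ )
+
+ # train_df is assumed to have columns ["text", "labels"]; see the note above.
+ # model.train_model(train_df)
+ ```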
+
+ ## Intended Usage
+
+ The TiALD dataset and models are designed to support:
+
+ - Research in abusive language detection in low-resource languages
+ - Context-aware abuse, sentiment, and topic modeling
+ - Multi-task and transfer learning with digraphic scripts
+ - Evaluation of multilingual and fine-tuned language models
+
+ Researchers and developers should avoid using this dataset for direct moderation or enforcement tasks without human oversight.
+
+ ## Ethical Considerations
+
+ - **Sensitive content**: Contains toxic and offensive language. Use for research purposes only.
+ - **Cultural sensitivity**: Abuse is context-dependent; annotations were made by native speakers to account for cultural nuance.
+ - **Bias mitigation**: Data sampling and annotation were carefully designed to minimize reinforcement of stereotypes.
+ - **Privacy**: All the source content for the dataset is publicly available on YouTube.
+ - **Respect for expression**: The dataset should not be used for automated censorship without human review.
+
+ This research received IRB approval (Ref: KH2022-133) and followed ethical data collection and annotation practices, including informed consent of annotators.
+
+ ## Citation
+
+ If you use this model or the `TiALD` dataset in your work, please cite:
+
+ ```bibtex
+ @misc{gaim-etal-2025-tiald-benchmark,
+   title = {A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings},
+   author = {Fitsum Gaim and Hoyun Song and Huije Lee and Changgeon Ko and Eui Jun Hwang and Jong C. Park},
+   year = {2025},
+   eprint = {2505.12116},
+   archiveprefix = {arXiv},
+   primaryclass = {cs.CL},
+   url = {https://arxiv.org/abs/2505.12116}
+ }
+ ```
+
+ ## License
+
+ The model and the TiALD dataset are released under the [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).
best_trial.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "learning_rate": 3e-05,
+   "train_batch_size": 8,
+   "num_train_epochs": 7
+ }
config.json ADDED
@@ -0,0 +1,53 @@
+ {
+   "architectures": [
+     "RobertaForMultiLabelSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "Abusive",
+     "1": "Not Abusive",
+     "2": "Political",
+     "3": "Racial",
+     "4": "Religious",
+     "5": "Sexist",
+     "6": "Other Topic",
+     "7": "Positive",
+     "8": "Neutral",
+     "9": "Negative",
+     "10": "Mixed Sentiment"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "Abusive": 0,
+     "Not Abusive": 1,
+     "Political": 2,
+     "Racial": 3,
+     "Religious": 4,
+     "Sexist": 5,
+     "Other Topic": 6,
+     "Positive": 7,
+     "Neutral": 8,
+     "Negative": 9,
+     "Mixed Sentiment": 10
+   },
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.51.3",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50265
+ }
eval_results.txt ADDED
@@ -0,0 +1,2 @@
+ LRAP = 0.7497833627278061
+ eval_loss = 10.723778989579943
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a87fe5bc9d0030692331fef5155bc2980213d4e1e1a363a86202755c417e3c42
+ size 501003076
model_args.json ADDED
@@ -0,0 +1 @@
+ {"adafactor_beta1": null, "adafactor_clip_threshold": 1.0, "adafactor_decay_rate": -0.8, "adafactor_eps": [1e-30, 0.001], "adafactor_relative_step": true, "adafactor_scale_parameter": true, "adafactor_warmup_init": true, "adam_betas": [0.9, 0.999], "adam_epsilon": 1e-08, "best_model_dir": "outputs/best_model", "cache_dir": "cache_dir/", "config": {}, "cosine_schedule_num_cycles": 0.5, "custom_layer_parameters": [], "custom_parameter_groups": [], "dataloader_num_workers": 0, "do_lower_case": false, "dynamic_quantize": false, "early_stopping_consider_epochs": false, "early_stopping_delta": 0, "early_stopping_metric": "macro_f1", "early_stopping_metric_minimize": false, "early_stopping_patience": 3, "encoding": null, "eval_batch_size": 100, "evaluate_during_training": true, "evaluate_during_training_silent": true, "evaluate_during_training_steps": 2000, "evaluate_during_training_verbose": false, "evaluate_each_epoch": true, "fp16": true, "gradient_accumulation_steps": 1, "learning_rate": 3e-05, "local_rank": -1, "logging_steps": 1, "loss_type": null, "loss_args": {}, "manual_seed": null, "max_grad_norm": 1.0, "max_seq_length": 256, "model_name": "fgaim/tiroberta-base", "model_type": "roberta", "multiprocessing_chunksize": -1, "n_gpu": 1, "no_cache": false, "no_save": false, "not_saved_args": [], "num_train_epochs": 7, "optimizer": "AdamW", "output_dir": "models/tiroberta-base", "overwrite_output_dir": true, "polynomial_decay_schedule_lr_end": 1e-07, "polynomial_decay_schedule_power": 1.0, "process_count": 6, "quantized_model": false, "reprocess_input_data": true, "save_best_model": true, "save_eval_checkpoints": false, "save_model_every_epoch": false, "save_optimizer_and_scheduler": true, "save_steps": 2000, "scheduler": "linear_schedule_with_warmup", "silent": false, "skip_special_tokens": true, "tensorboard_dir": null, "thread_count": null, "tokenizer_name": null, "tokenizer_type": null, "train_batch_size": 8, "train_custom_parameters_only": false, "trust_remote_code": false, "use_cached_eval_features": false, "use_early_stopping": true, "use_hf_datasets": false, "use_multiprocessing": false, "use_multiprocessing_for_evaluation": false, "wandb_kwargs": {"job_type": "training", "name": "tiroberta-base-20250510_063424"}, "wandb_project": "tiald-joint-labels", "warmup_ratio": 0.1, "warmup_steps": 1078, "weight_decay": 0.01, "model_class": "MultiLabelClassificationModel", "sliding_window": false, "stride": 0.8, "threshold": 0.5, "tie_value": 1, "labels_list": [], "labels_map": {}, "lazy_loading": false, "special_tokens_list": []}
predictions_test.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "<mask>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_lower_case": false,
+   "eos_token": "</s>",
+   "errors": "replace",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "unk_token": "<unk>"
+ }
training_progress_scores.csv ADDED
@@ -0,0 +1,13 @@
+ global_step,train_loss,LRAP,accuracy,macro_f1,weighted_f1,eval_loss
+ 1540,5.5235595703125,0.7087386163219498,0.0022222222222222222,0.5276624797769354,0.6418452955098061,6.762324757046169
+ 2000,3.336136817932129,0.713298140131474,0.0033333333333333335,0.5490219749315628,0.6538235528242954,6.732627603742811
+ 3080,5.402691841125488,0.7355552081663191,0.0011111111111111111,0.5690883792182807,0.6738836283404934,5.939486159218682
+ 4000,4.94577693939209,0.7516409064186831,0.005555555555555556,0.5857144395278816,0.6937109275167757,6.191740194956462
+ 4620,4.7006120681762695,0.7349451258617915,0.0022222222222222222,0.5808680848220031,0.6845453635469864,6.53547477722168
+ 6000,0.8756833076477051,0.7496383277216598,0.028888888888888888,0.5848889297971004,0.6915240625034748,7.2687596744961205
+ 6160,2.8979897499084473,0.7404720218053544,0.03888888888888889,0.5797768839785963,0.6882614706527342,7.9485422770182295
+ 7700,0.5499695539474487,0.7383599353321573,0.06444444444444444,0.5910478801573911,0.6963464661450791,8.710189289516872
+ 8000,0.31079328060150146,0.7527138848805496,0.09222222222222222,0.5948021214697299,0.7003535118782406,9.066656377580431
+ 9240,0.6766080856323242,0.7457099166265829,0.12555555555555556,0.5990810767553655,0.7036516907901794,10.211644013722738
+ 10000,1.856939435005188,0.7473248490192927,0.13444444444444445,0.5910801083627714,0.7000364330564518,10.624637179904514
+ 10780,0.12249794602394104,0.7497833627278061,0.14888888888888888,0.5969603348439917,0.7040227106765218,10.723778989579943
vocab.json ADDED
The diff for this file is too large to render. See raw diff