- **Target**: Racism, homophobia, sexism, transphobia and other forms of discrimination.

## 2. Try it out:

You can interact with the model directly through the [Inference Endpoint](https://huggingface.co/spaces/delarosajav95/HateSpeech-BETO-cased-v2):

[![Open Inference Endpoint](https://img.shields.io/badge/Open_Inference_Endpoint-blue)](https://huggingface.co/spaces/delarosajav95/HateSpeech-BETO-cased-v2)
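If you would rather call the model from code than through the Space, a minimal sketch using the `transformers` pipeline could look like this (the repo id is assumed to mirror the Space name above, and the printed output is illustrative only):

```python
from transformers import pipeline

# Repo id assumed from the Space URL above, not confirmed by this excerpt.
classifier = pipeline(
    "text-classification",
    model="delarosajav95/HateSpeech-BETO-cased-v2",
)

result = classifier("Todas las personas merecen los mismos derechos.")
print(result)  # e.g. [{'label': 'LABEL_0', 'score': 0.99}] (illustrative)
```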
## 3. Key Enhancements in v2:

- **Previous Version (v1)**: Fine-tuned on the [**Paul/hatecheck-spanish**](https://huggingface.co/datasets/Paul/hatecheck-spanish) dataset, but real-world testing revealed performance issues, limiting its effectiveness.

[…]

- **Incorporation of Paul Samples**: After evaluating the results, it was clear that including key samples from the **Paul dataset** would help the model capture additional nuanced forms of hate speech, such as **transphobia** and **multiple types of racism**.
- A significant amount of effort went into carefully selecting and processing these samples from the Paul dataset and integrating them with the **manueltonneau** dataset. This careful curation created a more comprehensive dataset, **enhancing the model's ability to differentiate between hate and non-hate speech**.

## 4. Preprocessing and Postprocessing:

To prepare the datasets for fine-tuning and ensure optimal model performance, the following steps were undertaken:

[…]

- Applied dynamic padding using the Hugging Face DataCollator to handle varying text lengths efficiently (see the sketch after this list).
- Batch settings: batch_size=8, shuffle=True.
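A minimal sketch of this tokenize-then-pad-per-batch setup, assuming BETO cased (`dccuchile/bert-base-spanish-wwm-cased`) as the base checkpoint and placeholder data and column names; only the `label_mapping` line, the batch size, and the shuffle flag come from the card itself:

```python
from datasets import Dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorWithPadding

# Toy stand-in for the curated training split (texts and column names are assumptions).
train_ds = Dataset.from_dict({
    "text": ["ejemplo uno", "un segundo ejemplo bastante más largo"],
    "label": [0.0, 1.0],
})

label_mapping = {0.0: 0, 1.0: 1}  # float labels -> integer class ids (from the card)

tokenizer = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")

def preprocess(batch):
    enc = tokenizer(batch["text"], truncation=True)  # no padding here; the collator pads
    enc["labels"] = [label_mapping[label] for label in batch["label"]]
    return enc

tokenized = train_ds.map(preprocess, batched=True, remove_columns=["text", "label"])

# Dynamic padding: each batch is padded only to its own longest sequence.
collator = DataCollatorWithPadding(tokenizer=tokenizer)
loader = DataLoader(tokenized, batch_size=8, shuffle=True, collate_fn=collator)
```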
## 5. Performance Improvements:

- **Greater Accuracy**: The inclusion of diverse samples led to a more balanced model that can better handle different forms of discrimination.
- **Precision in Detecting Non-Hate Speech**: The model is now more reliable at detecting non-hateful content, minimizing false positives.
- **Robustness**: The updated model performs better in real-world scenarios, offering stronger results for content moderation tasks.

## 6. Use Case:

- This model is optimized for content moderation on online platforms, where it can detect harmful speech and help foster safer online environments.
- **Classification Task**: The model categorizes text into two labels:
  - **Non-Hateful (LABEL_0)**: Content that does not contain hate speech and is neutral or constructive.
  - **Hateful (LABEL_1)**: Content that promotes hate speech or harmful rhetoric.
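To illustrate how these two labels come out of the raw model (two logits passed through a softmax), here is a hedged sketch, again assuming the repo id matches the Space name:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "delarosajav95/HateSpeech-BETO-cased-v2"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Cada persona merece respeto.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2): one logit per label

probs = torch.softmax(logits, dim=-1)[0]
labels = {0: "Non-Hateful (LABEL_0)", 1: "Hateful (LABEL_1)"}
for i, p in enumerate(probs):
    print(f"{labels[i]}: {p.item():.2%}")
```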
## 7. Goal:

The **goal** of the model is to identify content that promotes **harmful rhetoric** or **behavior**, while distinguishing it from **neutral** or **constructive speech**. This makes it highly applicable for **moderating online content**, ensuring that **harmful speech** and **behavior** are flagged while maintaining the integrity of **non-hateful communication**. By accurately distinguishing **harmful** from **non-harmful** content, the model supports the creation of a **safer** and more **inclusive digital environment**.

## 8. Future Work:

While the model demonstrates significant improvements over the previous version, **content moderation** remains an ongoing challenge. Further refinements are always possible to improve its accuracy and effectiveness in diverse contexts, and improved versions are expected in the near future.

## 9. Full classification example in Python:
To assess the model’s performance, I selected 23 examples representing various types of hate speech and non-hate speech, covering categories such as homophobia, racism, sexism, and transphobia. These examples were carefully chosen from outside the datasets the model was trained or evaluated on, providing a comprehensive test of the model’s ability to generalize and handle real-world data.

[…]

</details>

## 10. Metrics and results:

It achieves the following results on the *evaluation set* (last epoch):
- 'eval_loss': 0.3601696193218231

[…]

- 'eval_steps_per_second': 30.681
- 'epoch': 6.0
## 11. Training Details and Procedure:

### Main Hyperparameters:

[…]

- metric_for_best_model: "eval_loss"
- greater_is_better: False
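Only these last two settings are visible in this excerpt; a hedged sketch of how they plug into `TrainingArguments` follows, where every value marked "placeholder" is an assumption rather than something taken from the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hatespeech-beto-cased-v2",  # placeholder path
    eval_strategy="epoch",                  # placeholder; the evaluation schedule is not shown here
    save_strategy="epoch",                  # must match eval_strategy for best-model tracking
    load_best_model_at_end=True,            # needed for metric_for_best_model to take effect
    metric_for_best_model="eval_loss",      # from the card
    greater_is_better=False,                # from the card: lower eval_loss is better
    per_device_train_batch_size=8,          # consistent with the batch settings in section 4
    num_train_epochs=6,                     # consistent with 'epoch': 6.0 in section 10
)
```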
## 12. Framework versions:

- Transformers 4.47.1
- PyTorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0

## 13. Citation:
- **manueltonneau/spanish-hate-speech-superset**:

[…]

If you use this model, please do not forget to include my citation. Thank you!

## 14. Authorship and Contact Information:

This model was fine-tuned and optimized by **Javier de la Rosa Sánchez**, applying state-of-the-art techniques to enhance its performance for hate speech detection in Spanish.