---
widget:
- text: A family hiking in the mountains
example_title: Safe
- text: A child playing with a puppy
example_title: Safe
- text: A couple kissing passionately in bed
example_title: Nsfw
- text: A woman naked
example_title: Nsfw
- text: A man killing people
example_title: Nsfw
- text: A mass shooting
example_title: Nsfw
license: apache-2.0
language:
- en
metrics:
- f1
- accuracy
- precision
- recall
pipeline_tag: text-classification
tags:
- Transformers
- PyTorch
- safety
- inappropriate
- distilroberta
datasets:
- eliasalbouzidi/NSFW-Safe-Dataset
model-index:
- name: NSFW-Safe-Dataset
results:
- task:
name: Text Classification
type: text-classification
dataset:
name: NSFW-Safe-Dataset
type: .
metrics:
- name: F1
type: f1
value: 0.975
- name: Accuracy
type: accuracy
value: 0.981
---
# Model Card
This model is designed to categorize text into two classes, "safe" or "nsfw" (not safe for work), which makes it suitable for content moderation and filtering applications.
The model was trained on a dataset of 190,000 labeled text samples distributed between the two classes, "safe" and "nsfw".
The model is based on DistilRoBERTa-base.
In terms of performance, the model achieves an F1 score of 0.975 (evaluated on 40K examples).
To improve the performance of the model, the input text should be preprocessed. You can refer to the `preprocess` function in the `app.py` file of the following Space: <https://huggingface.co/spaces/eliasalbouzidi/distilbert-nsfw-text-classifier>.
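The exact preprocessing is defined in that Space's `app.py`; purely as an illustration (these cleanup rules are assumptions, not the Space's actual implementation), a minimal version might look like this:
```python
import re

def preprocess(text: str) -> str:
    """Hypothetical cleanup before classification; see the Space's app.py for the real steps."""
    text = text.strip()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"\s+", " ", text)           # collapse whitespace
    return text
```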
### Model Description
The model can be used directly to classify text into one of the two classes. It takes in a string of text as input and outputs a probability distribution over the two classes. The class with the highest probability is selected as the predicted class.
- **Developed by:** Centrale Supélec Students
- **Model type:** DistilRoBERTa-based text classifier (82M parameters)
- **Language(s) (NLP):** English
- **License:** apache-2.0
### Uses
The model can be integrated into larger systems for content moderation or filtering.
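As a minimal sketch of such an integration (the confidence threshold and the lowercase label check are assumptions to adapt to your own pipeline):
```python
from transformers import pipeline

classifier = pipeline("text-classification", model="eliasalbouzidi/distilroberta-nsfw-text-classifier")

def is_safe(text: str, threshold: float = 0.5) -> bool:
    """Return False when the classifier flags the text as nsfw with enough confidence."""
    result = classifier(text)[0]  # e.g. {'label': 'nsfw', 'score': ...}
    return not (result["label"].lower() == "nsfw" and result["score"] >= threshold)
```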
### Training Data
The training data for fine-tuning the text classification model consists of a large corpus of text labeled with one of the two classes, "safe" and "nsfw". The dataset contains a total of 190,000 examples, distributed as follows:
- 117,000 examples labeled as "safe"
- 63,000 examples labeled as "nsfw"
It was assembled by scraping data from the web and utilizing existing open-source datasets. A significant portion of the dataset consists of descriptions of images and scenes. The primary objective was to prevent diffusion models from generating NSFW content, but the dataset can be used for other moderation purposes as well.
You can access the dataset here: https://huggingface.co/datasets/eliasalbouzidi/NSFW-Safe-Dataset
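For example, the dataset can be loaded with the `datasets` library (split names and column layout are whatever the dataset card defines; inspect the object to see them):
```python
from datasets import load_dataset

# Download the NSFW-Safe dataset from the Hugging Face Hub
dataset = load_dataset("eliasalbouzidi/NSFW-Safe-Dataset")
print(dataset)  # shows the available splits and columns
```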
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 600
- num_epochs: 3
- mixed_precision_training: Native AMP
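Expressed with the Hugging Face `Trainer` API, these settings correspond roughly to the sketch below (the original training script is not reproduced here, so anything beyond the listed hyperparameters, such as the output directory, is an assumption; the Adam betas and epsilon are the `Trainer` defaults):
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilroberta-nsfw-text-classifier",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=600,
    num_train_epochs=3,
    fp16=True,  # native AMP mixed-precision training
)
```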
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Fbeta 1.6 | False positive rate | False negative rate | Precision | Recall |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:------:|:---------:|:-------------------:|:-------------------:|:---------:|:------:|
| 0.3325 | 0.0998 | 586 | 0.1839 | 0.9479 | 0.9280 | 0.9132 | 0.0202 | 0.1052 | 0.9638 | 0.8948 |
| 0.1161 | 0.1997 | 1172 | 0.0958 | 0.9682 | 0.9576 | 0.9577 | 0.0256 | 0.0422 | 0.9574 | 0.9578 |
| 0.1006 | 0.2995 | 1758 | 0.0959 | 0.9716 | 0.9616 | 0.9554 | 0.0140 | 0.0524 | 0.9761 | 0.9476 |
| 0.0984 | 0.3994 | 2344 | 0.0776 | 0.9717 | 0.9628 | 0.9674 | 0.0293 | 0.0265 | 0.9523 | 0.9735 |
| 0.0897 | 0.4992 | 2930 | 0.0994 | 0.9676 | 0.9580 | 0.9696 | 0.0428 | 0.0151 | 0.9326 | 0.9849 |
| 0.0856 | 0.5991 | 3516 | 0.0889 | 0.9751 | 0.9670 | 0.9684 | 0.0219 | 0.0299 | 0.9638 | 0.9701 |
| 0.0779 | 0.6989 | 4102 | 0.0842 | 0.9762 | 0.9681 | 0.9652 | 0.0149 | 0.0385 | 0.9749 | 0.9615 |
| 0.0821 | 0.7988 | 4688 | 0.0702 | 0.9768 | 0.9693 | 0.9723 | 0.0228 | 0.0239 | 0.9626 | 0.9761 |
| 0.073 | 0.8986 | 5274 | 0.0773 | 0.9776 | 0.9703 | 0.9727 | 0.0213 | 0.0243 | 0.9650 | 0.9757 |
| 0.0775 | 0.9985 | 5860 | 0.0709 | 0.9774 | 0.9701 | 0.9721 | 0.0210 | 0.0252 | 0.9653 | 0.9748 |
| 0.0627 | 1.0983 | 6446 | 0.0646 | 0.9789 | 0.9720 | 0.9737 | 0.0193 | 0.0240 | 0.9681 | 0.9760 |
| 0.0648 | 1.1982 | 7032 | 0.0729 | 0.9787 | 0.9716 | 0.9708 | 0.0158 | 0.0303 | 0.9735 | 0.9697 |
| 0.0592 | 1.2980 | 7618 | 0.0733 | 0.9802 | 0.9735 | 0.9725 | 0.0144 | 0.0289 | 0.9760 | 0.9711 |
| 0.0585 | 1.3979 | 8204 | 0.0764 | 0.9790 | 0.9720 | 0.9723 | 0.0173 | 0.0273 | 0.9713 | 0.9727 |
| 0.0579 | 1.4977 | 8790 | 0.0691 | 0.9782 | 0.9712 | 0.9739 | 0.0214 | 0.0225 | 0.9649 | 0.9775 |
| 0.0584 | 1.5975 | 9376 | 0.0739 | 0.9797 | 0.9732 | 0.9755 | 0.0195 | 0.0215 | 0.9679 | 0.9785 |
| 0.0564 | 1.6974 | 9962 | 0.0749 | 0.9774 | 0.9703 | 0.9757 | 0.0257 | 0.0173 | 0.9583 | 0.9827 |
| 0.0582 | 1.7972 | 10548 | 0.0721 | 0.9804 | 0.9739 | 0.9742 | 0.0162 | 0.0253 | 0.9730 | 0.9747 |
| 0.0576 | 1.8971 | 11134 | 0.0746 | 0.9799 | 0.9732 | 0.9734 | 0.0163 | 0.0264 | 0.9729 | 0.9736 |
| 0.0546 | 1.9969 | 11720 | 0.0758 | 0.9804 | 0.9739 | 0.9736 | 0.0152 | 0.0269 | 0.9747 | 0.9731 |
| 0.0431 | 2.0968 | 12306 | 0.0755 | 0.9805 | 0.9741 | 0.9762 | 0.0186 | 0.0211 | 0.9694 | 0.9789 |
| 0.0464 | 2.1966 | 12892 | 0.0785 | 0.9802 | 0.9737 | 0.9735 | 0.0156 | 0.0266 | 0.9740 | 0.9734 |
| 0.0443 | 2.2965 | 13478 | 0.0763 | 0.9811 | 0.9748 | 0.9734 | 0.0132 | 0.0283 | 0.9779 | 0.9717 |
| 0.0426 | 2.3963 | 14064 | 0.0753 | 0.9812 | 0.9750 | 0.9760 | 0.0165 | 0.0227 | 0.9727 | 0.9773 |
| 0.0413 | 2.4962 | 14650 | 0.0750 | 0.9811 | 0.9748 | 0.9760 | 0.0168 | 0.0225 | 0.9722 | 0.9775 |
| 0.0442 | 2.5960 | 15236 | 0.0756 | 0.9813 | 0.9752 | 0.9766 | 0.0169 | 0.0216 | 0.9720 | 0.9784 |
| 0.043 | 2.6959 | 15822 | 0.0810 | 0.9814 | 0.9750 | 0.9729 | 0.0119 | 0.0299 | 0.9800 | 0.9701 |
| 0.0433 | 2.7957 | 16408 | 0.0783 | 0.9814 | 0.9751 | 0.9733 | 0.0125 | 0.0289 | 0.9790 | 0.9711 |
| 0.0398 | 2.8956 | 16994 | 0.0736 | 0.9814 | 0.9752 | 0.9766 | 0.0169 | 0.0216 | 0.9721 | 0.9784 |
| 0.0431 | 2.9954 | 17580 | 0.0757 | 0.9816 | 0.9754 | 0.9757 | 0.0151 | 0.0240 | 0.9749 | 0.9760 |
We selected the checkpoint with the highest F-beta (β = 1.6) score.
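For reference, F-beta is defined as F_β = (1 + β²) · precision · recall / (β² · precision + recall); with β = 1.6 it weights recall more heavily than precision, so missing NSFW content costs more than over-flagging safe content. It can be computed, for example, with scikit-learn (toy labels for illustration only):
```python
from sklearn.metrics import fbeta_score

# Toy example: 1 = nsfw, 0 = safe
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
print(fbeta_score(y_true, y_pred, beta=1.6))
```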
### Framework versions
- Transformers 4.40.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1
### Out-of-Scope Use
It should not be used for any illegal activities.
## Bias, Risks, and Limitations
The model may exhibit biases based on the training data used. It may not perform well on text that is written in languages other than English. It may also struggle with sarcasm, irony, or other forms of figurative language. The model may produce false positives or false negatives, which could lead to incorrect categorization of text.
### Recommendations
Users should be aware of the limitations and biases of the model and use it accordingly. They should also be prepared to handle false positives and false negatives. It is recommended to fine-tune the model for specific downstream tasks and to evaluate its performance on relevant datasets.
### Load model directly
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("eliasalbouzidi/distilroberta-nsfw-text-classifier")
model = AutoModelForSequenceClassification.from_pretrained("eliasalbouzidi/distilroberta-nsfw-text-classifier")
```
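A short inference example with the objects loaded above (assuming the checkpoint ships the usual `id2label` mapping in its config):
```python
import torch

text = "A family hiking in the mountains"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]
print(model.config.id2label[int(probs.argmax())], probs.tolist())
```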
### Use a pipeline
```python
from transformers import pipeline
pipe = pipeline("text-classification", model="eliasalbouzidi/distilroberta-nsfw-text-classifier")
```
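For example:
```python
print(pipe("A family hiking in the mountains"))
# expected output shape: [{'label': ..., 'score': ...}] (the label string comes from the model config)
```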
## Contact
Please reach out to [email protected] if you have any questions or feedback.