omykhailiv committed
Commit 01a9c48 · verified · 1 Parent(s): 5aa20c7

Update README.md

Files changed (1):
  1. README.md +10 -16
README.md CHANGED
@@ -1,11 +1,13 @@
 ---
 library_name: transformers
-license: apache-2.0
+license: mit
 language:
 - en
 metrics:
 - accuracy
 pipeline_tag: text-classification
+tags:
+- fake news
 ---
 
 # Model Card for Model ID
@@ -34,11 +36,9 @@ This model can be used for whatever reason you need, also a site hosted, based o
 ## Bias, Risks, and Limitations
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-As a Bert model, this also has bias. It can't be considered as a somewhat state-of-the-art model, because
-it was trained on old data (about 2022 and older), so it may not be considered as a reliable fake-news checker
-about military conflicts in Ukraine, Israel, and so on. Please consider, that the names of people in the data were not preprocessed, so
-it might also be biased toward certain names.
-
+Since it's a Bert model, it also exhibits bias. It wouldn't be classified as cutting-edge because it was trained
+on outdated data (pre-2022). This makes it unreliable for fact-checking fake news related to military conflicts in Ukraine,
+Israel, etc. Additionally, the lack of preprocessing for people's names in the data might introduce a bias towards certain names.
 ### Recommendations
 
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
@@ -56,7 +56,7 @@ pipe.predict('Some text')
 
 It will return something like this:
 [{'label': 'LABEL_0', 'score': 0.7248537290096283}]
-Where 'LABEL_0' means false and score means the probability of it.
+Where 'LABEL_0' means false and 'score' stands for the probability of it.
 
 ### Training Data
 
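The hunk above only rewords the explanation of the pipeline output. As a hedged illustration of how that output could be turned into a readable verdict: the repository ID below is a placeholder (the diff does not show the model's actual repo name), and the LABEL_1 = "real" mapping is an assumption inferred from the stated LABEL_0 = false.

```
from transformers import pipeline

# Placeholder repo ID -- substitute this model's actual Hugging Face ID.
pipe = pipeline("text-classification", model="omykhailiv/<model-repo>")

result = pipe("Some text")[0]  # e.g. {'label': 'LABEL_0', 'score': 0.7248...}

# LABEL_0 is documented above as "false"; LABEL_1 = "real" is an assumption.
labels = {"LABEL_0": "fake", "LABEL_1": "real"}
print(labels[result["label"]], result["score"])  # score = probability of the predicted label
```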
 
@@ -65,7 +65,8 @@ https://huggingface.co/datasets/GonzaloA/fake_news
 https://github.com/GeorgeMcIntire/fake_real_news_dataset
 
 #### Preprocessing
-Preprocessing was made by using this function:
+Preprocessing was made by using this function. Note that the data, tested below, was not truncated to
+12 >= len(new_filtered_words) >= 6, but it has still been pre-processed.
 ```
 import re
 import string
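The diff truncates the preprocessing function right after its imports, so only `re` and `string` are visible. Below is a minimal sketch of the kind of cleanup such a function typically performs (lowercasing, stripping URLs, punctuation, and digits); the concrete steps and the 6-to-12-word `new_filtered_words` filter are assumptions based on the note in the hunk, not the card's exact code.

```
import re
import string

def preprocess(text):
    # Hypothetical reconstruction -- the card's actual function is cut off above.
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)                 # drop URLs
    text = text.translate(str.maketrans("", "", string.punctuation))   # drop punctuation
    text = re.sub(r"\d+", " ", text)                                   # drop digits
    return re.sub(r"\s+", " ", text).strip()                           # collapse whitespace

# Per the note above, the evaluation data was cleaned but *not* filtered by the
# 6-to-12-word condition; the training data presumably was.
new_filtered_words = preprocess("Breaking news: officials met on 02/01/2022, see http://example.com").split()
keep_for_training = 6 <= len(new_filtered_words) <= 12
```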
@@ -124,13 +125,6 @@ The following hyperparameters were used during training:
 - weight_decay: 0.03
 - random seed: 42
 
-#### Speeds, Sizes, Times [optional]
-
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
-[More Information Needed]
-
-
 ### Testing Data, Metrics
 
 #### Testing Data
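Only the tail of the hyperparameter list is visible in this hunk (weight_decay 0.03 and random seed 42). As a rough sketch of how those two values would be wired into a transformers training run; every other argument below is an illustrative placeholder, not a value taken from the model card.

```
from transformers import TrainingArguments, set_seed

set_seed(42)  # "random seed: 42" from the hyperparameter list above

# Only weight_decay and the seed come from the card; the rest are placeholders.
training_args = TrainingArguments(
    output_dir="bert-fake-news",        # placeholder
    weight_decay=0.03,                  # from the card
    seed=42,                            # from the card
    num_train_epochs=3,                 # placeholder
    per_device_train_batch_size=16,     # placeholder
)
```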
@@ -184,4 +178,4 @@ weighted avg 0.9731 0.9731 0.9731 19996
 
 #### Hardware
 
-Tesla T4 GPU, available for free in Google Collab
+Tesla T4 GPU, available for free in Google Collab
 