omykhailiv/bert-fake-news-recognition

@@ -1,11 +1,13 @@
 ---
 library_name: transformers
-license: apache-2.0
 language:
 - en
 metrics:
 - accuracy
 pipeline_tag: text-classification
 ---
 # Model Card for Model ID
@@ -34,11 +36,9 @@ This model can be used for whatever reason you need, also a site hosted, based o
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-As a Bert model, this also has bias. It can't be considered as a somewhat state-of-the-art model, because
-it was trained on old data (about 2022 and older), so it may not be considered as a reliable fake-news checker
-about military conflicts in Ukraine, Israel, and so on. Please consider, that the names of people in the data were not preprocessed, so
-it might also be biased toward certain names.
 ### Recommendations
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
@@ -56,7 +56,7 @@ pipe.predict('Some text')
 It will return something like this:
 [{'label': 'LABEL_0', 'score': 0.7248537290096283}]
-Where 'LABEL_0' means false and score means the probability of it.
 ### Training Data
@@ -65,7 +65,8 @@ https://huggingface.co/datasets/GonzaloA/fake_news
 https://github.com/GeorgeMcIntire/fake_real_news_dataset
 #### Preprocessing
-Preprocessing was made by using this function:
 ```
 import re
 import string
@@ -124,13 +125,6 @@ The following hyperparameters were used during training:
  - weight_decay: 0.03
  - random seed: 42
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
 ### Testing Data, Metrics
 #### Testing Data
@@ -184,4 +178,4 @@ weighted avg     0.9731    0.9731    0.9731     19996
 #### Hardware
-Tesla T4 GPU, available for free in Google Collab

 ---
 library_name: transformers
+license: mit
 language:
 - en
 metrics:
 - accuracy
 pipeline_tag: text-classification
+tags:
+- fake news
 ---
 # Model Card for Model ID
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+Like any BERT model, this one exhibits bias. It should not be considered state-of-the-art, because it was trained
+on outdated data (pre-2022). This makes it unreliable for checking news about the military conflicts in Ukraine,
+Israel, and so on. Additionally, people's names in the data were not preprocessed, so the model might be biased toward certain names.
 ### Recommendations
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 It will return something like this:
 [{'label': 'LABEL_0', 'score': 0.7248537290096283}]
+Here 'LABEL_0' means false, and 'score' is the model's probability for that label.
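A minimal sketch of how the returned list might be turned into a readable result. The names `LABEL_NAMES` and `describe_prediction` are hypothetical helpers, not part of the model or of `transformers`. The card only states that 'LABEL_0' means false; reading 'LABEL_1' as true is an assumption made here for illustration.

```python
from typing import Dict, List, Union

# 'LABEL_0' -> "false" comes from the model card.
# 'LABEL_1' -> "true" is an assumption, not stated in the card.
LABEL_NAMES: Dict[str, str] = {"LABEL_0": "false", "LABEL_1": "true"}


def describe_prediction(outputs: List[Dict[str, Union[str, float]]]) -> str:
    """Format the first prediction of a text-classification result."""
    top = outputs[0]
    # Fall back to the raw label if it is not in the mapping.
    name = LABEL_NAMES.get(str(top["label"]), str(top["label"]))
    return f"{name} ({float(top['score']):.1%})"


# The sample output shown above, reused as input.
example = [{"label": "LABEL_0", "score": 0.7248537290096283}]
print(describe_prediction(example))  # prints: false (72.5%)
```

The function only reads the first element of the list, which matches the single-text call shown above.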
 ### Training Data
 https://github.com/GeorgeMcIntire/fake_real_news_dataset
 #### Preprocessing
+Preprocessing was done with the function below. Note that the data tested below was not filtered to
+6 <= len(new_filtered_words) <= 12, but it was still preprocessed.
 ```
 import re
 import string
  - weight_decay: 0.03
  - random seed: 42
 ### Testing Data, Metrics
 #### Testing Data
 #### Hardware
+Tesla T4 GPU, available for free in Google Colab