---
title: Fine-Tuned BERT Model
emoji: 🌍
colorFrom: blue
colorTo: purple
sdk: docker
pinned: true
---

# Fine-Tuned BERT Model for Climate Disinformation Classification

## Model Description

This is a BERT model fine-tuned for the Frugal AI Challenge 2024. Trained on the climate disinformation dataset, it classifies text inputs into 8 distinct categories of climate disinformation claims, leveraging BERT's pretrained language understanding and optimized for accuracy in this domain.

## Intended Use

- Primary intended uses: Classifying text inputs to detect specific claims of climate disinformation
- Primary intended users: Researchers, developers, and participants in the Frugal AI Challenge
- Out-of-scope use cases: Not recommended for tasks outside climate disinformation classification, or for production-level applications without further evaluation

## Training Data

The model uses the QuotaClimat/frugalaichallenge-text-train dataset:

- Size: ~6000 examples
- Split: 80% train, 20% test
- 8 categories of climate disinformation claims
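
The 80/20 split above can be sketched as follows. Placeholder strings stand in for the actual dataset rows (which in practice would be loaded with `datasets.load_dataset`), and the seed value is an arbitrary choice for reproducibility:

```python
import random

# Placeholder examples standing in for the ~6000 dataset rows.
examples = [f"claim {i}" for i in range(6000)]

random.seed(42)  # arbitrary fixed seed so the split is reproducible
random.shuffle(examples)

# 80% train / 20% test, as described above.
cut = int(0.8 * len(examples))
train_set, test_set = examples[:cut], examples[cut:]
print(len(train_set), len(test_set))  # 4800 1200
```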

### Labels

1. No relevant claim detected
2. Global warming is not happening
3. Not caused by humans
4. Not bad or beneficial
5. Solutions harmful/unnecessary
6. Science is unreliable
7. Proponents are biased
8. Fossil fuels are needed
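
For decoding model outputs, the eight labels can be kept in an id-to-label mapping. A minimal sketch, assuming the indices follow the order of the list above (verify against the dataset's actual label encoding):

```python
# Hypothetical mapping; index order assumed from the label list above.
ID2LABEL = {
    0: "No relevant claim detected",
    1: "Global warming is not happening",
    2: "Not caused by humans",
    3: "Not bad or beneficial",
    4: "Solutions harmful/unnecessary",
    5: "Science is unreliable",
    6: "Proponents are biased",
    7: "Fossil fuels are needed",
}
# Inverse mapping, useful when encoding gold labels for training.
LABEL2ID = {label: i for i, label in ID2LABEL.items()}
```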

## Performance

### Metrics

- Accuracy: XX.X% on the test set (replace XX.X% with the actual accuracy from your evaluation)
- Environmental Impact:
  - Carbon emissions tracked in gCO2eq
  - Energy consumption tracked in Wh
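
Accuracy here is simply the fraction of test examples whose predicted label id matches the gold label id; a minimal sketch with hypothetical predictions:

```python
def accuracy(preds, golds):
    """Fraction of predictions that match the gold labels."""
    assert len(preds) == len(golds)
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

# Hypothetical label ids: 3 of 4 predictions are correct.
print(accuracy([0, 1, 2, 2], [0, 1, 2, 3]))  # 0.75
```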

## Model Architecture

This model fine-tunes the BERT base architecture (bert-base-uncased) for the climate disinformation task. The classifier head includes:

- Dense layers
- Dropout for regularization
- Softmax activation for multi-class classification
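
The softmax in the last bullet turns the head's 8 logits (one per label) into a probability distribution; a minimal, dependency-free sketch with hypothetical logits:

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from the classifier head, one per label.
logits = [2.0, 0.1, -1.0, 0.5, 0.0, 1.5, -0.5, 0.3]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)  # argmax -> label id
```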

## Environmental Impact

Environmental impact is tracked using CodeCarbon, measuring:

- Carbon emissions during inference and training
- Energy consumption during inference and training

This tracking aligns with the Frugal AI Challenge's commitment to promoting sustainable AI practices.
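
As a sketch, an inference or training run can be wrapped in CodeCarbon's `EmissionsTracker`. The guard below assumes codecarbon may not be installed; note that `tracker.stop()` reports kilograms of CO2eq, so convert to grams for a gCO2eq figure:

```python
try:
    from codecarbon import EmissionsTracker

    # save_to_file=False avoids writing an emissions.csv during this sketch.
    tracker = EmissionsTracker(save_to_file=False, log_level="error")
    tracker.start()
    # ... run model inference or training here ...
    emissions_kg = tracker.stop()  # kg CO2eq; multiply by 1000 for gCO2eq
except ImportError:
    emissions_kg = None  # codecarbon not installed in this environment
```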

## Limitations

- Fine-tuned specifically for climate disinformation; performance on other text classification tasks may degrade
- Requires computational resources (e.g., a GPU) for efficient inference
- Predictions depend on the training dataset's representativeness; the model may struggle with unseen or out-of-distribution data

## Ethical Considerations

- The dataset contains sensitive topics related to climate disinformation
- Model performance depends on dataset quality and annotation biases
- Environmental impact during training and inference is disclosed to encourage awareness of AI's carbon footprint
- Users must validate outputs before using them in sensitive or high-stakes applications