---
title: Fine-Tuned BERT Model
emoji: 🌍
colorFrom: blue
colorTo: purple
sdk: docker
pinned: true
---
# Fine-Tuned BERT Model for Climate Disinformation Classification

## Model Description
This is a BERT model fine-tuned for the Frugal AI Challenge 2024. It was trained on the climate disinformation dataset to classify text inputs into 8 distinct categories of climate disinformation claims, leveraging BERT's pretrained language understanding and optimized for accuracy in this domain.
## Intended Use
- Primary intended uses: Classifying text inputs to detect specific claims of climate disinformation (see the usage sketch after this list)
- Primary intended users: Researchers, developers, and participants in the Frugal AI Challenge
- Out-of-scope use cases: Not recommended for tasks outside climate disinformation classification or production-level applications without further evaluation
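A minimal usage sketch with the 🤗 Transformers pipeline is shown below; the repository id is a placeholder, so substitute the actual model path or Hub id.

```python
from transformers import pipeline

# Placeholder repository id -- replace with the actual model path or Hub id.
classifier = pipeline(
    "text-classification",
    model="your-username/bert-climate-disinformation",
)

print(classifier("Global temperatures have not risen in decades."))
# e.g. [{'label': '...', 'score': 0.97}]
```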
## Training Data

The model uses the `QuotaClimat/frugalaichallenge-text-train` dataset (a loading sketch follows this list):
- Size: ~6000 examples
- Split: 80% train, 20% test
- 8 categories of climate disinformation claims
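As a sketch, the dataset can be pulled from the Hugging Face Hub and split with the `datasets` library; the code below assumes the data ships as a single train split, and the seed is illustrative.

```python
from datasets import load_dataset

# Load the challenge dataset from the Hugging Face Hub.
raw = load_dataset("QuotaClimat/frugalaichallenge-text-train")

# Reproduce an 80/20 train/test split (seed chosen for illustration).
split = raw["train"].train_test_split(test_size=0.2, seed=42)
train_ds, test_ds = split["train"], split["test"]
print(len(train_ds), len(test_ds))
```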
## Labels
- No relevant claim detected
- Global warming is not happening
- Not caused by humans
- Not bad or beneficial
- Solutions harmful/unnecessary
- Science is unreliable
- Proponents are biased
- Fossil fuels are needed
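For illustration, these categories can be wired into an id-to-label mapping; the index order and exact label strings stored in the dataset are assumptions, so check them against the data before reuse.

```python
# Illustrative mapping -- the order and exact strings may differ in the dataset.
id2label = {
    0: "No relevant claim detected",
    1: "Global warming is not happening",
    2: "Not caused by humans",
    3: "Not bad or beneficial",
    4: "Solutions harmful/unnecessary",
    5: "Science is unreliable",
    6: "Proponents are biased",
    7: "Fossil fuels are needed",
}
label2id = {label: idx for idx, label in id2label.items()}
```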
## Performance

### Metrics

- Accuracy: Achieved XX.X% on the test set (replace XX.X% with the actual accuracy from your evaluation)
- Environmental Impact:
  - Carbon emissions tracked in gCO2eq
  - Energy consumption tracked in Wh
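A typical way to compute the test-set accuracy with the Hugging Face `Trainer` is a `compute_metrics` hook like the sketch below (using the `evaluate` library); the actual evaluation code for this model may differ.

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    """Metric hook for the Hugging Face Trainer: argmax over logits, then accuracy."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```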
## Model Architecture

This model fine-tunes the BERT base architecture (`bert-base-uncased`) for the climate disinformation task. The classifier head includes (a configuration sketch follows this list):
- Dense layers
- Dropout for regularization
- Softmax activation for multi-class classification
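Below is a minimal configuration sketch using the standard `transformers` sequence-classification head (dropout plus a dense projection to 8 logits, with softmax applied when converting logits to probabilities); the exact head used for this model may differ.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=8,          # the 8 climate disinformation categories
    # id2label=id2label,   # optional: attach the label names from the list above
)
```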
## Environmental Impact
Environmental impact is tracked using CodeCarbon, measuring:
- Carbon emissions during inference and training
- Energy consumption during inference and training
This tracking aligns with the Frugal AI Challenge's commitment to promoting sustainable AI practices.
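A minimal CodeCarbon sketch wraps a training or inference run in an `EmissionsTracker`; the project name is illustrative.

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="frugal-ai-bert")  # illustrative name
tracker.start()
# ... run training or batched inference here ...
emissions_kg = tracker.stop()  # returns emissions in kg CO2eq
print(f"Emissions: {emissions_kg * 1000:.2f} gCO2eq")
```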
## Limitations
- Fine-tuned specifically for climate disinformation; performance on other text classification tasks may degrade
- Requires computational resources (e.g., GPU) for efficient inference
- Predictions rely on the training dataset's representativeness; may struggle with unseen or out-of-distribution data
## Ethical Considerations
- Dataset contains sensitive topics related to climate disinformation
- Model performance depends on dataset quality and may reflect annotation biases
- Environmental impact during training and inference is disclosed to encourage awareness of AI's carbon footprint
- Users must validate outputs before using in sensitive or high-stakes applications