Emotion Multilabel Classification Model
A fine-tuned DistilBERT model for multilabel emotion classification on text data.
Model Description
This model is fine-tuned from distilbert-base-uncased for multilabel emotion classification. It can predict multiple emotions simultaneously from a text input, across 14 emotion categories.
Performance
Kaggle Competition Results
- Kaggle Test Score (Macro F1): 0.4214
- Validation Score (Macro F1): 0.4275
- Generalization Gap: 0.0061 (validation Macro F1 minus test Macro F1)
Detailed Metrics
- Macro F1-Score: 0.4275 (validation) / 0.4214 (test)
- Micro F1-Score: 0.4226
- Hamming Loss: 0.1816
- Jaccard Score: 0.2679
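For readers who want to see how these four metrics are defined, here is a minimal pure-Python sketch on multi-hot label matrices. The toy `y_true` and `y_pred` values are invented for illustration and are not taken from this model. The Jaccard score is computed per sample and then averaged, which is an assumption, since the averaging mode is not stated here.

```python
# Toy multi-hot matrices: rows are samples, columns are labels (illustrative only).
y_true = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
y_pred = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]


def label_counts(y_true, y_pred, j):
    """Return (tp, fp, fn) for label column j."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Average of per-label F1 scores; every label counts equally."""
    n_labels = len(y_true[0])
    scores = [f1_from_counts(*label_counts(y_true, y_pred, j)) for j in range(n_labels)]
    return sum(scores) / n_labels


def micro_f1(y_true, y_pred):
    """F1 over counts pooled across all labels; frequent labels dominate."""
    n_labels = len(y_true[0])
    totals = [label_counts(y_true, y_pred, j) for j in range(n_labels)]
    tp, fp, fn = (sum(c[k] for c in totals) for k in range(3))
    return f1_from_counts(tp, fp, fn)


def hamming_loss(y_true, y_pred):
    """Fraction of individual label decisions that are wrong."""
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / (len(y_true) * len(y_true[0]))


def samples_jaccard(y_true, y_pred):
    """Mean over samples of |intersection| / |union| of the label sets."""
    scores = []
    for tr, pr in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(tr, pr) if t and p)
        union = sum(1 for t, p in zip(tr, pr) if t or p)
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)


print(macro_f1(y_true, y_pred), micro_f1(y_true, y_pred))
print(hamming_loss(y_true, y_pred), samples_jaccard(y_true, y_pred))
```

Macro F1 weights rare and frequent emotions equally, which is why it is usually the harder number to raise on an imbalanced label set.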
Model Architecture
- Base Model: distilbert-base-uncased
- Parameters: 66,373,646 (~66M)
- Architecture: DistilBERT + Custom Classification Head
- Dropout: 0.3
- Loss Function: BCEWithLogitsLoss with class weights
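The custom head itself is not shown in this card. The sketch below is one design consistent with the listed numbers: the bare distilbert-base-uncased encoder has 66,362,880 parameters, and the remaining 10,766 equal 768 × 14 + 14, the size of a single linear layer from the 768-dimensional hidden state to 14 logits. The class name `EmotionHead`, the use of the first-token hidden state, and the choice of `pos_weight` (rather than `weight`) for the class weights are all assumptions made for illustration.

```python
import torch
from torch import nn

HIDDEN_SIZE = 768  # hidden size of distilbert-base-uncased
NUM_LABELS = 14    # emotion categories listed in this card


class EmotionHead(nn.Module):
    """Hypothetical head: dropout followed by one linear layer."""

    def __init__(self, hidden_size=HIDDEN_SIZE, num_labels=NUM_LABELS, dropout=0.3):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, cls_hidden):
        # cls_hidden: (batch, hidden_size) -> logits: (batch, num_labels)
        return self.classifier(self.dropout(cls_hidden))


head = EmotionHead()
head_params = sum(p.numel() for p in head.parameters())  # 768 * 14 + 14

# Random tensor standing in for the encoder's first-token hidden state (batch of 2).
torch.manual_seed(0)
cls_hidden = torch.randn(2, HIDDEN_SIZE)
logits = head(cls_hidden)

# Placeholder per-label weights; real values would come from label frequencies.
pos_weight = torch.ones(NUM_LABELS)
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Multi-hot targets: several labels may be active for the same example.
targets = torch.zeros(2, NUM_LABELS)
targets[0, 8] = 1.0   # "excitement"
targets[0, 11] = 1.0  # "joy"
loss = loss_fn(logits, targets)

print(head_params, tuple(logits.shape), loss.item())
```

BCEWithLogitsLoss applies a sigmoid per label internally, so each emotion is treated as an independent yes/no decision rather than competing in a softmax.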
Emotions Supported
The model can predict the following 14 emotions: ['amusement', 'anger', 'annoyance', 'caring', 'confusion', 'disappointment', 'disgust', 'embarrassment', 'excitement', 'fear', 'gratitude', 'joy', 'love', 'sadness']
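A small sketch of how this label list can be mapped to and from multi-hot vectors. It assumes the output indices follow the order of the list above, which the card does not state explicitly. The helper names `encode` and `decode` are hypothetical.

```python
EMOTIONS = [
    "amusement", "anger", "annoyance", "caring", "confusion",
    "disappointment", "disgust", "embarrassment", "excitement",
    "fear", "gratitude", "joy", "love", "sadness",
]

# Assumed index order: position in the list above.
label2id = {name: i for i, name in enumerate(EMOTIONS)}
id2label = {i: name for name, i in label2id.items()}


def encode(labels):
    """Turn a list of emotion names into a 14-dimensional multi-hot vector."""
    vec = [0] * len(EMOTIONS)
    for name in labels:
        vec[label2id[name]] = 1
    return vec


def decode(probs, threshold=0.5):
    """Return the emotion names whose probability exceeds the threshold."""
    return [id2label[i] for i, p in enumerate(probs) if p > threshold]


target = encode(["joy", "excitement"])
print(target)
print(decode(target))
```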
Usage
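from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Label order as listed under "Emotions Supported" (assumed to match the output indices)
emotion_columns = ['amusement', 'anger', 'annoyance', 'caring', 'confusion', 'disappointment', 'disgust',
                   'embarrassment', 'excitement', 'fear', 'gratitude', 'joy', 'love', 'sadness']

# Load model and tokenizer.
# AutoModel returns hidden states only, with no logits. The class below assumes the
# checkpoint was saved with a sequence-classification head; if it uses the custom head
# instead, load it with the model class from the training code.
repo_id = "your-username/emotion-multilabel-distilbert"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()  # disable dropout for inference

# Example usage
text = "I'm so happy and excited about this project!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)  # independent probability per emotion

# Get emotions above threshold (0.5)
emotions = [emotion_columns[i] for i, prob in enumerate(predictions[0]) if prob > 0.5]
print(f"Predicted emotions: {', '.join(emotions)}")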
Training Details
- Training Duration: ~13 minutes
- Hardware: Tesla T4 GPU
- Epochs: 3
- Batch Size: 16
- Learning Rate: 2e-5
- Max Sequence Length: 128
- Optimizer: AdamW
- Class Weights: Applied for imbalanced dataset
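The exact weighting scheme is not given here. One common choice, sketched below as an assumption, is to weight each label's positive examples by the ratio of negatives to positives, so that rare emotions contribute more to the loss. The function name `positive_class_weights` and the toy label matrix are hypothetical.

```python
def positive_class_weights(label_matrix):
    """Per-label weight = (number of negatives) / (number of positives)."""
    n_samples = len(label_matrix)
    n_labels = len(label_matrix[0])
    weights = []
    for j in range(n_labels):
        positives = sum(row[j] for row in label_matrix)
        negatives = n_samples - positives
        # Fall back to 1.0 for a label that never occurs, to avoid dividing by zero.
        weights.append(negatives / positives if positives else 1.0)
    return weights


# Toy data: label 0 is frequent (3 of 4 samples), label 1 is rare (1 of 4).
toy_labels = [
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 0],
]
weights = positive_class_weights(toy_labels)
print(weights)
```

A list like this could be converted with `torch.tensor(weights)` and passed as the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`.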
Dataset Statistics
- Training Samples: 37,164
- Validation Samples: 9,291
- Test Samples: 8,199
- Average Labels per Sample: 3.21
- Most Common Pattern: 2-4 emotions per text
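The figures above, combined with the batch size of 16 and 3 epochs from the training details, imply the following split ratio and step counts. This is plain arithmetic on the reported numbers. It assumes the last partial batch is kept and that no gradient accumulation was used, neither of which is stated in this card.

```python
import math

train_samples = 37_164
val_samples = 9_291
test_samples = 8_199
batch_size = 16
epochs = 3

# Share of the labeled data held out for validation.
labeled = train_samples + val_samples
val_fraction = val_samples / labeled

# Optimizer steps, assuming the final partial batch is kept.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"labeled samples:   {labeled}")
print(f"validation share:  {val_fraction:.2%}")
print(f"steps per epoch:   {steps_per_epoch}")
print(f"total train steps: {total_steps}")
```

The validation share works out to an 80/20 train/validation split of the labeled data.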
Performance Analysis
Strengths
- Good generalization (small validation-test gap)
- Reasonable multilabel predictions (avg 3.21 labels)
- Fast inference (~9 iterations/second)
- Memory efficient (66M parameters)
Areas for Improvement
- Macro F1 could be improved with hyperparameter tuning
- Class imbalance handling could be optimized
- Ensemble methods could boost performance
License
This model is released under the MIT License.
Evaluation results
- Macro F1-Score (Kaggle Test) on Emotion Classification Dataset (self-reported): 0.421
- Macro F1-Score (Validation) on Emotion Classification Dataset (self-reported): 0.427
- Micro F1-Score on Emotion Classification Dataset (self-reported): 0.423
- Hamming Loss on Emotion Classification Dataset (self-reported): 0.182
- Jaccard Score on Emotion Classification Dataset (self-reported): 0.268