BEiT-Large for Emotion Detection on AffectNet
This model is fine-tuned from microsoft/beit-large-patch16-224-pt22k-ft22k
for facial emotion recognition using a cleaned and balanced version of the AffectNet dataset.
Classes
The model predicts the following 7 basic emotion classes:
- anger
- disgust
- fear
- happy
- neutral
- sad
- surprise
Dataset Overview
Emotion | Train Samples | Test Samples |
---|---|---|
anger | 1500 | 1718 |
disgust | 1229 | 1248 |
fear | 1512 | 1664 |
happy | 2340 | 2704 |
neutral | 2758 | 2368 |
sad | 3091 | 1584 |
surprise | 2119 | 1920 |
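The totals and per-class shares implied by the table can be computed directly. The following is a minimal sketch in plain Python; the counts are copied from the table, and the variable names are illustrative rather than taken from the original training code.

```python
# Per-class sample counts copied from the table above: (train, test)
counts = {
    "anger": (1500, 1718),
    "disgust": (1229, 1248),
    "fear": (1512, 1664),
    "happy": (2340, 2704),
    "neutral": (2758, 2368),
    "sad": (3091, 1584),
    "surprise": (2119, 1920),
}

# Total number of samples in each split
train_total = sum(train for train, _ in counts.values())
test_total = sum(test for _, test in counts.values())
print(f"train total: {train_total}, test total: {test_total}")

# Share of each class within its split
for emotion, (train, test) in counts.items():
    print(f"{emotion:<9} train {train / train_total:6.1%}  test {test / test_total:6.1%}")
```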
Training Metrics
Epoch | Training Loss | Validation Loss | Accuracy |
---|---|---|---|
1 | 0.4552 | 0.5809 | 0.6917 |
2 | 0.3000 | 0.6669 | 0.7079 |
3 | 0.1473 | 0.7098 | 0.7378 |
4 | 0.0674 | 0.8904 | 0.7353 |
5 | 0.0291 | 0.9008 | 0.7452 |
6 | 0.0216 | 0.9844 | 0.7503 |
7 | 0.0118 | 1.0369 | 0.7522 |
8 | 0.0069 | 1.0992 | 0.7486 |
9 | 0.0035 | 1.0947 | 0.7482 |
10 | 0.0023 | 1.1336 | 0.7461 |
Final Accuracy: ~74.6% on the test set
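When `load_best_model_at_end=True` is combined with `metric_for_best_model="accuracy"`, the Trainer reloads the checkpoint with the highest evaluation accuracy rather than the last one. The sketch below, in plain Python, finds that epoch from the table above. It only re-reads the reported numbers; which checkpoint was actually published is not stated here.

```python
# (epoch, training_loss, validation_loss, accuracy) copied from the table above
history = [
    (1, 0.4552, 0.5809, 0.6917),
    (2, 0.3000, 0.6669, 0.7079),
    (3, 0.1473, 0.7098, 0.7378),
    (4, 0.0674, 0.8904, 0.7353),
    (5, 0.0291, 0.9008, 0.7452),
    (6, 0.0216, 0.9844, 0.7503),
    (7, 0.0118, 1.0369, 0.7522),
    (8, 0.0069, 1.0992, 0.7486),
    (9, 0.0035, 1.0947, 0.7482),
    (10, 0.0023, 1.1336, 0.7461),
]

# Epoch with the highest accuracy (the selection criterion named in the config)
best_epoch, _, _, best_accuracy = max(history, key=lambda row: row[3])

# Epoch with the lowest validation loss, for comparison
lowest_val_loss_epoch = min(history, key=lambda row: row[2])[0]

print(f"best accuracy: {best_accuracy:.4f} at epoch {best_epoch}")
print(f"lowest validation loss at epoch {lowest_val_loss_epoch}")
```

In this table, validation loss rises after the first epoch while training loss keeps falling, so accuracy and validation loss point to different "best" epochs.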
Training Configuration
The model was trained using the Hugging Face Trainer with the following main arguments:
- num_train_epochs=10
- per_device_train_batch_size=64
- per_device_eval_batch_size=64
- gradient_accumulation_steps=2
- learning_rate=5e-5
- fp16=True (mixed precision training)
- eval_strategy="epoch"
- save_strategy="epoch"
- save_total_limit=2
- load_best_model_at_end=True
- metric_for_best_model="accuracy"
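As a rough sketch, the arguments above can be collected into a dictionary and passed to `transformers.TrainingArguments`. The `output_dir` value is a hypothetical path that does not come from this model card, and the rest of the training script (dataset loading, metric function, `Trainer` construction) is not shown in the source. The snippet itself uses only plain Python so that it runs without a GPU.

```python
# Arguments as listed above. "output_dir" is a hypothetical path (an assumption).
training_kwargs = {
    "output_dir": "./beit-large-affectnet",
    "num_train_epochs": 10,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 64,
    "gradient_accumulation_steps": 2,
    "learning_rate": 5e-5,
    "fp16": True,  # mixed precision; requires a supported GPU
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    "save_total_limit": 2,
    "load_best_model_at_end": True,
    "metric_for_best_model": "accuracy",
}

# Gradients are accumulated over 2 batches of 64, so each optimizer step
# sees 128 samples per device.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print("Effective batch size per device:", effective_batch_size)

# With transformers installed and a GPU available, these could be used as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
```

Older `transformers` releases name the evaluation argument `evaluation_strategy` instead of `eval_strategy`, so the key may need adjusting for the installed version.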
Confusion Matrix
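A confusion matrix for this 7-class task can be computed from true and predicted label indices. The following is a minimal sketch in plain Python. The labels in `y_true` and `y_pred` are toy values made up for illustration, not the model's actual predictions, and the class order is an assumption that follows the class list above.

```python
# Assumed class order (alphabetical, as in the class list above)
LABELS = ["anger", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def confusion_matrix(y_true, y_pred, num_classes):
    """Return a matrix where entry [t][p] counts samples of true class t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Return the fraction of each true class that was predicted correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


# Toy label indices for illustration only (not real results)
y_true = [0, 0, 3, 3, 3, 5, 6]
y_pred = [0, 2, 3, 3, 4, 5, 6]

cm = confusion_matrix(y_true, y_pred, len(LABELS))
recalls = per_class_recall(cm)
for label, recall in zip(LABELS, recalls):
    print(f"{label:<9} recall {recall:.2f}")
```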
How to Use
from transformers import BeitImageProcessor, BeitForImageClassification
from PIL import Image
import torch
# Replace with the path to your own image
image_path = '/RAF-DB/aligned/test_0031_aligned.jpg'
image = Image.open(image_path).convert("RGB")
# Load the image processor and the fine-tuned model from the Hub
processor = BeitImageProcessor.from_pretrained("Tanneru/Facial-Emotion-Detection-BEIT-Large")
model = BeitForImageClassification.from_pretrained("Tanneru/Facial-Emotion-Detection-BEIT-Large")
# Preprocess the image and run inference without tracking gradients
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
# Map the highest-scoring logit to its label name
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
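To report a confidence next to the predicted label, the logits can be converted to probabilities with a softmax. The sketch below uses plain Python with made-up logits standing in for `outputs.logits[0].tolist()`. The label order in `LABELS` is an assumption (alphabetical, as in the class list); the authoritative mapping is `model.config.id2label`.

```python
import math

# Assumed label order; check model.config.id2label for the real mapping
LABELS = ["anger", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for illustration only
example_logits = [0.2, -1.3, 0.1, 3.4, 1.1, -0.5, 0.7]
for label, prob in top_k(example_logits, LABELS):
    print(f"{label}: {prob:.3f}")
```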
License
This model is released under the Apache 2.0 License. You are free to use, modify, and distribute the model with attribution.
Author
- Username: Tanneru
- Base model: microsoft/beit-large-patch16-224-pt22k-ft22k
Citation
If you use this model in your work, please cite:
@misc{tanneru2025beit_affectnet,
title={BEiT-Large fine-tuned on AffectNet for Emotion Detection},
author={Tanneru},
year={2025},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/Tanneru/Facial-Emotion-Detection-BEIT-Large}},
}
@article{bao2021beit,
author = {Hangbo Bao and Li Dong and Furu Wei},
title = {BEiT: BERT Pre-Training of Image Transformers},
journal = {CoRR},
volume = {abs/2106.08254},
year = {2021},
url = {https://arxiv.org/abs/2106.08254},
archivePrefix = {arXiv},
eprint = {2106.08254},
bibsource = {dblp computer science bibliography, https://dblp.org}
}