# SST-2 Demo: DistilBERT fine-tuned on 5% of SST-2
This model is a minimal demonstration of fine-tuning `distilbert-base-uncased` on the SST-2 (Stanford Sentiment Treebank v2) dataset. It was trained for 1 epoch on 5% of the training set (~3,300 examples) using a consumer GPU (RTX 4060) and mixed precision (`fp16=True`). The model achieves ~86% validation accuracy in under 10 minutes.
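As a quick usage sketch, the checkpoint can be loaded with the `transformers` text-classification pipeline. The repo id below is a placeholder, not a confirmed Hub path, and the label names depend on the `id2label` mapping in `config.json`:

```python
from transformers import pipeline

# Placeholder repo id for illustration; replace with the actual model id on the Hub.
classifier = pipeline("text-classification", model="akryshtal/sst2-demo-distilbert")

print(classifier("A charming and often surprisingly funny film."))
# Output looks like [{'label': ..., 'score': ...}]; for SST-2, label 0 = negative, 1 = positive.
```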
## Evaluation
This model was evaluated on the SST-2 validation set after training for one epoch on 5% of the training data (~3,300 examples).
| Metric   | Value |
|----------|-------|
| Accuracy | ~86%  |
| Loss     | ~0.35 |
Evaluated using the Hugging Face `Trainer` with an accuracy-based `compute_metrics` function. These results are not representative of full training performance due to the limited data and short training schedule.
Not intended for production use: this version is trained on a tiny subset of the dataset.
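A minimal sketch of the kind of accuracy-only `compute_metrics` function used for this evaluation (the exact implementation in the training script may differ):

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes a (logits, labels) tuple; take the argmax over the two classes.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```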
## Training details
- Base model: distilbert-base-uncased
- Dataset: GLUE/SST-2 (5%)
- Hardware: NVIDIA RTX 4060 Laptop GPU
- Batch size: 32
- Epochs: 1
- Precision: mixed precision (`fp16`)
- Trainer: Hugging Face `Trainer` (see the sketch below)
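A minimal sketch of how a run with the settings listed above could look. The actual training script is not included in this repository; `output_dir` and the lack of a `compute_metrics` argument are simplifications for illustration:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# 5% of the SST-2 training split (~3,300 examples) plus the full validation split.
train_ds = load_dataset("glue", "sst2", split="train[:5%]")
val_ds = load_dataset("glue", "sst2", split="validation")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True)

train_ds = train_ds.map(tokenize, batched=True)
val_ds = val_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sst2-demo",          # placeholder output path
    per_device_train_batch_size=32,
    num_train_epochs=1,
    fp16=True,                       # mixed precision on the RTX 4060
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    tokenizer=tokenizer,             # enables dynamic padding via the default collator
)

trainer.train()
trainer.evaluate()
```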
## Files
- `pytorch_model.bin`: model weights
- `config.json`: architecture details
- `tokenizer.json`, `vocab.txt`, etc.: tokenizer files from `distilbert-base-uncased`
## Author
This model was fine-tuned and published by @akryshtal as part of a machine learning engineering demo project.