SST-2 Demo: DistilBERT fine-tuned on 5% of SST-2

This model is a minimal demonstration of fine-tuning distilbert-base-uncased on the SST-2 (Stanford Sentiment Treebank v2) dataset.

It was trained for 1 epoch on 5% of the training set (~3,300 examples) on a consumer GPU (RTX 4060) with mixed-precision training (fp16=True). Training completes in under 10 minutes and reaches ~86% validation accuracy.
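
For a quick check, the checkpoint can be loaded through the transformers pipeline API. A minimal sketch (the example sentence is illustrative, and the predicted label names follow whatever id2label mapping is stored in config.json):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub.
classifier = pipeline("text-classification", model="akryshtal/sst2-demo")

print(classifier("a gorgeous, witty, seductive movie"))
# Example output (score will vary): [{'label': 'LABEL_1', 'score': 0.98}]
# Label names depend on the id2label mapping in config.json.
```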


📊 Evaluation

This model was evaluated on the SST-2 validation set after training for one epoch on 5% of the training data (~3,300 examples).

  • Accuracy: ~86%
  • Loss: ~0.35

Evaluated with the Hugging Face Trainer's evaluate() method using an accuracy-based compute_metrics function. These results are not representative of full-training performance due to the limited data and short training schedule.
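
A compute_metrics function of the kind used here can be sketched as follows (a minimal sketch assuming the evaluate library; the exact function used during training is not included in this repo):

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes a (logits, labels) pair for the evaluation set.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```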


Not intended for production use: this version is trained on a tiny subset of the dataset.


🗃️ Training details

  • Base model: distilbert-base-uncased
  • Dataset: GLUE/SST-2 (5%)
  • Hardware: NVIDIA RTX 4060 Laptop GPU
  • Batch size: 32
  • Epochs: 1
  • Precision: mixed-precision (fp16)
  • Trainer: Hugging Face Trainer
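
A minimal training sketch matching this configuration (the subsampling seed, output directory, and max_length are illustrative assumptions, and compute_metrics is the function sketched in the Evaluation section):

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# GLUE/SST-2: take ~5% of the train split (~3,300 examples) and the full validation split.
raw = load_dataset("glue", "sst2")
train_ds = raw["train"].shuffle(seed=42).select(range(int(0.05 * len(raw["train"]))))
eval_ds = raw["validation"]

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)
eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sst2-demo",          # illustrative output path
    per_device_train_batch_size=32,  # batch size 32
    num_train_epochs=1,              # 1 epoch
    fp16=True,                       # mixed precision
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    compute_metrics=compute_metrics,  # as sketched in the Evaluation section
)
trainer.train()
print(trainer.evaluate())
```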

📎 Files

  • pytorch_model.bin β€” model weights
  • config.json β€” architecture details
  • tokenizer.json, vocab.txt, etc. β€” tokenizer files from distilbert-base-uncased
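
For manual inference without the pipeline wrapper, the same files can be loaded through the Auto classes. A minimal sketch (check config.json's id2label for the meaning of each class index):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("akryshtal/sst2-demo")
model = AutoModelForSequenceClassification.from_pretrained("akryshtal/sst2-demo")
model.eval()

inputs = tokenizer("an utterly charming film", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities; for SST-2, index 1 is conventionally "positive",
# but the authoritative mapping is config.json's id2label.
probs = torch.softmax(logits, dim=-1)
print(probs)
```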

✍️ Author

This model was fine-tuned and published by @akryshtal as part of a machine learning engineering demo project.
