---
language: en
license: apache-2.0
tags:
- text-classification
- sentiment
- distilbert
- sst2
- fine-tuning
- transformers
- huggingface
pipeline_tag: text-classification
widget:
- text: I really enjoyed this movie!
- text: This was an absolute waste of time.
datasets:
- gimmaru/glue-sst2
base_model:
- distilbert/distilbert-base-uncased
---

# SST-2 Demo: DistilBERT fine-tuned on 5% of SST-2

This model is a minimal demonstration of fine-tuning [`distilbert-base-uncased`](https://huggingface.co/distilbert-base-uncased) on the [SST-2 (Stanford Sentiment Treebank v2)](https://huggingface.co/datasets/glue/viewer/sst2/train) dataset.

It was trained for **1 epoch on 5% of the training set (~3,300 examples)** using a consumer GPU (RTX 4060) and mixed precision (`fp16=True`). The model achieves ~86% validation accuracy in under 10 minutes.
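
For quick inference, the model can be loaded through the `pipeline` API. A minimal sketch follows; the repo id `akryshtal/sst2-demo` is a placeholder for illustration, so substitute this model's actual hub id:

```python
from transformers import pipeline

# Placeholder repo id for illustration; substitute this model's actual hub id.
classifier = pipeline("text-classification", model="akryshtal/sst2-demo")

print(classifier("I really enjoyed this movie!"))
# Label names depend on this model's id2label config
# (LABEL_0 / LABEL_1 unless they were renamed before upload).
```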

---

## Evaluation

This model was evaluated on the SST-2 validation set after training for one epoch on 5% of the training data (~3,300 examples).

| Metric   | Value |
|----------|-------|
| Accuracy | ~86%  |
| Loss     | ~0.35 |

> Evaluated with an accuracy `compute_metrics` function passed to the Hugging Face `Trainer` (which reports only loss by default). These results are not representative of full training performance due to the limited data and short training schedule.
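
For reference, a typical accuracy `compute_metrics` function of the kind described above looks like the sketch below. This uses the `evaluate` library; the exact function used for this run is not included in the repo:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) tuple supplied by the Trainer.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```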

---

**Not intended for production use.** This version is trained on a tiny subset of the dataset.

---

## Training details

* **Base model**: [`distilbert-base-uncased`](https://huggingface.co/distilbert-base-uncased)
* **Dataset**: [GLUE/SST-2](https://huggingface.co/datasets/glue/viewer/sst2/train) (5% of the training split)
* **Hardware**: NVIDIA RTX 4060 Laptop GPU
* **Batch size**: 32
* **Epochs**: 1
* **Precision**: mixed precision (`fp16`)
* **Trainer**: Hugging Face `Trainer` (a sketch of the setup follows below)
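
The exact training script is not part of this repo. The following is a minimal sketch of an equivalent setup, assuming the canonical `glue`/`sst2` dataset, a max sequence length of 128, and default `Trainer` hyperparameters for everything not listed above:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# 5% of the SST-2 training split (~3,300 examples) and the full validation split.
train_ds = load_dataset("glue", "sst2", split="train[:5%]")
eval_ds = load_dataset("glue", "sst2", split="validation")

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def tokenize(batch):
    # max_length=128 is an assumption; SST-2 sentences are short.
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)
eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sst2-demo",
    per_device_train_batch_size=32,
    num_train_epochs=1,
    fp16=True,  # mixed precision, as used on the RTX 4060
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
)
trainer.train()
trainer.evaluate()
```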

---

## Files

* `pytorch_model.bin` – model weights
* `config.json` – architecture details
* `tokenizer.json`, `vocab.txt`, etc. – tokenizer files from `distilbert-base-uncased`

---

## Author

This model was fine-tuned and published by [@akryshtal](https://huggingface.co/akryshtal) as part of a machine learning engineering demo project.