# 🧠 SLM-GPT2: Tiny Shakespeare GPT-2 Model
**SLM-GPT2** is a small GPT-2-like language model trained from scratch on the Tiny Shakespeare dataset. It's a toy model meant for educational purposes, experimentation, and understanding how transformer-based language models work.
## ✨ Model Details
- Architecture: GPT-2 (custom config; see the sketch after this list)
- Layers: 4
- Hidden size: 256
- Heads: 4
- Max sequence length: 128
- Vocabulary size: same as the tokenizer (based on `distilgpt2` or a custom tokenizer)
- Training epochs: 3
- Dataset: `tiny_shakespeare`
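
The hyperparameters above map directly onto a `GPT2Config`. The training script is not included in this card, so the snippet below is only a sketch of how such a model could be instantiated; the `distilgpt2` tokenizer is one of the two options listed above, not a confirmed detail.

```python
from transformers import AutoTokenizer, GPT2Config, GPT2LMHeadModel

# Tokenizer choice is an assumption; the card says "distilgpt2 or custom".
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

# Config fields filled in from the list above.
config = GPT2Config(
    vocab_size=len(tokenizer),  # same vocabulary size as the tokenizer
    n_positions=128,            # max sequence length
    n_embd=256,                 # hidden size
    n_layer=4,                  # transformer layers
    n_head=4,                   # attention heads
)

model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```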
## 🧪 Intended Use
- Educational demos
- Debugging/training pipeline validation
- Low-resource inference tests (see the smoke-test sketch below)
- Not suitable for production use or high-quality text generation
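
For the pipeline-validation and low-resource use cases, a quick CPU smoke test is usually enough. The sketch below is illustrative only; `your-username/slm-gpt2` is the same placeholder repo id used in the usage example further down.

```python
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the real one.
model = AutoModelForCausalLM.from_pretrained("your-username/slm-gpt2")
tokenizer = AutoTokenizer.from_pretrained("your-username/slm-gpt2")
model.eval()

inputs = tokenizer("To be or not to be", return_tensors="pt")

start = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
elapsed = time.perf_counter() - start

print(tokenizer.decode(out[0], skip_special_tokens=True))
print(f"Generated up to 32 new tokens in {elapsed:.2f}s on CPU")
```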
## 🚫 Limitations
- Trained on a tiny dataset (the Tiny Shakespeare corpus, roughly 1 MB of text)
- Limited vocabulary and generalization
- Can generate incoherent or biased outputs
- Not safe for deployment in real-world applications
## 💻 How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Replace "your-username/slm-gpt2" with the actual repo id on the Hub.
model = AutoModelForCausalLM.from_pretrained("your-username/slm-gpt2")
tokenizer = AutoTokenizer.from_pretrained("your-username/slm-gpt2")

# Wrap model and tokenizer in a text-generation pipeline and generate a completion.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
output = generator("To be or not to be", max_length=50)
print(output[0]["generated_text"])
```
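
Output quality from a model this small varies a lot with decoding settings. The values below are illustrative defaults rather than tuned settings from the training run; they are standard `generate` arguments forwarded by the pipeline.

```python
# Sampling settings are illustrative, not values used during training.
samples = generator(
    "To be or not to be",
    max_new_tokens=50,       # generate up to 50 new tokens after the prompt
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.8,         # soften the next-token distribution slightly
    top_k=50,                # restrict sampling to the 50 most likely tokens
    num_return_sequences=2,  # draw two independent samples
)
for s in samples:
    print(s["generated_text"])
```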