---
language: en
tags:
- shakespeare
- gpt2
- text-generation
- english
license: mit
datasets:
- shakespeare
---

# Shakespeare GPT-2

A GPT-2 model fine-tuned on Shakespeare's complete works to generate Shakespeare-style text.

## Model Description

This model is a fine-tuned version of GPT-2 (124M parameters) trained on Shakespeare's complete works. It can generate text in Shakespeare's distinctive style, including dialogue, soliloquies, and dramatic prose.

### Model Architecture

- Base Model: GPT-2 (124M parameters)
- Layers: 12
- Attention Heads: 12
- Embedding Dimension: 768
- Context Length: 1024 tokens
- Total Parameters: ~124M
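
As a sanity check, the ~124M figure can be reproduced from the architecture numbers above. This is a rough sketch assuming the standard GPT-2 vocabulary of 50,257 BPE tokens and tied input/output embeddings:

```python
# Rough parameter count for GPT-2 small, from the architecture above.
# Assumes vocab_size=50257 (standard GPT-2 BPE) and tied embeddings.
n_layer, n_embd, n_ctx, vocab = 12, 768, 1024, 50257

embeddings = vocab * n_embd + n_ctx * n_embd       # token + positional
per_layer = (
    3 * n_embd * n_embd + 3 * n_embd               # Q, K, V projections
    + n_embd * n_embd + n_embd                     # attention output
    + n_embd * 4 * n_embd + 4 * n_embd             # MLP up-projection
    + 4 * n_embd * n_embd + n_embd                 # MLP down-projection
    + 4 * n_embd                                   # two LayerNorms
)
total = embeddings + n_layer * per_layer + 2 * n_embd  # + final LayerNorm
print(f"{total:,}")  # 124,439,808, i.e. ~124M
```

This matches the commonly cited parameter count for GPT-2 small.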

### Training Details

- Dataset: Complete works of Shakespeare
- Training Steps: 100,000
- Batch Size: 4
- Sequence Length: 32
- Learning Rate: 3e-4
- Optimizer: AdamW
- Device: MPS/CUDA/CPU
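
The hyperparameters above correspond to a conventional language-model fine-tuning loop. The sketch below is illustrative, not the repository's actual training script: `get_batch` is a hypothetical helper, and the model is assumed to return an object with a `.logits` tensor (as `GPT2LMHeadModel` does).

```python
import torch
import torch.nn.functional as F

def train(model, get_batch, steps=100_000, batch_size=4, seq_len=32, lr=3e-4):
    """Illustrative fine-tuning loop using the hyperparameters listed above."""
    # Pick MPS, CUDA, or CPU, matching the "Device" entry above.
    device = (
        "mps" if torch.backends.mps.is_available()
        else "cuda" if torch.cuda.is_available()
        else "cpu"
    )
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        # get_batch is assumed to yield (input, target) token-ID tensors
        # of shape (batch_size, seq_len).
        x, y = get_batch(batch_size, seq_len)
        logits = model(x.to(device)).logits            # (B, T, vocab)
        loss = F.cross_entropy(
            logits.view(-1, logits.size(-1)), y.to(device).view(-1)
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.item()
```
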

## Intended Use

This model is intended for:

- Generating Shakespeare-style text
- Creative writing assistance
- Educational purposes in literature
- Entertainment and artistic projects

## Limitations

- May generate text that mimics but does not perfectly replicate Shakespeare's style
- Limited by the training data to Shakespeare's vocabulary and themes
- Can produce anachronistic or inconsistent content
- Maximum context length of 1024 tokens

## Training Data

The model was trained on Shakespeare's complete works, including:

- All plays (comedies, tragedies, histories)
- Sonnets and poems
- Total training tokens: [Insert number of tokens]

## Performance

The model achieves:

- Training Loss: [Insert final training loss]
- Best Loss: [Insert best loss achieved]

## Example Usage

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load model and tokenizer
model_name = "your-username/shakespeare-gpt"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Generate text
prompt = "To be, or not to be,"
input_ids = tokenizer.encode(prompt, return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=500,
    temperature=0.8,
    top_k=40,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

## Sample Outputs

Prompt: "To be, or not to be,"

Output: [Insert sample generation]

Prompt: "Friends, Romans, countrymen,"

Output: [Insert sample generation]