TinyStories-GPT

This is a small GPT-like model trained from scratch on the TinyStories dataset, using a NanoGPT-style training loop implemented in PyTorch.

Model Details

  • Architecture: 6 transformer layers, 6 attention heads, hidden size 384
  • Context length: 128 tokens
  • Vocabulary size: 50257 (GPT-2 BPE tokenizer)
  • Dataset: TinyStories
  • Training: ~20k steps with AdamW and cosine learning-rate decay (see the config sketch below)
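
A minimal sketch of how these hyperparameters could map onto a NanoGPT-style config and optimizer setup. The GPTConfig field names follow NanoGPT's conventions, and the learning rate and weight decay values are illustrative assumptions, not this repo's actual settings:

import torch
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_layer: int = 6          # 6 transformer layers
    n_head: int = 6           # 6 attention heads per layer
    n_embd: int = 384         # hidden (embedding) size
    block_size: int = 128     # context length in tokens
    vocab_size: int = 50257   # GPT-2 BPE vocabulary

config = GPTConfig()

# Optimizer and schedule as described on the card: AdamW with cosine
# learning-rate decay over ~20k steps. `params` stands in for the real
# model.parameters(); lr and weight_decay are illustrative, not the
# values actually used.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=3e-4, weight_decay=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20_000)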

Example Usage

# Load the tokenizer and model weights from the Hub
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Here2Disrupt/tiny-stories-gpt")
model = AutoModelForCausalLM.from_pretrained("Here2Disrupt/tiny-stories-gpt")

# Encode a story prompt and generate a continuation (greedy decoding by default)
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
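
Greedy decoding (the default above) tends to be repetitive on small models. do_sample, temperature, and top_k are standard transformers generate arguments that switch to sampling; the values below are illustrative, not tuned for this checkpoint:

outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,   # sample rather than pick the argmax token
    temperature=0.8,  # illustrative value, not tuned
    top_k=50,         # illustrative value, not tuned
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))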