Pretrained toy models, made with Andrej Karpathy's nanoGPT.
nano_35m
- Trained in late 2023 on part of the Tagalog portion of Belebele.
- batch_size = 64
- block_size = 256
- n_layer = 8
- n_head = 8
- n_embd = 768
- All other settings are left at their nanoGPT defaults.
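
For illustration, the nano_35m settings above could be written as a nanoGPT-style config override file. This is only a sketch: nanoGPT config files are plain Python assignments that override the globals in train.py, but the file name and the out_dir value here are hypothetical, and the dataset setting is omitted because it is not given above.

```python
# Hypothetical file: config/train_nano_35m.py (name assumed for illustration).
# Only the values listed for nano_35m are set; anything not assigned here
# keeps whatever default train.py defines.

out_dir = "out-nano_35m"  # hypothetical output directory, not from the model notes

batch_size = 64    # sequences per micro-batch
block_size = 256   # context length in tokens

n_layer = 8        # number of transformer blocks
n_head = 8         # attention heads per block
n_embd = 768       # embedding width; must be divisible by n_head
```

With nanoGPT, a file like this would be passed as the first argument to the training script, for example `python train.py config/train_nano_35m.py`.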
nano_76m
- Trained in January 2024 on part of the Tagalog portion of Belebele.
- batch_size = 64
- block_size = 256
- n_layer = 11
- n_head = 16
- n_embd = 768
- All other settings are left at their nanoGPT defaults.
nano-ito_35m
- Trained in March 2024 on part of the PALITO Tagalog dataset.
- batch_size = 64
- block_size = 256
- n_layer = 11
- n_head = 16
- n_embd = 512
- All other settings are left at their nanoGPT defaults.
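
To compare the three configurations side by side, the sketch below derives a few quantities from the listed values: the per-head dimension (n_embd / n_head), the tokens in one micro-batch (batch_size * block_size), and a rough transformer-block parameter estimate using the common 12 * n_layer * n_embd^2 approximation. The ToyConfig class and its property names are hypothetical helpers, not part of nanoGPT. The estimate ignores embeddings, biases, and layer norms, so it need not match the sizes in the model names, and how those sizes were counted is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical container for the hyperparameters listed above."""
    name: str
    batch_size: int
    block_size: int
    n_layer: int
    n_head: int
    n_embd: int

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits n_embd evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head

    @property
    def tokens_per_micro_batch(self) -> int:
        # Tokens seen in one forward/backward pass, before any
        # gradient accumulation.
        return self.batch_size * self.block_size

    @property
    def approx_block_params(self) -> int:
        # Rough estimate: about 4*d^2 for attention plus 8*d^2 for the MLP
        # per layer. Excludes embeddings, biases, and layer norms.
        return 12 * self.n_layer * self.n_embd ** 2


CONFIGS = [
    ToyConfig("nano_35m", 64, 256, n_layer=8, n_head=8, n_embd=768),
    ToyConfig("nano_76m", 64, 256, n_layer=11, n_head=16, n_embd=768),
    ToyConfig("nano-ito_35m", 64, 256, n_layer=11, n_head=16, n_embd=512),
]

for cfg in CONFIGS:
    print(
        f"{cfg.name}: head_dim={cfg.head_dim}, "
        f"tokens/micro-batch={cfg.tokens_per_micro_batch}, "
        f"approx block params={cfg.approx_block_params / 1e6:.1f}M"
    )
```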