NeelNanda/Attn-Only-2L512W-Shortformer-6B-big-lr
Transformers
main
Attn-Only-2L512W-Shortformer-6B-big-lr/scheduler_state_dict.pth
Commit History
Auto Commit
9aca331
NeelNanda committed on Oct 19, 2022