Model Card for Model ID
Model Details
Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- Developed by: xTimeCrystal
- Model type: Grouped Query Attention, Mixture of Experts
- Language(s) (NLP): English
- License: apache-2.0
- Finetuned from model: Qwen/Qwen3-0.6B
This model follows the PEFT tutorial, with Qwen3 swapped in as the base model because it is newer. However, it quickly overfit the data, so I used lr=1e-3 instead and trained for only 2 epochs.
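For readers who want to reproduce a similar setup, here is a minimal sketch of how the stated hyperparameters might be wired into a PEFT fine-tuning run. Only the base model, the learning rate, and the epoch count come from this card. The adapter type (LoRA), its rank, alpha, dropout, and target modules, the output directory, and the helper names `build_peft_model` and `build_training_args` are assumptions for illustration, not the settings actually used here.

```python
# Values stated in this model card.
BASE_MODEL = "Qwen/Qwen3-0.6B"
LEARNING_RATE = 1e-3
NUM_EPOCHS = 2


def build_peft_model():
    """Wrap the base model with a LoRA adapter.

    Every LoRA setting below is an assumed, illustrative value;
    the card does not say which adapter configuration was used.
    """
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                                   # assumed rank
        lora_alpha=32,                         # assumed scaling factor
        lora_dropout=0.1,                      # assumed dropout
        target_modules=["q_proj", "v_proj"],   # assumed attention projections
    )
    return get_peft_model(model, lora_config)


def build_training_args(output_dir="qwen3-peft-out"):
    """Training arguments using the learning rate and epochs from the card.

    The output directory name is hypothetical.
    """
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        learning_rate=LEARNING_RATE,
        num_train_epochs=NUM_EPOCHS,
    )
```

The model and arguments returned by these helpers could then be passed to a `transformers.Trainer` along with a tokenized dataset. Keeping the epoch count low is the simplest guard against the overfitting noted above; evaluating on a held-out split after each epoch would show whether 2 epochs is still too many.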