Model Card for xTimeCrystal/qwen3-0.6b-peft-method

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was automatically generated.

  • Developed by: xTimeCrystal
  • Model type: Dense decoder-only transformer with Grouped Query Attention (the Qwen3-0.6B base model is not a Mixture of Experts)
  • Language(s) (NLP): English
  • License: apache-2.0
  • Finetuned from model: Qwen/Qwen3-0.6B

This model follows the PEFT tutorial, using Qwen3 as the base model because it is newer. The model overfit the data quickly, so I used lr=1e-3 instead and trained for only 2 epochs.
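The setup described above can be sketched roughly as follows. Only the base model, the learning rate (1e-3), and the epoch count (2) come from this card. The choice of LoRA as the PEFT method, the rank, alpha, dropout, target modules, and the output directory name are all assumptions made for illustration, and the dataset is not shown because this card does not name it.

```python
# Values stated in this model card.
BASE_MODEL = "Qwen/Qwen3-0.6B"
LEARNING_RATE = 1e-3
NUM_EPOCHS = 2


def build_peft_model(base_model: str = BASE_MODEL):
    """Wrap the base model with a LoRA adapter (LoRA is an assumed PEFT method)."""
    # Imported lazily so the constants above can be inspected
    # without transformers or peft installed.
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(base_model)
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                                   # assumed adapter rank
        lora_alpha=32,                         # assumed scaling factor
        lora_dropout=0.1,                      # assumed dropout
        target_modules=["q_proj", "v_proj"],   # assumed target layers
    )
    return get_peft_model(model, lora_config)


def build_training_args(output_dir: str = "qwen3-0.6b-peft-sketch"):
    """Training arguments using the learning rate and epochs from this card."""
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,          # hypothetical directory name
        learning_rate=LEARNING_RATE,
        num_train_epochs=NUM_EPOCHS,
    )
```

Calling `build_peft_model().print_trainable_parameters()` should report how small the trainable adapter is relative to the full model. With few trainable parameters and a small dataset, a short schedule such as the 2 epochs used here is one common way to limit overfitting.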

