task vector optimization checkpoint ready for merging.
trained on MFANN for 12000 steps, however due to a slightly higher training loss, im going to merge this model with the last version and retrain, the goal was to use DARE-TIES to reduce the parameters used per vector, and this model will now be merged with the last model before DARE using TIES alone, and will be subsequently retrained.
- Downloads last month
- 6
Inference Providers
NEW
This model is not currently available via any of the supported third-party Inference Providers, and
the model is not deployed on the HF Inference API.