AlphaMonarch-daser


AlphaMonarch-daser combines two techniques, LaserQLoRA and DoRA. It is a DPO fine-tune of mlabonne/NeuralMonarch-7B on the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset. I fine-tuned only half of the projections, yet the model achieves better results than the previously released AlphaMonarch-dora. It was trained for 1080 steps. AlphaMonarch, AlphaMonarch-laser, AlphaMonarch-daser, and AlphaMonarch-dora compare as follows on the YALL and OpenLLM leaderboards:

πŸ† Evaluation results

On YALL leaderboard: AlphaMonarch-daser > AlphaMonarch-dora > AlphaMonarch > AlphaMonarch-laser


On OpenLLM bench: AlphaMonarch-laser > AlphaMonarch > AlphaMonarch-daser > AlphaMonarch-dora


Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1080
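
As a rough illustration of what these settings imply, the sketch below recomputes the effective batch size and traces the learning rate over the run. It assumes the usual linear-warmup followed by half-cosine decay shape for a `cosine` scheduler with warmup; the function names and the single-device assumption are illustrative and not taken from the training script.

```python
import math

# Values copied from the hyperparameter list above.
PEAK_LR = 5e-07
WARMUP_STEPS = 100
TRAINING_STEPS = 1080
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1  # assumption: one device, consistent with 1 * 8 = 8


def effective_batch_size() -> int:
    """Samples contributing to each optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step (hypothetical helper).

    Linear warmup from 0 to PEAK_LR, then half-cosine decay to 0.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 50, 100, 590, 1080):
        print(f"step {step:4d}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate peaks at 5e-07 at step 100, falls to half of that at step 590 (the midpoint of the decay phase), and reaches zero at step 1080.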

Framework versions

  • Transformers 4.38.0.dev0
  • Pytorch 2.1.2+cu118
  • Datasets 2.17.0
  • Tokenizers 0.15.0

Model size: 7.24B params (Safetensors)
Tensor type: FP16