No, this is promising
Now you're talking! Since you know how LoRAs can help merges of text generation models, I invite you to consider using this LoRA on a base model:
https://huggingface.co/sometimesanotion/LoRA-256-Base-Qwenvergence
Because the LoRA captures the difference between a base and a target model, you want the base to differ in ways that your other models, especially the consensus models in any TIES-based merge, can work off of. I know you can figure out the rest. You actually got me started thinking about custom base models in the first place.
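To make that concrete, here's a rough sketch of what it could look like in a mergekit YAML. Only the LoRA repo is the real one linked above; the base and donor model names are placeholders I'm assuming for illustration, not a tested recipe. If I remember right, mergekit's `+` syntax applies a LoRA to a model at load time:

```yaml
# Sketch only: Qwen/Qwen2.5-14B and the donor models are placeholder names.
# model+lora applies the LoRA on the fly, giving a custom base that differs
# from the donors in the directions the LoRA encodes.
merge_method: ties
base_model: Qwen/Qwen2.5-14B+sometimesanotion/LoRA-256-Base-Qwenvergence
models:
  - model: your-org/donor-a
    parameters:
      weight: 1.0
      density: 0.5
  - model: your-org/donor-b
    parameters:
      weight: 1.0
      density: 0.5
dtype: bfloat16
```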
My recipe here is complete. https://huggingface.co/sometimesanotion/Base-Chocolatine-2-14B-Instruct-v2.0b3
Thank you, I'll take a look!
Ok mate, I need help haha. It basically beats everything across the board but loses huge on IFEval... If I remember correctly, you found a way to minimize the IFEval loss, right? :)
Yes, here's the deal! Think of it like parameters frozen during fine-tuning. By injecting the same LoRA into the members of a merge, there's a tiny fraction of the model with no differences between them, which means that in most merge styles that part stays the same while everything else merges.
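As a sketch (the donor names are placeholders again, and the base is an assumption for illustration), that means applying the same LoRA to every member, not just the base, so the merge sees no disagreement in that slice of the weights:

```yaml
# Sketch only: the same LoRA is injected into the base and every member,
# so that low-rank slice is identical everywhere and comes through the
# merge unchanged, which is what guards the IFEval-sensitive behavior.
merge_method: ties
base_model: Qwen/Qwen2.5-14B+sometimesanotion/LoRA-256-Base-Qwenvergence
models:
  - model: your-org/strong-ifeval-donor+sometimesanotion/LoRA-256-Base-Qwenvergence
    parameters:
      weight: 1.0
      density: 0.5
  - model: your-org/reasoning-donor+sometimesanotion/LoRA-256-Base-Qwenvergence
    parameters:
      weight: 1.0
      density: 0.5
dtype: bfloat16
```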