No, this is promising

#1 by CultriX

@sometimesanotion No, this is promising :)

Now you're talking! Now that you've seen how LoRAs can help merges of text-generation models, I invite you to try using this LoRA on a base model:

https://huggingface.co/sometimesanotion/LoRA-256-Base-Qwenvergence

Because the LoRA captures the difference between a base model and a target model, you want the base to differ in ways that your other models, especially the consensus models in any TIES-style merge, can work off of. I know you can figure out the rest. You're actually the one who got me thinking about custom base models in the first place.
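A minimal sketch of what I mean, assuming mergekit's recipe format: the LoRA repo is the one linked above, while the Qwen base checkpoint and the two member models are placeholders for whatever you're already merging. The `+` suffix is mergekit's syntax for applying a LoRA to a model on the fly:

```yaml
# Sketch only: the member models and the Qwen base are assumed placeholders.
# The "+" suffix tells mergekit to apply the LoRA to the base before merging.
merge_method: ties
base_model: Qwen/Qwen2.5-14B+sometimesanotion/LoRA-256-Base-Qwenvergence
models:
  - model: your-org/consensus-member-a   # placeholder
    parameters:
      weight: 1.0
      density: 0.5
  - model: your-org/consensus-member-b   # placeholder
    parameters:
      weight: 1.0
      density: 0.5
dtype: bfloat16
```

With TIES, each task vector is computed against that LoRA-shifted base, so the directions the base was nudged in become exactly the ones the consensus models vote on.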

My recipe here is complete: https://huggingface.co/sometimesanotion/Base-Chocolatine-2-14B-Instruct-v2.0b3

Thank you, I'll take a look!

OK mate, I need help haha. It basically beats everything across the board but loses huge on IFEval... If I remember correctly, you found a way to minimize the IFEval loss, right? :)

(attached: benchmark results plot)

Yes, here's the deal! Think of it as being like frozen parameters in fine-tuning. By injecting the same LoRA into every member of a merge, you create a tiny fraction of the model with no differences between members, which means that in most merge styles that part stays the same while everything else merges.
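A hedged sketch of that trick, again with placeholder member names and an assumed Qwen base: the same LoRA (the `+` suffix) is applied to every member, so in that low-rank slice the members agree and the merge leaves it effectively untouched:

```yaml
# Sketch only: member model names and the base are assumed placeholders.
# Every member gets the same LoRA applied, so in that slice the members
# carry no disagreement and TIES-style merging leaves it unchanged.
merge_method: ties
base_model: Qwen/Qwen2.5-14B
models:
  - model: your-org/member-a+sometimesanotion/LoRA-256-Base-Qwenvergence
    parameters:
      weight: 1.0
      density: 0.5
  - model: your-org/member-b+sometimesanotion/LoRA-256-Base-Qwenvergence
    parameters:
      weight: 1.0
      density: 0.5
dtype: bfloat16
```

The idea, per the freezing analogy above, is that whatever instruction-following behavior lives in that shared slice doesn't get diluted by the merge, which is what protects IFEval.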
