A good surprise

#1
by Nexesenex - opened

I downloaded the Q3_K of QuartetAnemoi and benched it (a sketch of the typical invocations follows the table):

All rows: QuartetAnemoi-70B-t0.0001-Q3_K.gguf (GGUF, 70B, Mistral_Medium, 32768 ctx, quants by alchemonaut), benched 2024-02-13.

| Benchmark | Score | Tasks / Context |
|---|---|---|
| Hellaswag | 89.75 | 400 tasks |
| Hellaswag | 88.6 | 1000 tasks |
| Arc-Challenge | 58.86287625 | 299 tasks |
| Arc-Easy | 76.84210526 | 570 tasks |
| MMLU | 49.52076677 | 313 tasks |
| TruthfulQA | 42.22766218 | 817 tasks |
| Winogrande | 78.9266 | 1267 tasks |
| Wikitext perplexity | 3.8404 | ctx 512 |
| Wikitext perplexity | 3.3427 | ctx 4096 |
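For reference, scores like these are typically produced with llama.cpp's `perplexity` tool. A minimal sketch of the invocations, assuming the stock binaries and the usual evaluation data files (paths and file names here are placeholders, not the exact commands behind the numbers above):

```bash
# Wikitext perplexity at 512-token context (use -c 4096 for the second row)
./perplexity -m QuartetAnemoi-70B-t0.0001-Q3_K.gguf -f wiki.test.raw -c 512

# Hellaswag accuracy over the first 400 tasks
./perplexity -m QuartetAnemoi-70B-t0.0001-Q3_K.gguf -f hellaswag_val_full.txt \
  --hellaswag --hellaswag-tasks 400

# Winogrande accuracy over 1267 tasks
./perplexity -m QuartetAnemoi-70B-t0.0001-Q3_K.gguf -f winogrande-debiased-eval.csv \
  --winogrande --winogrande-tasks 1267
```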

This is serious work: Miqu is not degraded whatsoever by the merges, the rope theta of 1,000,000 still works, and the way you merge is, in principle, quite similar to what I would do if I had the know-how and the rig. I greatly support this approach! (I also love Gryphe's MergeMonster and its ability to trim GPT-isms and Llama-isms during a merge, but that's something else: https://github.com/Gryphe/MergeMonster/ )

I'm gonna use your model for a while in ST.

Besides, Wintergoddess and Aurora Nights are very good choices. I'd personally pick LZLV instead of Xwin, and why not Spicyboros 2.2, which is quite good as well, but that's being picky.

Also, https://huggingface.co/Undi95/Miqu-70B-Alpaca-DPO could be an interesting base choice, because it doesn't seem degraded compared to Miqu despite the DPO training.

Anyway, great job, congrats, and thanks! (and forgive my manners!)

Also, if you have the resources to spare for quantizations, I'd suggest making an iMatrix with llama.cpp (wikitext.train.raw, 512 ctx, 2000 chunks is ideal, as Artefact 2 does on his own quants), then making the IQ3_XXS, IQ2_XS, and IQ1_S quants, so everybody can enjoy this gem with well-made SOTA quants! (All quants benefit from an iMatrix, by the way, the K-quants as well.) A rough sketch of the commands is below.

Same goes for Boreangale!
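Here's a minimal sketch of that workflow with llama.cpp's `imatrix` and `quantize` tools, assuming you start from an F16 GGUF; the file names are placeholders:

```bash
# Build an importance matrix from wikitext.train.raw, 512-token context, 2000 chunks
./imatrix -m QuartetAnemoi-70B-t0.0001-F16.gguf -f wikitext.train.raw \
  -c 512 --chunks 2000 -o QuartetAnemoi-70B.imatrix

# Quantize with the imatrix (K-quants benefit from it as well)
for q in IQ3_XXS IQ2_XS IQ1_S; do
  ./quantize --imatrix QuartetAnemoi-70B.imatrix \
    QuartetAnemoi-70B-t0.0001-F16.gguf QuartetAnemoi-70B-t0.0001-$q.gguf $q
done
```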

I agree. This model is a veritable gem, in that it seems truly outstanding.

Just curious: how much is there left to enjoy at IQ1_S? ;-)

I don't know; with the v3 version of that quant, it's about on par with an average 7B.
But on Q3_K_M, what a delight...
