Perfect combo?
In terms of instruction following, this model is absolute perfection. I've never seen another model come anywhere close to this level of accuracy.
Moistral v4 was a step down (IMHO). Its replies are short and dry in comparison. Maybe it's more accurate, but v3 was already VERY good at accuracy compared to an average model and does not need improvements in that area at the expense of prose quality.
I've experimented with a bunch of models, and right now I'd say the best ones are Moistral v3 and neural-chat-7b-v3-16k:
https://huggingface.co/NurtureAI/neural-chat-7b-v3-16k
Neural-chat has much better prose than Moistral v3, but its accuracy (instruction following) is not even close. If the strengths of these two models could be combined, I think it would be head and shoulders above everything else.
I don't know whether this method is still in use or whether something better has been developed since, but maybe a gradient merge could make sense? I think Undi95 ran some experiments with the Unholy model and concluded that alignment and logic mostly lie in the lower layers of the model (about 8 to 12 of them). I've also heard that the upper layers are more responsible for "flowery" language and creativity. If this is correct, I'd try using the bottom layers from Moistral v3 and shifting towards neural-chat in the higher layers.
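To make the gradient idea concrete, here is a rough sketch of how the per-layer mix could be computed. It is only an illustration, not how any particular merge tool does it: the function names, the linear ramp, and the layer counts are all my own assumptions. It also assumes both checkpoints share the same architecture and number of layers, which may well not be true for these two models and would have to be checked first.

```python
def gradient_weights(num_layers: int, keep_bottom: int) -> list[float]:
    """Fraction of the 'prose' model used at each layer, bottom (index 0) to top.

    The lowest `keep_bottom` layers stay 100% 'accuracy' model (weight 0.0);
    above that the weight ramps linearly up to 1.0 at the top layer.
    Both names and the linear shape are hypothetical choices for this sketch.
    """
    if not 0 <= keep_bottom < num_layers:
        raise ValueError("keep_bottom must be in [0, num_layers)")
    ramp = num_layers - keep_bottom  # number of layers that take part in the ramp
    weights = []
    for layer in range(num_layers):
        if layer < keep_bottom:
            weights.append(0.0)
        else:
            weights.append((layer - keep_bottom + 1) / ramp)
    return weights


def blend_layer(accuracy_params: list[float], prose_params: list[float], t: float) -> list[float]:
    """Linear interpolation of one layer's parameters: (1 - t) * a + t * b."""
    if len(accuracy_params) != len(prose_params):
        raise ValueError("layers must have the same number of parameters")
    return [(1.0 - t) * a + t * b for a, b in zip(accuracy_params, prose_params)]


# Example: a hypothetical 32-layer model that keeps the bottom 10 layers untouched.
weights = gradient_weights(32, 10)
```

With something like this, the bottom 8 to 12 layers would stay pure Moistral v3 and the top layer would end up fully neural-chat. Where exactly the ramp should start is a guess and would need testing.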
Have you compared it to the original Fimbulvetr v2?
I will double-check, but I remember it being much better than Fimbulvetr. It could be that I tried Fimbulvetr v1 instead of v2, but I don't think that's the case. Will post here once I test it again.
I ran Moistral v3 and Fimbulvetr v2 side by side, and so far did not "catch" Fimbulvetr being less accurate. Maybe I was testing an older version, after all...
Comparing Moistral v3 and Fimbulvetr v2, my impressions were:
[-] Fimbulvetr is more robotic and rigid in its delivery
[?] Fimbulvetr is (sometimes?) more verbose, could be a fluke
[+] Fimbulvetr is less likely to write nonsensical babble