How?
Sorry, can you share how you did it?
Not sure how it was done here, but I was able to replicate something similar by using the Mistral conversion script provided in the Hugging Face Transformers lib and letting it skip keys it couldn't convert.
@mrfakename I see, I went a similar route but stopped at the errors; I assume the hope is that those unknown keys are the vision ones. Were you able to compare against the API model at temperature 0?
I wasn't able to compare with the API model, but I tested the model a bit and it seemed coherent enough. And I can confirm that the skipped layers are vision layers - the model seems to perform fine without them.
Posted a modified version of the conversion script here: https://gist.github.com/fakerybakery/d7e88b46846e929168f6a4280a0f2baf - it prints out the layers it skipped
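For anyone who just wants the shape of the approach without digging through the gist, here's a minimal sketch of the skip-and-report idea. The key prefixes and file names below are assumptions for illustration, not necessarily what the gist or the official script uses:

```python
# Minimal sketch: walk the original state dict, keep the keys a text-only
# mapping understands, and print whatever gets dropped so you can verify
# that everything skipped is a vision weight.
from safetensors.torch import load_file, save_file

# Hypothetical paths and prefixes; adjust for the real checkpoint layout.
state_dict = load_file("consolidated.safetensors")
VISION_PREFIXES = ("vision_encoder.", "vision_language_adapter.")  # assumed names

kept, skipped = {}, []
for key, tensor in state_dict.items():
    if key.startswith(VISION_PREFIXES):
        skipped.append(key)
    else:
        kept[key] = tensor

print(f"Skipped {len(skipped)} keys:")
for key in skipped:
    print("  ", key)

# Write out the text-only weights for the rest of the conversion pipeline.
save_file(kept, "text_only.safetensors")
```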
@mrfakename I think @Gryphe did something similar; the SHA-256 hashes of the safetensors seem to match the ones resulting from your script.
Thank you all, I was able to properly convert to HF format and then to MLX format.
I am running a text-only MLX 4-bit quant of the model locally without any issues.
The only thing was that I had to manually set the embedding size to 128k; everything else was smooth.
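For reference, a rough sketch of that HF-to-MLX step, assuming mlx-lm's Python `convert()` API is available; the paths are placeholders and 131072 is just "128k" written out, so double-check against your checkpoint's actual tokenizer:

```python
# Sketch of the HF -> quantized MLX conversion, with the manual
# embedding-size fix applied to config.json first.
import json
from pathlib import Path

from mlx_lm import convert

hf_path = "my-text-only-hf-model"  # hypothetical local path

# Patch the vocab/embedding size before converting (the "128k" fix above).
config_file = Path(hf_path) / "config.json"
config = json.loads(config_file.read_text())
config["vocab_size"] = 131072  # 128k
config_file.write_text(json.dumps(config, indent=2))

# 4-bit quantized MLX export.
convert(hf_path, mlx_path="my-model-mlx-4bit", quantize=True, q_bits=4)
```

After converting, `mlx_lm.load` / `mlx_lm.generate` on the output directory is a quick way to sanity-check coherence.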
Hey y'all, seems you figured it out when I was happily asleep, lol. But yeah, that's precisely what I did!
I don't need a vision model for my finetunes (both personal and for AI Dungeon), so for me it's actually an improvement since it results in a smaller model.