How?

#1
by rdsm - opened

Sorry, can you share how you did it?

Anthracite Core org

Not sure how it was done here, but I was able to replicate something similar by using the Mistral conversion script provided in the Hugging Face Transformers lib and letting it skip keys it couldn't convert.
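For reference, the general shape of that approach looks something like the sketch below. This is not the actual Transformers conversion script; `map_key()` is a hypothetical stand-in for its key-renaming logic. The point is just that unmappable keys get collected and dropped instead of aborting the conversion.

```python
# Minimal sketch of the "skip what you can't map" approach, assuming
# map_key() raises KeyError for keys with no text-model equivalent
# (e.g. vision tower weights).
from safetensors.torch import load_file, save_file

def convert_skipping_unknown(in_path: str, out_path: str, map_key) -> None:
    state_dict = load_file(in_path)
    converted, skipped = {}, []
    for key, tensor in state_dict.items():
        try:
            converted[map_key(key)] = tensor
        except KeyError:
            # Drop keys the mapping doesn't know about instead of failing.
            skipped.append(key)
    save_file(converted, out_path)
    print(f"Skipped {len(skipped)} keys:", *skipped, sep="\n  ")
```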

@mrfakename I see, I went down a similar route and stopped at the errors. I assume the hope is that those unknown keys are the vision ones. Were you able to compare against the API one at temperature 0?

Wasn't able to compare with the API model, but I tested the model a bit and it seemed coherent enough. And I can confirm that the skipped layers are vision layers; the model seems to perform fine w/o them.
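For anyone who wants to run the same kind of sanity check, here is a minimal sketch using the Transformers generate API with greedy decoding (the closest analogue to temperature 0); the model path is a placeholder.

```python
# Quick coherence check on the converted text-only checkpoint.
# "path/to/converted-model" is a placeholder for wherever the output landed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/converted-model"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tok("The capital of France is", return_tensors="pt")
# do_sample=False gives greedy decoding, i.e. the temperature-0 behaviour
# you would compare against the API with.
out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```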

Posted a modified version of the conversion script here: https://gist.github.com/fakerybakery/d7e88b46846e929168f6a4280a0f2baf. It prints out the layers it skipped.

@mrfakename I think @Gryphe did something similar; the SHA-256 hashes of the safetensors seem to match the ones produced by your script.
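If anyone else wants to verify, hashing the shards of two converted checkpoints and comparing is straightforward; the directory names below are placeholders.

```python
# Sketch of the checksum comparison: hash each *.safetensors shard in two
# converted checkpoints and compare pairwise ("convA"/"convB" are placeholders).
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

for a, b in zip(sorted(Path("convA").glob("*.safetensors")),
                sorted(Path("convB").glob("*.safetensors"))):
    print(a.name, sha256_of(a) == sha256_of(b))
```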

Thank you all, I was able to properly convert to HF format and then to MLX format.

I am running a text-only MLX 4 bit quant of the model locally without any issues.
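For reference, a 4-bit MLX quant can be produced with mlx-lm's `convert()` API; this is a sketch with placeholder paths, not necessarily the exact steps used here.

```python
# One way to produce a 4-bit MLX quant, assuming the mlx-lm package is
# installed; paths are placeholders.
from mlx_lm import convert

convert(
    "path/to/converted-hf-model",  # the text-only HF checkpoint from above
    mlx_path="model-mlx-4bit",
    quantize=True,
    q_bits=4,         # 4-bit weights
    q_group_size=64,  # mlx-lm's default quantization group size
)
```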

The only thing was that I had to manually set the embedding size to 128k; everything else was smooth.
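The embedding-size fix presumably amounts to patching the model config before converting; here is a sketch, assuming the field in question is `vocab_size` in `config.json` (131072 = 128k) and that the path is a placeholder.

```python
# Sketch of the manual fix described above: patch the embedding/vocab size
# in config.json before converting to MLX.
import json
from pathlib import Path

cfg_path = Path("path/to/converted-hf-model/config.json")
cfg = json.loads(cfg_path.read_text())
cfg["vocab_size"] = 131072  # 128k entries in the embedding table (assumption)
cfg_path.write_text(json.dumps(cfg, indent=2))
```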

Anthracite Core org

Hey y'all, seems you figured it out when I was happily asleep, lol. But yeah, that's precisely what I did!

I don't need a vision model for my finetunes (both personal and for AI Dungeon), so for me it's actually an improvement since it results in a smaller model.
