DeepSeek-R1-UD-IQ1_S merge

#3
by heroOfOrion - opened

https://unsloth.ai/blog/deepseekr1-dynamic
https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S

Will it be possible to merge this 1.58-bit dynamic quant model with SkyT1-Flash?

GGUF is a file format, not a model.

Yes, this GGUF version of DeepSeek-R1 still differs from SkyT1-Flash in architecture and scale, so we cannot merge the two models directly. However, we have new techniques for heterogeneous model fusion across different architectures and scales. We are working on a new version of FuseO1 that uses heterogeneous model fusion and reinforcement learning. Stay tuned!
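To illustrate why a direct merge fails in this situation, here is a minimal sketch. It assumes the merge is a plain parameter-by-parameter weighted average, which needs both checkpoints to have the same parameter names and shapes. The function name `merge_conflicts` and the toy shapes are hypothetical and are not taken from FuseO1 or from either model's real configuration.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def merge_conflicts(a: Dict[str, Shape], b: Dict[str, Shape]) -> List[str]:
    """List the reasons two checkpoints cannot be averaged parameter by parameter."""
    problems = []
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            # A tensor that exists in only one model has nothing to be averaged with.
            problems.append(f"{name}: present in only one model")
        elif a[name] != b[name]:
            # Same name but different size: element-wise averaging is undefined.
            problems.append(f"{name}: shape {a[name]} vs {b[name]}")
    return problems


# Toy, made-up parameter shapes for illustration only (assumption).
dense_small = {
    "layers.0.attn.q_proj": (64, 64),
    "layers.0.mlp.up_proj": (256, 64),
}
dense_same = dict(dense_small)  # same architecture and scale
moe_large = {
    "layers.0.attn.q_proj": (128, 128),
    "layers.0.mlp.experts.0.up_proj": (512, 128),
}

if __name__ == "__main__":
    print(merge_conflicts(dense_small, dense_same))
    for reason in merge_conflicts(dense_small, moe_large):
        print(reason)
```

An empty list means a simple averaging merge is at least well defined. Any entry in the list is a mismatch that a direct merge cannot resolve, which is the kind of gap heterogeneous fusion methods are meant to bridge.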

Very interesting, looking forward! Thank you