I think what you're doing here is really helpful

#2
by sometimesanotion - opened

We're seeing more merges from different sources of CoT here, and my experience is that it can mostly work with the right layer targeting, but not flawlessly without tweaking the tokens. I like what you've done with the config!

Yeah, i'm also kinda learning something from this experiment.
Maybe we need remap the token first before merging with another model. As myself doesn't know how the remapping work, but it seems like the model still confused(?).
https://huggingface.co/djuna/TEST3-Q2.5-Lenned-14B/discussions/1

Your need to confirm your account before you can post a new comment.

Sign up or log in to comment