How to transform the existing 1.8B into Qwen1.5-MoE-A2.7B?

#1
by wnma3mz - opened

Great job.

The article says that the current MoE model was obtained by transforming the Qwen-1.8B model. However, the intermediate_size of the 1.8B model is 5504, while the intermediate_size of the MoE model is 1408, and 5504 is not evenly divisible by 1408.
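To make the mismatch concrete, here is a quick arithmetic check. It is only a sketch using the two sizes quoted above; the variable names are my own and are not taken from any model config.

```python
# Sizes as quoted in this post (variable names are illustrative only).
dense_intermediate_size = 5504  # intermediate_size reported for the 1.8B model
moe_intermediate_size = 1408    # intermediate_size reported for the MoE model

# Split the dense size into whole chunks of the MoE size plus a remainder.
quotient, remainder = divmod(dense_intermediate_size, moe_intermediate_size)

print(f"{dense_intermediate_size} = {quotient} * {moe_intermediate_size} + {remainder}")
print("evenly divisible:", remainder == 0)
```

Three chunks of 1408 cover only 4224 units and leave 1280 over, while four chunks would need 5632, which is 128 more than 5504. So the dense FFN cannot be cut into equal 1408-wide slices without some extra step.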

So I'm confused about how this was done. Please point out any mistakes in my understanding.

Qwen org

We'll release our technical report later.

Thank you very much for your reply. I look forward to your report.

wnma3mz changed discussion status to closed
