How to transform the existing 1.8B into Qwen1.5-MoE-A2.7B?
#1, opened by wnma3mz
Great job.
The article says the current MoE model was derived from the Qwen-1.8B model. However, the intermediate_size of the 1.8B model is 5504, while the intermediate_size here is 1408, and 5504 is not divisible by 1408.
So I'm confused about how the conversion was done. Please point out any mistakes on my side.
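To make the mismatch concrete, here is a minimal arithmetic sketch using only the two sizes quoted above. The variable names are illustrative, and it says nothing about how the conversion was actually performed.

```python
# Sizes as quoted in the question above (variable names are illustrative).
dense_intermediate_size = 5504  # Qwen-1.8B FFN width
moe_intermediate_size = 1408    # per-expert FFN width in the MoE model

# divmod returns (quotient, remainder); a non-zero remainder means the dense
# FFN cannot be cut into equal slices of the expert width.
quotient, remainder = divmod(dense_intermediate_size, moe_intermediate_size)
print(quotient, remainder)  # 3 1280

# Shortfall relative to the next whole multiple of the expert width.
shortfall = (quotient + 1) * moe_intermediate_size - dense_intermediate_size
print(shortfall)  # 128
```

So three expert-sized slices leave 1280 units over, and four would need 128 more units than the dense layer has.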
We'll release our technical report later.
Thank you very much for your reply. I look forward to your report.
wnma3mz changed discussion status to closed