Try MoE with models of different sizes!
For example:
Expert 1: 0.6B model
Expert 2: 1.7B model
Expert 3: 32B model
...
Dual-stage selection:
Stage 1 - Expert 1:
Stage 2: (0.6B-model-1, 0.6B-model-2, 0.6B-model-3, 0.6B-model-4)
Stage 1 - Expert 2:
Stage 2: (1.7B-model-1, 1.7B-model-2, 1.7B-model-3, 1.7B-model-4)
Stage 1 - Expert 3:
Stage 2: (32B-model-1, 32B-model-2, 32B-model-3, 32B-model-4)
and so on.
What's the difference if all the models are in one stage? :)
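For illustration, here is a minimal PyTorch sketch of the two-stage selection above, next to a flat single-stage gate for comparison. The hidden size, class count, and instances per class are placeholder values, not tied to any real models:

```python
# Hypothetical sketch: two-stage routing vs. a flat single-stage router.
# Stage 1 picks a size class (e.g. 0.6B / 1.7B / 32B), Stage 2 picks one
# instance inside that class. All sizes and counts below are made up.
import torch
import torch.nn as nn


class TwoStageRouter(nn.Module):
    def __init__(self, hidden_size: int, num_classes: int, instances_per_class: int):
        super().__init__()
        # Stage 1: choose a size class.
        self.class_gate = nn.Linear(hidden_size, num_classes)
        # Stage 2: one gate per class, choosing an instance within that class.
        self.instance_gates = nn.ModuleList(
            [nn.Linear(hidden_size, instances_per_class) for _ in range(num_classes)]
        )

    def forward(self, h: torch.Tensor):
        # h: [batch, hidden_size] routing features (e.g. pooled hidden states).
        class_probs = torch.softmax(self.class_gate(h), dim=-1)   # [batch, C]
        chosen_class = class_probs.argmax(dim=-1)                 # [batch]
        instance_idx = torch.empty_like(chosen_class)
        for c, gate in enumerate(self.instance_gates):
            mask = chosen_class == c
            if mask.any():
                inst_probs = torch.softmax(gate(h[mask]), dim=-1)  # [n_c, I]
                instance_idx[mask] = inst_probs.argmax(dim=-1)
        return chosen_class, instance_idx


class FlatRouter(nn.Module):
    """Single-stage alternative: one gate over all class*instance experts."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts)

    def forward(self, h: torch.Tensor):
        return torch.softmax(self.gate(h), dim=-1).argmax(dim=-1)  # [batch]


if __name__ == "__main__":
    h = torch.randn(4, 256)                 # dummy routing features
    two_stage = TwoStageRouter(256, num_classes=3, instances_per_class=4)
    flat = FlatRouter(256, num_experts=3 * 4)
    print(two_stage(h))                     # (size class, instance) per input
    print(flat(h))                          # single expert index per input
```

In this sketch the flat router collapses both decisions into a single softmax; the two-stage version mainly makes the "which size class" choice explicit, which may make it easier to reason about cost per request (an assumption, not a measured result).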
Yes, it is possible to combine models of different sizes, such as a 0.6B (600M-parameter) model and a 1.7B model, using a Mixture of Experts (MoE) architecture. MoE works by assigning tasks to multiple "expert" models, with a gating network dynamically deciding which experts handle each input. Here's a concise explanation of how this can be done.
Combining a 0.6B and 1.7B model via MoE is feasible and can balance efficiency and performance. The key is designing an effective gating network and addressing model compatibility.
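For reference, a bare-bones sketch of the gating step when the experts already produce outputs in the same space. The expert stubs and dimensions below are placeholders, not the actual 0.6B/1.7B models:

```python
# Hypothetical sketch: a gating network mixing the outputs of two experts
# that share an output dimension. The "experts" here are small stand-ins
# for a smaller and a larger model, not the real ones.
import torch
import torch.nn as nn


class MoEOfTwo(nn.Module):
    def __init__(self, hidden_size: int, expert_small: nn.Module, expert_large: nn.Module):
        super().__init__()
        self.experts = nn.ModuleList([expert_small, expert_large])
        self.gate = nn.Linear(hidden_size, len(self.experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, hidden_size]
        weights = torch.softmax(self.gate(x), dim=-1)                # [batch, 2]
        outputs = torch.stack([e(x) for e in self.experts], dim=1)   # [batch, 2, hidden_size]
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)          # weighted mixture


# Stand-in experts with matching input/output dimensions.
small = nn.Sequential(nn.Linear(256, 256), nn.GELU(), nn.Linear(256, 256))
large = nn.Sequential(nn.Linear(256, 1024), nn.GELU(), nn.Linear(1024, 256))
moe = MoEOfTwo(256, small, large)
print(moe(torch.randn(4, 256)).shape)  # torch.Size([4, 256])
```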
Direct integration does not work out of the box, though; the combined system still requires some subsequent training (of the gating network at a minimum).
If the 0.6B and 1.7B models come from the same family, they typically share the same vocabulary and input/output dimensions (e.g., hidden size), so they can be plugged into the MoE architecture without extra adapter layers.
If the model architectures differ (e.g., different hidden sizes or layer counts), adapter layers must be added at each expert's input to project into a unified feature space, or the dimensions must be aligned at the output.
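A rough sketch of the adapter idea, with hidden sizes assumed purely for illustration (2048 for the expert, 1024 for the shared space):

```python
# Hypothetical sketch: adapters that project a mismatched expert into a
# unified feature space. The hidden sizes (2048 for the expert, 1024 for
# the shared space) are assumed for illustration only.
import torch
import torch.nn as nn


class AdaptedExpert(nn.Module):
    """Wraps an expert whose hidden size differs from the shared MoE space."""

    def __init__(self, expert: nn.Module, expert_dim: int, unified_dim: int):
        super().__init__()
        self.proj_in = nn.Linear(unified_dim, expert_dim)    # unified -> expert space
        self.expert = expert
        self.proj_out = nn.Linear(expert_dim, unified_dim)   # expert -> unified space

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj_out(self.expert(self.proj_in(x)))


# Stand-in for a larger expert with a 2048-wide hidden state.
big_expert = nn.Sequential(nn.Linear(2048, 2048), nn.GELU(), nn.Linear(2048, 2048))
adapted = AdaptedExpert(big_expert, expert_dim=2048, unified_dim=1024)
print(adapted(torch.randn(4, 1024)).shape)  # torch.Size([4, 1024])
```

In a setup like this, the adapters and the gate are typically the pieces that need the subsequent training mentioned above, even if the expert models themselves stay frozen.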
Thank you for the idea. We will see whether we can achieve the goal without fine-tuning.