Hi, how did it do it?
#1 by Sariel00 - opened
Can an MoE model's parameter count be reduced by cutting the number of experts? If so, how was that done here?
This looks like a model well suited to edge and mobile devices. Is there a W8A8 (8-bit weights, 8-bit activations) version?