Future use of Gemma-style knowledge distillation
#4, opened by gt332a
Great model! Maybe the next version could use knowledge distillation, as Google did to make the 9B Gemma much more powerful than its size would suggest. I think this could benefit even small models like this one.
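For anyone unfamiliar with the idea, here is a rough sketch of what a distillation objective can look like. This is not Google's actual recipe, just the textbook form: the student is trained to match the teacher's full next-token probability distribution instead of only the single correct token. The function names (`log_softmax`, `distillation_loss`), the `temperature` parameter, and the toy logits are all made up for illustration.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) for a single token position.

    Hypothetical helper: a real training loop would average this over
    all positions in a batch and backpropagate through the student only.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (fixed target)
    log_q = log_softmax(student_logits, temperature)  # student (being trained)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Toy example with a 4-token vocabulary (values are arbitrary).
teacher = [2.0, 0.5, -1.0, 0.1]
student = [1.0, 1.0, 0.0, 0.0]
print(distillation_loss(student, teacher))  # positive: the distributions differ
print(distillation_loss(teacher, teacher))  # zero: the student matches the teacher
```

The loss is zero only when the student reproduces the teacher's distribution exactly, so the student gets a richer training signal per token than a one-hot label would give. That is the usual argument for why distillation helps small models in particular.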