Synthetic data generation
Hi, first of all, thanks for releasing such a good model. Could you please clarify the licensing of the model's outputs? The license says that all software and derivatives must carry the same license and the "kimi-2" notice. Does the same apply to models trained on the outputs of this model?
Yes, we have no additional restrictions on the output generated by the model. Just follow the model license.
Sorry for repeating the question, but to be clear: can we train models on the outputs of Kimi-K2 base/instruct and share them under an Apache-2.0 or MIT license? And can we share the generated synthetic dataset under a permissive open-source license such as Apache-2.0 or MIT?
Our modification to the MIT license applies to the model and derivative works. Text data generated by the model is NOT considered a derivative work.
In other words, you may use the data generated by Kimi-K2 base/instruct to build and distribute datasets, or to train other models.