Potential Inconsistencies Between Model and Dataset Licenses

#4
by yueyangchen - opened

Hi, while reviewing the licenses for this model and the datasets it depends on, I noticed a potential inconsistency that could cause confusion or legal risk in some situations.

Your model uses the dataset HuggingFaceH4/no_robots, which is licensed under cc-by-nc-4.0. However, your model is licensed under apache-2.0, which is less restrictive than cc-by-nc-4.0 on terms such as commercial use. This mismatch may affect license compatibility across your repository, confusing downstream users and exposing them to possible legal and financial risks.

If possible, you can fix this in one of the following ways:
1. It could be helpful to select a more suitable license for your repository.
2. You may want to gently remind users that, in some cases, they should check both the model license and the base model license, especially when redistributing or modifying the model.

Protect AI org

Hey @yueyangchen,
Thank you for bringing this to our attention.

You're absolutely right. We overlooked the dataset licensing during the initial release. We are not actively looking to improve the model at the moment, but we certainly want to avoid causing confusion or downstream legal risk for users.

Following your suggestion, we'll add a clear clause to the model card highlighting that users should check both the model and dataset licenses. This should help others understand the limitations and stay compliant.

Thanks again for the helpful feedback!
