Improve model card: Add metadata, paper abstract, links & transformers usage

#1 opened by nielsr (HF Staff)

This PR significantly improves the model card for HRWKV7-Reka-Flash3.1-Preview by:

  • Adding essential metadata: pipeline_tag: text-generation, library_name: transformers, and comprehensive tags (rwkv, linear-attention, reka, distillation, knowledge-distillation, hybrid-architecture, language-model). This improves discoverability and enables the "how to use" widget on the Hub; a sketch of the resulting metadata block is shown after this list.
  • Adding the paper abstract for better context on the model's development via the RADLADS protocol.
  • Updating the paper link to the official Hugging Face Papers page: RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale.
  • Adding direct links to the main RADLADS project GitHub repository (https://github.com/recursal/RADLADS) and clarifying the link to this model's specific training code (https://github.com/OpenMOSE/RWKVInside).
  • Replacing the non-standard curl usage snippet with a clear Python example built on the Hugging Face transformers library for straightforward model loading and generation; see the usage sketch after this list.
  • Adding the paper's BibTeX citation for proper attribution.
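
For context, here is a minimal sketch of the front matter these metadata additions would produce (tag list taken from the first bullet; the exact block in the merged card may differ):

```yaml
pipeline_tag: text-generation
library_name: transformers
tags:
  - rwkv
  - linear-attention
  - reka
  - distillation
  - knowledge-distillation
  - hybrid-architecture
  - language-model
```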

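And a hedged sketch of the kind of transformers usage snippet the PR adds (the repo id and the need for trust_remote_code are assumptions; adjust to the actual Hub path and checkpoint requirements):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for this model; replace with the actual Hub path.
model_id = "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview"

# trust_remote_code=True is an assumption: the hybrid RWKV architecture
# may ship custom modeling code alongside the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Explain linear attention in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
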
Please review and merge this PR if everything looks good.

OpenMOSE changed pull request status to merged
