GitHub Tag Generator with T5 + PEFT (LoRA)
This model is a fine-tuned version of t5-small using Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters. It is trained to generate relevant, deduplicated tags from natural-language descriptions of GitHub repositories. The goal is to assist automatic tagging for improved search, discoverability, and categorization of repositories.
Model Details
Model Description
This model is part of a lightweight, end-to-end pipeline for automatic tag generation. It takes a short GitHub repo summary as input and returns a comma-separated list of tags. The model was fine-tuned using the PEFT library with LoRA to optimize only a small subset of parameters for efficiency and portability.
- Model type: Seq2Seq text generation (T5)
- Language(s): English
- License: Apache 2.0
- Fine-tuned from model: t5-small on Hugging Face
- LoRA Adapter Configuration: r=16, alpha=32, dropout=0.05, target modules: ["q", "v"]
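For reference, the adapter setup above can be expressed with the peft library roughly as follows. This is a minimal sketch: the adapter hyperparameters come from the card, but the task type and base-model loading are assumptions rather than the actual training code.

from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# Adapter hyperparameters taken from the card above; task_type is an assumption
# consistent with seq2seq fine-tuning of T5.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # query/value projections in T5 attention blocks
)

base_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small LoRA adapter set is trainable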
Model Sources
- Dataset: zamal/github-meta-data
- Model Repo: zamal/github-tag-generatorr
- Training Notebook: GitHub Tag Generator Notebook
Uses
Direct Use
The model can be used directly via the 🤗 Transformers pipeline or with generate() for:
- Auto-tagging GitHub repos based on descriptions
- Enhancing search filters in dev tools
- Bootstrapping tags for new AI project listings
Example:
from transformers import pipeline

# Loading this checkpoint requires the peft package, since it stores LoRA adapter weights.
tag_generator = pipeline("text2text-generation", model="zamal/github-tag-generatorr")

text = "Looking for repos that show real-world AI use cases with open-source tools"
tags = tag_generator(text)[0]["generated_text"]
print(tags)  # e.g. ai, ml, open-source, examples
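For the generate() route, here is a minimal sketch that loads the adapter explicitly with peft. It assumes the hub repository stores a LoRA adapter on top of t5-small; the generation settings are illustrative, not taken from the training setup.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

# Attach the LoRA adapter from the hub to the t5-small base model.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
base_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model = PeftModel.from_pretrained(base_model, "zamal/github-tag-generatorr")

text = "Looking for repos that show real-world AI use cases with open-source tools"
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# The model emits comma-separated tags; a light split-and-dedup pass keeps them clean.
tags = list(dict.fromkeys(t.strip() for t in decoded.split(",")))
print(tags)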