---
license: mit
datasets:
- pacovaldez/stackoverflow-questions
language:
- en
base_model:
- google/flan-t5-base
tokenizer:
- google/flan-t5-base
library_name: transformers
tags:
- Stackoverflow
- flan-t5
- peft
- lora
- seq2seq
---

# 🤖 FLAN-T5 Base Fine-Tuned on Stack Overflow Questions (LoRA)

This is a fine-tuned version of [`google/flan-t5-base`](https://huggingface.co/google/flan-t5-base) trained on a curated dataset of Stack Overflow programming questions. It was trained with [LoRA](https://arxiv.org/abs/2106.09685) (Low-Rank Adaptation) for parameter-efficient fine-tuning, making the adapter compact and efficient while remaining effective at developer-style Q&A tasks.

---

## 🧠 Model Objective

The model is trained to:

- Rewrite or improve unclear programming questions
- Generate relevant clarifying questions or answers
- Summarize long developer queries
- Serve as a code-aware Q&A assistant

---

## 📚 Training Data

- **Source**: Stack Overflow public questions dataset (cleaned)
- **Format**: Instruction-like examples, Q&A pairs, summarization prompts
- **Cleaning**: HTML stripping, markdown-to-text conversion, code blocks preserved
- **Size**: ~15k examples

---

## 🏗️ Training Details

- **Base Model**: `google/flan-t5-base`
- **Adapter Format**: LoRA via [`peft`](https://github.com/huggingface/peft)
- **Files**:
  - `adapter_model.safetensors`
  - `adapter_config.json`
- **Hyperparameters**:
  - `r`: 8
  - `lora_alpha`: 16
  - `lora_dropout`: 0.1
  - `bias`: "none"
  - `task_type`: "SEQ_2_SEQ_LM"
- **Inference Mode**: Enabled

A sketch recreating this adapter configuration with `peft` appears at the end of this card.

---

## 💡 How to Use

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the tokenizer and the frozen base model
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Attach the LoRA adapter (local folder or Hub repo ID)
model = PeftModel.from_pretrained(base_model, "your-model-folder")
model.eval()

# Inference
prompt = "Rewrite this question more clearly: why is my javascript function undefined?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # cap output length
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 🧪 Intended Use

This model is best suited for:

- Code-aware chatbot assistants
- Prompt engineering for developer tools
- Developer-focused summarization / rephrasing
- Auto-moderation / clarification of tech questions

---

## ⚠️ Limitations

- Not trained for code generation or long-form answers
- May hallucinate incorrect or generic responses
- Fine-tuned only on Stack Overflow data, so it is domain-specific

---

## ✨ Acknowledgements

- Hugging Face Transformers
- LoRA (PEFT)
- Stack Overflow for open data
- [FLAN-T5: Scaling Instruction-Finetuned Language Models](https://arxiv.org/abs/2210.11416)

---

🛠️ Created with love by Kunj | Model suggestion & guidance by ChatGPT
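
---

## 🧩 Appendix: Recreating the Adapter Config (Sketch)

A minimal sketch of a `peft.LoraConfig` matching the hyperparameters listed under Training Details. The `target_modules` value is an assumption (the `q`/`v` attention projections are the common choice for T5-family models) and is not recorded in this card; the authoritative values live in `adapter_config.json`.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# Mirrors the hyperparameters listed under Training Details.
# NOTE: target_modules is assumed, not taken from this card;
# check adapter_config.json for the actual value.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
    target_modules=["q", "v"],  # assumed: T5 query/value projections
)

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable
```

For deployment without a runtime `peft` dependency, the adapter weights can be folded back into the base model with `merge_and_unload()` after loading via `PeftModel.from_pretrained` (as in the usage snippet above).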