# Qwen3-Flask LoRA Adapter (Q&A Fine-Tuning)
This repository contains the LoRA adapter-only version of Qwen/Qwen3-0.6B-Base, fine-tuned on a high-quality dataset derived from Flask's official documentation, source code, and changelogs.

Use this if you want a lightweight, plug-and-play adapter on top of the base Qwen3-0.6B model.
## Objective

- Help developers understand Flask's internals more intuitively
- Convert verbose docstrings and changelogs into question-answer pairs
- Enable real-world integration using Alpaca-style instruction prompts
## Use Cases
- Answer questions about Flask APIs (`before_request`, `url_defaults`, etc.)
- Provide upgrade/migration insights from older Flask versions
- Summarize internal logic in conversational, Q&A format
## Adapter Details

| Setting | Value |
|---|---|
| PEFT method | LoRA (Low-Rank Adaptation) |
| LoRA rank | 16 |
| Alpha | 32 |
| Target modules | query_key_value |
| Quantization | 4-bit NF4 (bitsandbytes) |
| Base model | Qwen/Qwen3-0.6B-Base |
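For reference, a PEFT/bitsandbytes configuration roughly matching the table above would look like the sketch below. This is illustrative only, not the exact training script: the rank, alpha, target module, and NF4 settings come from the table, while the compute dtype, dropout, and task type are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization, as listed under "Quantization"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype, not stated in the table
)

# LoRA settings matching the table (rank 16, alpha 32, target module query_key_value)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_key_value"],
    lora_dropout=0.05,        # assumed; not stated in the table
    task_type="CAUSAL_LM",
)
```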
## Dataset Overview
- Total chunks extracted from Flask: 804
- Valid, logic-rich chunks selected: 345
- Final Gemini-generated Q&A pairs: 1,425
Example:
```json
{
  "instruction": "How does `url_defaults` work in Flask?",
  "input": "When used on an app, this is called for every request...",
  "output": "`url_defaults` is triggered for every request when registered on an app. When registered on a blueprint, it affects only requests handled by that blueprint..."
}
```
## Prompt Format (Alpaca Style)

```
### Instruction:
What is the difference between `url_defaults` on app vs blueprint?
### Input:
Docstring excerpt from Flask...
### Response:
<Model-generated explanation>
```
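If you build prompts programmatically, a small helper like the hypothetical `build_prompt` below (not part of this repo) keeps the format consistent:

```python
def build_prompt(instruction: str, context: str = "None") -> str:
    """Assemble an Alpaca-style prompt matching the format above."""
    return (
        "### Instruction:\n"
        f"{instruction}\n"
        "### Input:\n"
        f"{context}\n"
        "### Response:"
    )

# Example
prompt = build_prompt("What does `before_request` do in Flask?")
```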
## How to Use (with PEFT)

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B-Base", trust_remote_code=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base", trust_remote_code=True)

# Attach the LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(base_model, "devanshdhir/qwen3-flask-lora")

# Alpaca-style prompt
prompt = """### Instruction:
What does `before_request` do in Flask?
### Input:
None
### Response:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
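The decoded output includes the prompt itself. If you only want the model's answer, one simple option (an illustrative snippet, not part of the repo) is to slice off the prompt tokens before decoding:

```python
# Decode only the newly generated tokens (everything after the prompt)
generated = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```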
## Want the Merged Model?

Use devanshdhir/qwen3-flask-full. It has the LoRA adapter merged into the base weights for direct inference, so no PEFT loading is required.
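Loading the merged model then looks like a plain transformers load (assuming the merged repo follows the standard layout):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# No PEFT needed: the adapter is already folded into these weights
model = AutoModelForCausalLM.from_pretrained(
    "devanshdhir/qwen3-flask-full", trust_remote_code=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("devanshdhir/qwen3-flask-full", trust_remote_code=True)
```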
## Limitations
- Responses depend on Alpaca-style prompting
- Does not generalize well outside Flask/internal documentation
- Data generated using Gemini, not manually curated
## Related

- Base model: Qwen/Qwen3-0.6B-Base
- Merged version: devanshdhir/qwen3-flask-full
- PEFT documentation: https://github.com/huggingface/peft
## Citation

```bibtex
@misc{qwen3flasklora2025,
  title  = {Qwen3-Flask LoRA Adapter},
  author = {Devansh Dhir},
  year   = {2025},
  url    = {https://huggingface.co/devanshdhir/qwen3-flask-lora}
}
```