---
library_name: transformers
tags:
- code
license: mit
base_model:
- distilbert/distilgpt2
datasets:
- teven/code_contests
language:
- en
---

# Model Card for sriniidhi/gpt2-coding

## Model Details

### Model Description

This model is a LoRA fine-tuned version of distilgpt2, optimized for generating programming solutions in the style of competitive programming platforms such as LeetCode and Codeforces. It was trained on a custom dataset of ~5000 coding questions and answers and is designed to run on low-resource hardware (an RTX 3050 GPU with 4 GB of VRAM). The model is part of a larger project that uses Retrieval-Augmented Generation (RAG) to personalize outputs according to a user's historical coding patterns.

- **Developed by:** https://github.com/Srinidhi-Yoganand
- **Funded by:** Self-funded
- **Shared by:** sriniidhi
- **Model type:** Causal Language Model (Decoder-only)
- **Language(s) (NLP):** English (programming-focused)
- **License:** MIT
- **Finetuned from model:** distilgpt2

### Model Sources

- **Repository:** [Link]
- **Demo:** [Link]

## Uses

### Direct Use

This model can be used for:

- Auto-completing coding problems with competitive programming-style answers
- Assisting in learning algorithms by showing step-by-step code solutions
- Experimenting with personalized coding assistants

### Downstream Use

The model can be plugged into RAG-based systems that personalize answers by analyzing a user's prior code submissions, or integrated into IDE plugins and chat-based tutoring systems.

### Out-of-Scope Use

- Generating natural language responses outside of programming tasks
- Mission-critical code generation (e.g., medical, legal, or financial systems)

## Bias, Risks, and Limitations

- May hallucinate code or logic for uncommon problems.
- Not robust to complex multi-language code interactions or frameworks.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

- Use in combination with RAG for best personalization results.
- Validate generated code before execution.
- Avoid relying solely on this model for production-critical code generation.

## How to Get Started with the Model

Use the code below to get started with the model.

```
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("sriniidhi/gpt2-coding")
tokenizer = AutoTokenizer.from_pretrained("sriniidhi/gpt2-coding")

# Prompt the model with a function signature and let it complete the body
prompt = "def two_sum(nums, target):"
inputs = tokenizer(prompt, return_tensors="pt")
# GPT-2 tokenizers define no pad token, so reuse EOS for padding during generation
outputs = model.generate(**inputs, max_new_tokens=100, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

The dataset consists of 5000+ competitive programming Q&A-style examples extracted and formatted from LeetCode, Codeforces, and similar platforms. Each entry includes a problem prompt and a sample solution in Python, Java, or C++.

### Training Procedure

LoRA fine-tuning using peft and transformers on top of distilgpt2.

#### Preprocessing

- Tokenized using GPT2TokenizerFast
- Prompt-style formatting with problem + solution pairs
- All code lowercased for consistency

#### Training Hyperparameters

- **Training regime:**
  - Epochs: 3
  - Batch size: 2
  - Learning rate: 5e-5
  - LoRA rank: 8
  - Precision: fp16
  - Max length: 512
  - Optimizer: AdamW

#### Speeds, Sizes, Times

- Fine-tuned on Google Colab and locally on an RTX 3050 (4 GB)
- Training duration: ~30 hours
- LoRA-adapted weights: ~75 MB

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Evaluation was performed on 10000 held-out samples not used during training.

#### Metrics

Manual evaluation of logical correctness and style similarity.

### Results

- Approximately 80% logical match on simple algorithm questions
- Maintains coding style reasonably well for most basic prompts
- Struggles with complex nested logic in some cases

## Model Examination

- Focused on learning indentation, loop constructs, and simple algorithm templates
- No external code memory or global context unless paired with RAG

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA RTX 3050 (4GB VRAM)
- **Hours used:** ~30 hrs
- **Cloud Provider:** Google Colab (partial)
- **Compute Region:** India
- **Carbon Emitted:** ~2.15 kg CO₂ eq (estimated)

## Technical Specifications

### Model Architecture and Objective

Decoder-only Transformer (distilgpt2, 6-layer GPT-2)

### Compute Infrastructure

- Colab + Local
- PyTorch, transformers, peft, bitsandbytes

#### Hardware

- Ryzen 7
- RTX 3050

## Citation

**BibTeX:**

```
@misc{gpt2-coding,
  author = {Srinidhi},
  title = {LoRA Fine-tuned distilgpt2 for Code Generation},
  year = {2025},
  url = {https://huggingface.co/sriniidhi/gpt2-coding}
}
```

## Model Card Contact

- GitHub: https://github.com/Srinidhi-Yoganand
- Hugging Face: https://huggingface.co/sriniidhi