---
library_name: transformers
tags:
- trl
- sft
- code
- competitive-programming
- codeforces
- lora
license: mit
datasets:
- open-r1/codeforces-cots
base_model:
- google/gemma-2-2b-it
pipeline_tag: text-generation
---

# Gemma-2-2b Fine-tuned for Competitive Programming

This model is a fine-tuned version of [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) on the [open-r1/codeforces-cots](https://huggingface.co/datasets/open-r1/codeforces-cots) dataset for competitive programming problem solving.

## Model Details

### Model Description

This model has been fine-tuned using LoRA (Low-Rank Adaptation) on competitive programming problems from Codeforces. It is designed to generate solutions to the algorithmic and data-structure problems commonly found in competitive programming contests.

- **Developed by:** Aswith77
- **Model type:** Causal Language Model (Code Generation)
- **Language(s):** Python, C++, Java (primarily Python)
- **License:** MIT
- **Finetuned from model:** google/gemma-2-2b-it
- **Fine-tuning method:** LoRA (Low-Rank Adaptation)

### Model Sources

- **Repository:** [Hugging Face Model](https://huggingface.co/Aswith77/gemma-2-2b-it-finetune-codeforces-cots)
- **Base Model:** [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it)
- **Dataset:** [open-r1/codeforces-cots](https://huggingface.co/datasets/open-r1/codeforces-cots)

## Uses

### Direct Use

This model is intended for generating solutions to competitive programming problems, particularly those similar to Codeforces problems. It can:

- Generate algorithmic solutions for given problem statements
- Help with code completion for competitive programming
- Assist in learning algorithmic problem-solving patterns

### Downstream Use

The model can be further fine-tuned on the following (a sketch for continuing training from this checkpoint follows the list):

- Specific programming languages
- Domain-specific algorithmic problems
- Problems from educational coding platforms
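
If you want to continue training from this checkpoint rather than start from the base model, the released LoRA adapters can be reloaded in trainable mode with PEFT. A minimal sketch, assuming the same base model as in the quick-start example below:

```python
# Illustrative: reload the released LoRA adapters so they stay trainable
# (e.g., for continued supervised fine-tuning) instead of frozen for inference.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    base,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots",
    is_trainable=True,  # keep the LoRA weights updatable for further fine-tuning
)
model.print_trainable_parameters()  # should report only the LoRA parameters
```
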
### Out-of-Scope Use

This model should not be used for:

- Production code without thorough testing
- Security-critical applications
- General-purpose software development without validation
- Problems requiring real-world system design

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots",
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Generate code for a problem
problem = """
Given an array of integers, find the maximum sum of a contiguous subarray.
Input: [-2,1,-3,4,-1,2,1,-5,4]
Output: 6 (subarray [4,-1,2,1])
"""

# Tokenize and move the inputs onto the same device as the model
inputs = tokenizer(problem, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solution)
```
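
If you need a standalone model (for example, to serve it without the PEFT wrapper), the LoRA adapters can optionally be merged into the base weights. A brief sketch using PEFT's `merge_and_unload`; the output directory name is a placeholder:

```python
# Optional: fold the LoRA weights into the base model for adapter-free inference.
# `model` and `tokenizer` are the objects loaded in the snippet above.
merged = model.merge_and_unload()
merged.save_pretrained("gemma-2-2b-codeforces-merged")     # placeholder path
tokenizer.save_pretrained("gemma-2-2b-codeforces-merged")  # keep the tokenizer alongside
```
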
## Training Details

### Training Data

The model was trained on the [open-r1/codeforces-cots](https://huggingface.co/datasets/open-r1/codeforces-cots) dataset, specifically using 1,000 competitive programming problems and their solutions from Codeforces.

### Training Procedure

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision
- **Learning rate:** 2e-4
- **Batch size:** 1 (per device)
- **Gradient accumulation steps:** 2
- **Max steps:** 100
- **Warmup steps:** 5
- **Optimizer:** AdamW 8-bit
- **Weight decay:** 0.01
- **LoRA rank (r):** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.1
- **Target modules:** q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
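
For reference, a minimal sketch of how these hyperparameters might be expressed with the PEFT and TRL APIs (argument names follow recent `peft`/`trl` releases and may differ between versions; the output directory is a placeholder):

```python
# Illustrative configuration only; it mirrors the hyperparameters listed above.
from peft import LoraConfig
from trl import SFTConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="gemma-2-2b-codeforces-lora",  # placeholder
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    max_steps=100,
    warmup_steps=5,
    learning_rate=2e-4,
    weight_decay=0.01,
    optim="adamw_bnb_8bit",  # AdamW 8-bit via bitsandbytes
    fp16=True,
)
```
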
#### Speeds, Sizes, Times

- **Training time:** ~20 minutes
- **Hardware:** Tesla T4 GPU (16GB)
- **Model size:** ~30MB (LoRA adapters only)
- **Final training loss:** 0.3715
- **Training samples per second:** 0.338

## Evaluation

The model achieved a training loss of 0.3715, showing good convergence from an initial loss of 0.9303. The loss curve demonstrated steady improvement throughout training without signs of overfitting.

### Training Loss Progression

- Initial loss: 0.9303
- Final loss: 0.3715
- Loss reduction: ~60%

## Bias, Risks, and Limitations

### Limitations

- **Dataset bias:** Trained primarily on Codeforces problems; it may not generalize well to other competitive programming platforms
- **Language bias:** Solutions may favor certain programming patterns common in the training data
- **Size limitations:** As a 2B-parameter model, it may struggle with very complex algorithmic problems
- **Code correctness:** Generated code should always be tested and validated before use

### Recommendations

- Always test generated solutions against multiple test cases (see the example harness after this list)
- Use the model as a starting point, not a final solution
- Verify algorithmic correctness and time complexity
- Consider the model's suggestions as one approach among many possible solutions
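
As a concrete illustration of the first recommendation, the snippet below checks a candidate solution to the maximum-subarray problem from the quick-start example against a few hand-written cases (the reference implementation and test cases here are illustrative, not model output):

```python
# Minimal test harness: verify a candidate solution against sample cases
# before trusting it. The candidate here is Kadane's algorithm for the
# maximum contiguous subarray sum.
def max_subarray_sum(nums):
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)  # extend the current run or start a new one
        best = max(best, current)
    return best

tests = [
    ([-2, 1, -3, 4, -1, 2, 1, -5, 4], 6),  # sample input from the prompt above
    ([1], 1),                              # single element
    ([-3, -1, -2], -1),                    # all-negative array
]
for nums, expected in tests:
    assert max_subarray_sum(nums) == expected, (nums, expected)
print("All sample tests passed.")
```
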
## Environmental Impact

Training was conducted on a single Tesla T4 GPU for approximately 20 minutes, resulting in minimal environmental impact compared to larger scale training runs.

- **Hardware Type:** Tesla T4 GPU
- **Hours used:** 0.33 hours
- **Cloud Provider:** Kaggle
- **Compute Region:** Not specified
- **Carbon Emitted:** Minimal (estimated < 0.1 kg CO2eq)

## Technical Specifications

### Model Architecture and Objective

- **Base Architecture:** Gemma-2 (2B parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Objective:** Causal language modeling with supervised fine-tuning
- **Quantization:** 4-bit quantization during training

### Compute Infrastructure

#### Hardware

- **GPU:** Tesla T4 (16GB VRAM)
- **Platform:** Kaggle Notebooks

#### Software

- **Framework:** PyTorch, Transformers, PEFT, TRL
- **Quantization:** bitsandbytes 4-bit
- **Training:** Supervised Fine-Tuning (SFT)
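
For completeness, here is a minimal sketch of loading the base model in 4-bit with bitsandbytes, consistent with the QLoRA-style setup described above. The specific options (NF4 quantization, double quantization, fp16 compute) are common defaults and an assumption, since the card only states that 4-bit quantization was used:

```python
# Illustrative 4-bit base-model loading for LoRA training; the quantization
# options shown are assumptions, not confirmed details of this training run.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
```
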
## Model Card Authors

Created by Aswith77 during fine-tuning experiments with competitive programming datasets.

## Model Card Contact

For questions or issues regarding this model, please open an issue in the model repository.