Gemma-2-2b Fine-tuned for Competitive Programming

This model is a fine-tuned version of google/gemma-2-2b-it on the open-r1/codeforces-cots dataset for competitive programming problem solving.

Model Details

Model Description

This model has been fine-tuned using LoRA (Low-Rank Adaptation) on competitive programming problems from Codeforces. It's designed to help generate solutions for algorithmic and data structure problems commonly found in competitive programming contests.

  • Developed by: Aswith77
  • Model type: Causal Language Model (Code Generation)
  • Language(s): Python, C++, Java (primarily Python)
  • License: MIT
  • Finetuned from model: google/gemma-2-2b-it
  • Fine-tuning method: LoRA (Low-Rank Adaptation)

Uses

Direct Use

This model is intended for generating solutions to competitive programming problems, particularly those similar to Codeforces problems. It can:

  • Generate algorithmic solutions for given problem statements
  • Help with code completion for competitive programming
  • Assist in learning algorithmic problem-solving patterns

Downstream Use

The model can be further fine-tuned on:

  • Specific programming languages
  • Domain-specific algorithmic problems
  • Educational coding platforms

Out-of-Scope Use

This model should not be used for:

  • Production code without thorough testing
  • Security-critical applications
  • General-purpose software development without validation
  • Problems requiring real-world system design

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Generate code for a problem
problem = """
Given an array of integers, find the maximum sum of a contiguous subarray.
Input: [-2,1,-3,4,-1,2,1,-5,4]
Output: 6 (subarray [4,-1,2,1])
"""

# Tokenize and move inputs to the model's device
inputs = tokenizer(problem, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # cap on generated tokens, independent of prompt length
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solution)
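
For reference, the sample problem above is the classic maximum-subarray task, solvable with Kadane's algorithm. A known-good implementation like the following sketch is useful for checking whatever the model generates:

def max_subarray_sum(nums):
    # Kadane's algorithm: track the best subarray sum ending at each index.
    best = current = nums[0]
    for x in nums[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best

print(max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6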

Training Details

Training Data

The model was trained on the open-r1/codeforces-cots dataset, specifically using 1,000 competitive programming problems and their solutions from Codeforces.
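
A minimal sketch of preparing a comparable subset with the datasets library; the split name for open-r1/codeforces-cots is an assumption here, so check the dataset card before running:

from datasets import load_dataset

# Load the dataset; the "train" split name is an assumption.
dataset = load_dataset("open-r1/codeforces-cots", split="train")

# Keep the first 1,000 problem/solution pairs, matching the subset described above.
dataset = dataset.select(range(1000))
print(dataset)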

Training Procedure

Training Hyperparameters

  • Training regime: fp16 mixed precision
  • Learning rate: 2e-4
  • Batch size: 1 (per device)
  • Gradient accumulation steps: 2
  • Max steps: 100
  • Warmup steps: 5
  • Optimizer: AdamW 8-bit
  • Weight decay: 0.01
  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.1
  • Target modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
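
A minimal sketch of the corresponding peft and transformers configuration; variable names and the output directory are illustrative, not a record of the exact run:

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="gemma-2-2b-it-codeforces",  # illustrative path
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    max_steps=100,
    warmup_steps=5,
    weight_decay=0.01,
    optim="adamw_bnb_8bit",  # AdamW 8-bit via bitsandbytes
    fp16=True,               # fp16 mixed precision
)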

Speeds, Sizes, Times

  • Training time: ~20 minutes
  • Hardware: Tesla T4 GPU (16GB)
  • Model size: ~30MB (LoRA adapters only)
  • Final training loss: 0.3715
  • Training samples per second: 0.338

Evaluation

The model reached a final training loss of 0.3715, down from an initial loss of 0.9303, and the loss decreased steadily throughout the run. No held-out validation set or benchmark results are reported, so these figures reflect training loss only.

Training Loss Progression

  • Initial loss: 0.9303
  • Final loss: 0.3715
  • Loss reduction: ~60%

Bias, Risks, and Limitations

Limitations

  • Dataset bias: Trained primarily on Codeforces problems, so it may not generalize well to other competitive programming platforms
  • Language bias: Solutions may favor certain programming patterns common in the training data
  • Size limitations: Being a 2B parameter model, it may struggle with very complex algorithmic problems
  • Code correctness: Generated code should always be tested and validated before use

Recommendations

  • Always test generated solutions with multiple test cases
  • Use the model as a starting point, not a final solution
  • Verify algorithmic correctness and time complexity
  • Consider the model's suggestions as one approach among many possible solutions
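
As a concrete example of the first recommendation, here is a tiny harness for checking a candidate solution against multiple test cases; the solve function and cases below are illustrative stand-ins for model output:

def run_tests(solve, cases):
    # Each case pairs a tuple of input arguments with the expected output.
    failures = 0
    for args, expected in cases:
        got = solve(*args)
        if got != expected:
            failures += 1
            print(f"FAIL: solve{args} = {got!r}, expected {expected!r}")
    print(f"{len(cases) - failures}/{len(cases)} cases passed")

# Example: checking a (hypothetically generated) "sum of two integers" solution.
def solve(a, b):
    return a + b

run_tests(solve, [((1, 2), 3), ((-5, 5), 0), ((10**9, 10**9), 2 * 10**9)])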

Environmental Impact

Training was conducted on a single Tesla T4 GPU for approximately 20 minutes, resulting in minimal environmental impact compared to larger-scale training runs.

  • Hardware Type: Tesla T4 GPU
  • Hours used: 0.33 hours
  • Cloud Provider: Kaggle
  • Compute Region: Not specified
  • Carbon Emitted: Minimal (estimated < 0.1 kg CO2eq)

Technical Specifications

Model Architecture and Objective

  • Base Architecture: Gemma-2 (2B parameters)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Objective: Causal language modeling with supervised fine-tuning
  • Quantization: 4-bit quantization during training
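
A minimal sketch of loading the base model in 4-bit for training via bitsandbytes; the NF4 quantization type and fp16 compute dtype are common QLoRA-style choices assumed here, not confirmed settings from the run:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config (NF4 + fp16 compute are assumptions).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)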

Compute Infrastructure

Hardware

  • GPU: Tesla T4 (16GB VRAM)
  • Platform: Kaggle Notebooks

Software

  • Framework: PyTorch, Transformers, PEFT, TRL
  • Quantization: bitsandbytes 4-bit
  • Training: Supervised Fine-Tuning (SFT)
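
The pieces above fit together roughly as follows with TRL's SFTTrainer, reusing the model, lora_config, training_args, and dataset objects from the earlier sketches; the accepted argument set varies across trl versions, so treat this as a sketch rather than the exact training script:

from trl import SFTTrainer

# Wire the quantized base model, LoRA config, hyperparameters, and dataset
# into TRL's supervised fine-tuning loop.
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=lora_config,
)
trainer.train()

# Save only the LoRA adapter weights (~30MB, as noted above).
trainer.model.save_pretrained("gemma-2-2b-it-codeforces-lora")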

Model Card Authors

Created by Aswith77 during fine-tuning experiments with competitive programming datasets.

Model Card Contact

For questions or issues regarding this model, please open an issue in the model repository.
