---
library_name: peft
base_model: Qwen/Qwen3-32B-AWQ
language:
- en
license: apache-2.0
tags:
- generated_from_trainer
- triton-ag
- unsloth
- lora
---

# dtadpole/KernelCoder-32B-AWQ_20250621-161329

This model is a fine-tuned version of [Qwen/Qwen3-32B-AWQ](https://huggingface.co/Qwen/Qwen3-32B-AWQ), trained with Unsloth and LoRA.

## Model Details

- **Base Model:** Qwen/Qwen3-32B-AWQ
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Max Sequence Length:** 8192
- **Training Examples:** 24
- **LoRA Rank:** 64
- **LoRA Alpha:** 64

## Training Configuration

- **Epochs:** 1
- **Learning Rate:** 3e-05
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1

## Usage

```python
from unsloth import FastLanguageModel
import torch

# Load the adapter together with the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="dtadpole/KernelCoder-32B-AWQ_20250621-161329",
    max_seq_length=8192,
    dtype=None,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Format your prompt with the chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"},
]
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate (move inputs to the model's device to avoid a device mismatch)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Data

This model was fine-tuned on processed conversation experiences to improve performance on specific tasks.

## Limitations

- This is a LoRA adapter and requires the base model to function (a plain Transformers + PEFT loading sketch is included below)
- Performance may vary depending on the specific use case
- The model inherits any limitations of the base model

## Framework Versions

- Unsloth: 2025.6.1
- Transformers: 4.52.4
- PyTorch: 2.7.0
- PEFT: Latest
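
## Loading with Transformers + PEFT

Because this repository contains only the LoRA adapter, it can also be attached to the base checkpoint with plain Transformers and PEFT instead of Unsloth. The snippet below is a minimal sketch under that assumption, not code from the original card; it assumes your environment can load the AWQ-quantized base model (e.g. `autoawq` installed) and has enough GPU memory.

```python
# Minimal sketch (assumed, not from the original card): attach the LoRA
# adapter to the AWQ base model using plain Transformers + PEFT.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-32B-AWQ"
adapter_id = "dtadpole/KernelCoder-32B-AWQ_20250621-161329"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map="auto",   # spread layers across available GPUs
    torch_dtype="auto",
)

# Attach the LoRA adapter on top of the quantized base weights
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```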
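
## LoRA Configuration (illustrative)

For reference, the adapter hyperparameters listed under Model Details map roughly onto the following PEFT `LoraConfig`. This is an illustrative sketch only; the original Unsloth training script is not part of this card, and the dropout value and target modules shown are assumptions.

```python
# Illustrative only: how the listed LoRA rank/alpha would be expressed as a
# PEFT LoraConfig. target_modules and lora_dropout are assumptions (typical
# attention/MLP projections for Qwen-style models), not taken from the
# original training run.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                # LoRA Rank (from the card)
    lora_alpha=64,       # LoRA Alpha (from the card)
    lora_dropout=0.0,    # assumed; not stated on the card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```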