# dtadpole/KernelCoder-32B-AWQ_20250621-161329
This model is a fine-tuned version of Qwen/Qwen3-32B-AWQ using Unsloth and LoRA.
## Model Details
- Base Model: Qwen/Qwen3-32B-AWQ
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Max Sequence Length: 8192
- Training Examples: 24
- LoRA Rank: 64
- LoRA Alpha: 64
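
For reference, the rank and alpha above correspond to an Unsloth adapter configuration along these lines. This is a minimal sketch: the `target_modules` list is an assumption (the projections commonly targeted for Qwen-style models), not something stated on this card:

```python
from unsloth import FastLanguageModel

# Load the base model first (same settings as the Usage section below)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-32B-AWQ",
    max_seq_length=8192,
    load_in_4bit=True,
)

# Attach a LoRA adapter with the rank/alpha listed above.
# target_modules is an assumption (typical Qwen attention/MLP projections),
# not confirmed by this card.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```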
## Training Configuration
- Epochs: 1
- Learning Rate: 3e-05
- Batch Size: 1
- Gradient Accumulation Steps: 1
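
These hyperparameters map onto a TRL `SFTTrainer` setup roughly as follows. This is a sketch assuming the common Unsloth + TRL training recipe; `model` and `dataset` are placeholders (the LoRA-wrapped model from the sketch above and a small conversation dataset):

```python
from trl import SFTTrainer, SFTConfig

# `model` and `dataset` are placeholders: the LoRA-wrapped model from the
# sketch above and the 24-example training set mentioned on this card.
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(
        num_train_epochs=1,
        learning_rate=3e-5,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=1,
        output_dir="outputs",
    ),
)
trainer.train()
```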
## Usage
```python
from unsloth import FastLanguageModel
import torch

# Load the model (4-bit to fit on a single GPU)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="dtadpole/KernelCoder-32B-AWQ_20250621-161329",
    max_seq_length=8192,
    dtype=None,          # auto-detect (bfloat16 on Ampere+, float16 otherwise)
    load_in_4bit=True,
)

# Enable Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

# Format your prompt with the model's chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"},
]
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate (do_sample=True is required for temperature to take effect)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)
```
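
For interactive use you can stream tokens to stdout as they are generated rather than waiting for the full completion. This sketch reuses `model`, `tokenizer`, and `inputs` from the block above together with Transformers' `TextStreamer`:

```python
from transformers import TextStreamer

# Stream decoded tokens as they are produced;
# skip_prompt avoids re-printing the input prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    streamer=streamer,
)
```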
## Training Data
This model was fine-tuned on a small set of 24 processed conversation examples, intended to improve performance on the specific tasks those conversations cover.
## Limitations
- This is a LoRA adapter and requires the base model to function (a Transformers + PEFT loading sketch follows this list)
- Performance may vary depending on the specific use case
- The model inherits any limitations from the base model
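
If you prefer plain Transformers over Unsloth, the adapter can in principle be attached to the quantized base model with PEFT. This is a minimal sketch, assuming the repository ships standard PEFT adapter weights and that your Transformers install can load AWQ checkpoints (typically via the autoawq package); neither assumption is confirmed by this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the AWQ-quantized base model (assumes Transformers' AWQ support,
# which usually requires the autoawq package to be installed)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-32B-AWQ",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B-AWQ")

# Attach the LoRA adapter on top of the frozen base weights
model = PeftModel.from_pretrained(
    base, "dtadpole/KernelCoder-32B-AWQ_20250621-161329"
)
```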
## Framework Versions
- Unsloth: 2025.6.1
- Transformers: 4.52.4
- PyTorch: 2.7.0
- PEFT: Latest