---
library_name: peft
base_model: Qwen/Qwen3-32B-AWQ
language:
- en
license: apache-2.0
tags:
- generated_from_trainer
- triton-ag
- unsloth
- lora
---

# dtadpole/KernelCoder-32B-AWQ_20250621-161329

This model is a fine-tuned version of [Qwen/Qwen3-32B-AWQ](https://huggingface.co/Qwen/Qwen3-32B-AWQ), trained with Unsloth and LoRA.

## Model Details

- **Base Model:** Qwen/Qwen3-32B-AWQ
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Max Sequence Length:** 8192
- **Training Examples:** 24
- **LoRA Rank:** 64
- **LoRA Alpha:** 64

## Training Configuration

- **Epochs:** 1
- **Learning Rate:** 3e-05
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1

## Usage

```python
from unsloth import FastLanguageModel
import torch

# Load the adapter together with the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="dtadpole/KernelCoder-32B-AWQ_20250621-161329",
    max_seq_length=8192,
    dtype=None,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Format your prompt with the chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"},
]
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate (move inputs to the model's device to avoid a device mismatch)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Data

This model was fine-tuned on processed conversation experiences to improve performance on specific tasks.

## Limitations

- This is a LoRA adapter and requires the base model to function (a plain Transformers + PEFT loading sketch is included below)
- Performance may vary depending on the specific use case
- The model inherits any limitations of the base model

## Framework Versions

- Unsloth: 2025.6.1
- Transformers: 4.52.4
- PyTorch: 2.7.0
- PEFT: Latest
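
## Loading with Transformers + PEFT

Because this repository contains only the LoRA adapter, it can also be attached to the base checkpoint with plain Transformers and PEFT instead of Unsloth. The snippet below is a minimal sketch under that assumption, not code from the original card; it assumes your environment can load the AWQ-quantized base model (e.g. `autoawq` installed) and has enough GPU memory.

```python
# Minimal sketch (assumed, not from the original card): attach the LoRA
# adapter to the AWQ base model using plain Transformers + PEFT.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-32B-AWQ"
adapter_id = "dtadpole/KernelCoder-32B-AWQ_20250621-161329"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map="auto",   # spread layers across available GPUs
    torch_dtype="auto",
)

# Attach the LoRA adapter on top of the quantized base weights
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```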
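
## LoRA Configuration (illustrative)

For reference, the adapter hyperparameters listed under Model Details map roughly onto the following PEFT `LoraConfig`. This is an illustrative sketch only; the original Unsloth training script is not part of this card, and the dropout value and target modules shown are assumptions.

```python
# Illustrative only: how the listed LoRA rank/alpha would be expressed as a
# PEFT LoraConfig. target_modules and lora_dropout are assumptions (typical
# attention/MLP projections for Qwen-style models), not taken from the
# original training run.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                # LoRA Rank (from the card)
    lora_alpha=64,       # LoRA Alpha (from the card)
    lora_dropout=0.0,    # assumed; not stated on the card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```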