QwQ-32B-Preview LoRA for separating thinking/answer parts

This LoRA adapter was fine-tuned to make QwQ consistently separate its private thoughts from the final answer using <THINKING>...</THINKING> and <ANSWER>...</ANSWER> tags.

For best results, it's also recommended to add the following to the System Prompt:

Your private thoughts must be placed inside <THINKING>...</THINKING> XML tags, and your final answer to the user must be placed inside <ANSWER>...</ANSWER> XML tags. These tags MUST appear in all your responses.
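Once the model responds in this format, the two parts are easy to separate downstream. A minimal sketch (the function name `split_response` is illustrative, not part of this repo):

```python
import re

def split_response(text: str) -> tuple[str, str]:
    """Split a tagged model response into (thinking, answer) parts."""
    # Non-greedy match across newlines, since thoughts often span lines.
    thinking = re.search(r"<THINKING>(.*?)</THINKING>", text, re.DOTALL)
    answer = re.search(r"<ANSWER>(.*?)</ANSWER>", text, re.DOTALL)
    return (
        thinking.group(1).strip() if thinking else "",
        # Fall back to the raw text if the model omitted the tags.
        answer.group(1).strip() if answer else text.strip(),
    )

response = "<THINKING>2+2 is 4.</THINKING><ANSWER>4</ANSWER>"
print(split_response(response))  # → ('2+2 is 4.', '4')
```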

This GGUF file can be used with Ollama as an adapter on top of the unsloth/QwQ-32B-Preview-GGUF quantized models. See the attached Modelfile for an example.
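Ollama layers a LoRA on a base model via the `ADAPTER` directive. A minimal Modelfile sketch, assuming both GGUF files have been downloaded locally (the file paths below are placeholders, not the exact names in this repo):

```
FROM ./QwQ-32B-Preview-Q4_K_M.gguf
ADAPTER ./QwQ-32B-Preview-with-Tags-LoRA.gguf
SYSTEM """Your private thoughts must be placed inside <THINKING>...</THINKING> XML tags, and your final answer to the user must be placed inside <ANSWER>...</ANSWER> XML tags. These tags MUST appear in all your responses."""
```

The model is then built and run with `ollama create qwq-tags -f Modelfile` followed by `ollama run qwq-tags`.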

Adapter details: GGUF, 134M params, qwen2 architecture, 4-bit quantization.


shakedzy/QwQ-32B-Preview-with-Tags-LoRA-GGUF is an adapter of the base model Qwen/Qwen2.5-32B.