# Qwen3-1.7B-4bit-catgirl-LoRA

## Usage
```python
from peft import PeftModel
from transformers import TextStreamer
from unsloth import FastLanguageModel

# Load the 4-bit quantized base model and its tokenizer
base_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-1.7B-unsloth-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
    load_in_8bit = False,
    full_finetuning = False,
)

# Attach the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base_model, "xcx0902/Qwen3-1.7B-4bit-catgirl-LoRA")

def ask_catgirl(question):
    messages = [
        {"role": "user", "content": question},
    ]
    # Build the chat prompt; Qwen3's thinking mode is disabled
    text = tokenizer.apply_chat_template(
        messages,
        tokenize = False,
        add_generation_prompt = True,
        enable_thinking = False,
    )
    # Stream the generated reply to stdout, skipping the echoed prompt
    _ = model.generate(
        **tokenizer(text, return_tensors = "pt").to("cuda"),
        temperature = 0.7, top_p = 0.8, top_k = 20,
        streamer = TextStreamer(tokenizer, skip_prompt = True),
    )

# Example:
ask_catgirl("Hello!")
```
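For reference, `apply_chat_template` with `add_generation_prompt=True` expands the message list into a ChatML-style prompt string before tokenization. Below is a minimal sketch of that expansion, assuming the standard `<|im_start|>`/`<|im_end|>` tags used by Qwen models; the chat template bundled with the tokenizer is the authoritative source, and `build_chatml_prompt` is a hypothetical helper for illustration only:

```python
# Sketch of the ChatML-style prompt Qwen tokenizers build from a message
# list (assumption: standard <|im_start|>/<|im_end|> tags; the tokenizer's
# bundled chat template is the source of truth).
def build_chatml_prompt(messages):
    parts = []
    for m in messages:
        # Each message becomes one delimited turn
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # add_generation_prompt=True appends an open assistant turn
    # for the model to complete
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([{"role": "user", "content": "Hello!"}])
print(prompt)
```

This is why `TextStreamer(skip_prompt=True)` is used above: the model's output begins with the full prompt, and the streamer suppresses it so only the assistant's reply is printed.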
## Model tree for xcx0902/Qwen3-1.7B-4bit-catgirl-LoRA

- Base model: Qwen/Qwen3-1.7B-Base
- Finetuned: Qwen/Qwen3-1.7B
- Quantized: unsloth/Qwen3-1.7B-unsloth-bnb-4bit