Model Card for rycont/kanana-2.1b-lora-reasoning

This is a LoRA adapter for the kakaocorp/kanana-nano-2.1b-instruct model, trained with chain-of-thought (CoT) GRPO on the kuotient/gsm8k-ko dataset.

How to Get Started with the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel

model_name = "kakaocorp/kanana-nano-2.1b-instruct"
peft_model_id = "rycont/kanana-2.1b-lora-reasoning"

base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, peft_model_id).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_name)

streamer = TextStreamer(tokenizer)

SYSTEM_PROMPT = """
You are a helpful AI assistant developed by Kakao. Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""

messages = [
    {"role" : "system", "content" : SYSTEM_PROMPT},
    {"role" : "user", "content" : f"์ฒ ์ˆ˜๊ฐ€ ํ•œ ๋ณ€์˜ ๊ธธ์ด๊ฐ€ 5km์ธ ์ •์‚ฌ๊ฐํ˜• ๋ชจ์–‘์˜ ๊ณต์›์—์„œ ๋‘˜๋ ˆ๋ฅผ ๋”ฐ๋ผ ๋‚˜๋ฌด๋ฅผ ์‹ฌ์œผ๋ ค๊ณ  ํ•ด. ๋‚˜๋ฌด ์‚ฌ์ด ๊ฐ„๊ฒฉ์€ 500m์•ผ. ํ•œ ๋‚˜๋ฌด๋ฅผ ์‹ฌ์„ ๋•Œ 17๋ฒˆ์˜ ์‚ฝ์งˆ์ด ํ•„์š”ํ•œ๋ฐ, ๊ทผ๋กœ๊ธฐ์ค€๋ฒ•์ƒ ํ•œ ์‚ฌ๋žŒ์€ ์ธ์ƒ์—์„œ 31๋ฒˆ์˜ ์‚ฝ์งˆ๋ฐ–์— ๋ชปํ•ด. ๊ทธ๋ ‡๋‹ค๋ฉด ์ฒ ์ˆ˜๊ฐ€ ๋‚˜๋ฌด์‹ฌ๊ธฐ๋ฅผ ์™„๋ฃŒํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋ช‡๋ช…์˜ ์ธ๋ถ€๋ฅผ ์ถ”๊ฐ€๋กœ ๊ณ ์šฉํ•ด์•ผ ํ• ๊นŒ?"},
    {"role" : "system", "content" : "<reason> ์‹ฌํ˜ธํก ํ•˜๊ณ , ์ฐจ๊ทผ์ฐจ๊ทผ ์ƒ๊ฐํ•ด๋ณด์ž. ์ผ๋‹จ, "},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    continue_final_message=True,
    return_tensors="pt"
).to("cuda")

_ = model.eval()

# To compare against the base model without the LoRA adapter, run generation inside
# `with model.disable_adapter():` instead.
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=1024,
        streamer=streamer,
        tokenizer=tokenizer,
        stop_strings="</answer>"
    )

print(tokenizer.decode(output[0]))
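
Because the system prompt asks for the final result inside <answer> tags, the answer can be pulled out of the generated text with a small helper. The following is a minimal sketch, not part of the original card: `extract_answer` is a hypothetical name, and `sample` is a made-up completion used only to illustrate the format. It assumes you pass in only the newly generated text (for example, by decoding `output[0][input_ids.shape[1]:]`), since the full decoded output also contains the format template from the system prompt.

```python
import re

def extract_answer(completion):
    """Return the text inside the first <answer> block, or None if there is none."""
    # The closing tag is optional, because generation may hit the token
    # limit before the model emits </answer>.
    match = re.search(r"<answer>(.*?)(?:</answer>|$)", completion, flags=re.DOTALL)
    return match.group(1).strip() if match else None

# Hypothetical completion, for illustration only (not real model output).
sample = "<reasoning>\nsome steps\n</reasoning>\n<answer>\n42\n</answer>"
print(extract_answer(sample))
```

If the model never opens an <answer> block, the helper returns None, which makes malformed completions easy to detect.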

Framework versions

  • PEFT 0.14.0