EXAONE RAG Fine-tuned Model with LoRA

์ด ๋ชจ๋ธ์€ EXAONE-3.5-2.4B-Instruct๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ๊ตญ์–ด RAG ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํŒŒ์ธํŠœ๋‹๋œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.

Model Details

  • Base Model: LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
  • Fine-tuning Method: QLoRA (4-bit quantization + LoRA; see the configuration sketch after this list)
  • Task: Retrieval-Augmented Generation (RAG)
  • Language: Korean
  • Training Data: Korean RAG dataset built with the RAFT (Retrieval-Augmented Fine-Tuning) methodology
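
The exact QLoRA configuration is not published in this card. The snippet below is a minimal sketch of a typical QLoRA setup for this base model; the LoRA rank, alpha, dropout, and target modules are assumptions, not the values actually used for this adapter.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA adapter on the attention projections -- all hyperparameters below are assumed
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()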

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# ๋ฒ ์ด์Šค ๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ € ๋กœ๋“œ
base_model = AutoModelForCausalLM.from_pretrained("LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct")

# Apply the LoRA adapter
model = PeftModel.from_pretrained(base_model, "ryanu/my-exaone-raft-model")

# Inference example
messages = [
    {"role": "system", "content": "์ฃผ์–ด์ง„ ์ปจํ…์ŠคํŠธ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ์งˆ๋ฌธ์— ๋‹ต๋ณ€ํ•˜์„ธ์š”."},
    {"role": "user", "content": "์ปจํ…์ŠคํŠธ: ํ•œ๊ตญ์˜ ์ˆ˜๋„๋Š” ์„œ์šธ์ž…๋‹ˆ๋‹ค. ์งˆ๋ฌธ: ํ•œ๊ตญ์˜ ์ˆ˜๋„๋Š”?""}
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(input_text, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
    
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
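
For repeated inference, the adapter weights can optionally be merged into the base model so that generation no longer goes through the PEFT wrapper. This is standard peft usage rather than anything specific to this model; the output directory below is just a placeholder.

# Merge the LoRA weights into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./exaone-raft-merged")   # placeholder path
tokenizer.save_pretrained("./exaone-raft-merged")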

Training Details

  • Training Framework: Hugging Face Transformers + PEFT
  • Optimization: 8-bit AdamW
  • Learning Rate: 1e-4
  • Batch Size: 32 (with gradient accumulation)
  • Precision: FP16
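
The hyperparameters above map onto a transformers TrainingArguments configuration roughly like the sketch below. The per-device batch size, accumulation steps, and epoch count are assumptions; only the effective batch size of 32 is stated in this card.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./exaone-raft-lora",   # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=4,     # assumed
    gradient_accumulation_steps=8,     # assumed; effective batch size = 4 * 8 = 32
    fp16=True,
    optim="adamw_bnb_8bit",            # 8-bit AdamW via bitsandbytes
    num_train_epochs=3,                # assumed
    logging_steps=10,
)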

Performance

์ด ๋ชจ๋ธ์€ ๋ฒ ์ด์Šค๋ผ์ธ EXAONE ๋ชจ๋ธ ๋Œ€๋น„ ํ•œ๊ตญ์–ด RAG ํƒœ์Šคํฌ์—์„œ ํ–ฅ์ƒ๋œ ์„ฑ๋Šฅ์„ ๋ณด์ž…๋‹ˆ๋‹ค. ์ž์„ธํ•œ ํ‰๊ฐ€ ๊ฒฐ๊ณผ๋Š” ํ•™์Šต ๋ฆฌํฌ์ง€ํ† ๋ฆฌ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.
