# EXAONE RAG Fine-tuned Model with LoRA
This model is EXAONE-3.5-2.4B-Instruct fine-tuned on a Korean RAG dataset.
## Model Details
- Base Model: LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
- Fine-tuning Method: QLoRA (4-bit quantization + LoRA; see the sketch after this list)
- Task: Retrieval-Augmented Generation (RAG)
- Language: Korean
- Training Data: Korean RAG dataset built with the RAFT methodology
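
For reference, a minimal sketch of what the QLoRA setup described above might look like. The `BitsAndBytesConfig` settings and the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`) are illustrative assumptions, not the exact values used to train this model.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization for the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct",
    quantization_config=bnb_config,
    trust_remote_code=True,
)

# Trainable LoRA adapters on top of the quantized base model;
# rank and target module names here are assumptions for illustration
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```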
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct")

# Apply the LoRA adapter
model = PeftModel.from_pretrained(base_model, "ryanu/my-exaone-raft-model")

# Inference example
messages = [
    # "Answer the question based on the given context."
    {"role": "system", "content": "์ฃผ์ด์ง ์ปจํ ์คํธ๋ฅผ ๋ฐํ์ผ๋ก ์ง๋ฌธ์ ๋ต๋ณํ์ธ์."},
    # "Context: The capital of Korea is Seoul. Question: What is the capital of Korea?"
    {"role": "user", "content": "์ปจํ ์คํธ: ํ๊ตญ์ ์๋๋ ์์ธ์ ๋๋ค. ์ง๋ฌธ: ํ๊ตญ์ ์๋๋?"},
]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(input_text, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
```
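
If you want to deploy without the `peft` dependency, the adapter weights can be folded into the base model. `merge_and_unload()` is the standard PEFT call for this; the output directory name below is only an illustration.

```python
# Fold the LoRA weights into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("exaone-raft-merged")  # illustrative path
tokenizer.save_pretrained("exaone-raft-merged")
```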
## Training Details
- Training Framework: Hugging Face Transformers + PEFT
- Optimization: 8-bit AdamW
- Learning Rate: 1e-4
- Batch Size: 32 (with gradient accumulation)
- Precision: FP16
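
These hyperparameters map roughly onto a `TrainingArguments` setup like the sketch below. Only the learning rate, optimizer, precision, and effective batch size come from the list above; the per-device/accumulation split, epoch count, and output path are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="exaone-raft-lora",    # illustrative path
    learning_rate=1e-4,               # from the list above
    per_device_train_batch_size=4,    # assumed split: 4 x 8 = effective batch of 32
    gradient_accumulation_steps=8,
    optim="adamw_bnb_8bit",           # bitsandbytes 8-bit AdamW
    fp16=True,
    num_train_epochs=3,               # assumption
    logging_steps=10,
)
```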
## Performance
Compared with the baseline EXAONE model, this model shows improved performance on Korean RAG tasks. See the training repository for detailed evaluation results.