
This model was created by removing the last 10 layers of the original Llama-3.1-8B-Instruct model and retraining the result.

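The pruning step itself isn't shown in the card; the following is a minimal sketch, assuming a straightforward truncation of the decoder stack with transformers (the output path is a placeholder):

import torch
from transformers import AutoModelForCausalLM

base_id = "meta-llama/Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Llama-3.1-8B has 32 decoder layers; dropping the last 10 leaves 22,
# roughly consistent with the 5.85B parameter count reported below.
keep = len(model.model.layers) - 10
model.model.layers = model.model.layers[:keep]
model.config.num_hidden_layers = keep

model.save_pretrained("llama-3.1-8b-pruned-22-layers")  # placeholder path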

To restore the knowledge held by the original language model, we first performed broad fine-tuning to revive its general knowledge base. We then applied refined fine-tuning on high-quality datasets to strengthen the model's internal and linguistic representations, thereby improving its reliability.
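
A hypothetical sketch of this two-stage recipe using TRL's SFTTrainer is shown below; the dataset names and checkpoint paths are placeholders, not the actual training data:

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model_path = "llama-3.1-8b-pruned-22-layers"  # pruned checkpoint from the previous step

# Stage 1: broad fine-tuning on a large general corpus to restore knowledge.
# Stage 2: refined fine-tuning on a smaller, high-quality dataset.
for stage, dataset_name in [
    ("broad", "placeholder/broad-korean-corpus"),
    ("refined", "placeholder/high-quality-instructions"),
]:
    dataset = load_dataset(dataset_name, split="train")
    trainer = SFTTrainer(
        model=model_path,
        train_dataset=dataset,
        args=SFTConfig(output_dir=f"ckpt-{stage}", num_train_epochs=1),
    )
    trainer.train()
    trainer.save_model(f"ckpt-{stage}")
    model_path = f"ckpt-{stage}"  # the next stage resumes from this checkpoint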

After training the model on a specific task, we merged the pre-trained model with the task-trained model.
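
The card doesn't specify the merge method; a minimal sketch of one common approach, linear interpolation of the two checkpoints' weights, could look like this (the checkpoint paths and mixing weight are assumptions):

import torch
from transformers import AutoModelForCausalLM

pretrained = AutoModelForCausalLM.from_pretrained("ckpt-refined", torch_dtype=torch.float32)
task = AutoModelForCausalLM.from_pretrained("ckpt-task", torch_dtype=torch.float32)  # placeholder task checkpoint

task_state = task.state_dict()
alpha = 0.5  # assumed mixing weight between the two models
merged_state = {
    name: alpha * param + (1 - alpha) * task_state[name]
    for name, param in pretrained.state_dict().items()
}
pretrained.load_state_dict(merged_state)
pretrained.save_pretrained("ko-llama-3.1-5b-instruct-FrankenMerging")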

import transformers
import torch

model_id = "kikikara/ko-llama-3.1-5b-instruct-FrankenMerging"

# Load the model in bfloat16 and place it on the available device(s).
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    # "You are a Korean-language AI model."
    {"role": "system", "content": "당신은 한국어 ai 모델입니다."},
    # "What is the meaning of life?"
    {"role": "user", "content": "인생의 의미란 뭐야?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
)
# The pipeline returns the whole chat transcript; the last entry is the
# assistant's reply.
print(outputs[0]["generated_text"][-1])
Model size: 5.85B parameters (Safetensors, tensor type F32)