Introduction

We introduce Llama-Thunder-LLM, a new language model from Thunder Research Group specialized in Korean and English.

Training Platform

Llama-Thunder-LLM was trained using the Thunder-LLM-Toolkit.

Details

Comprehensive details about Llama-Thunder-LLM are provided in our paper; please refer to it on arXiv for in-depth information.

Release Date

2025.06.18

Korean Benchmark Performance

The best result in each row is marked with an asterisk (*).

| Benchmark | Llama-Thunder-LLM | LLaMA 3.1 8B Base | LLaMA 3.1 8B Instruct | Exaone-3.5 8B Instruct | Qwen2.5 7B Instruct |
|---|---|---|---|---|---|
| KoBEST-HellaSwag (0-shot) | 72.4* | 58.2 | 55.8 | 60.0 | 58.2 |
| SNU_Ko-WinoGrande (5-shot) | 74.3* | 60.6 | 60.2 | 65.3 | 63.7 |
| SNU_Ko-LAMBADA (0-shot) | 86.8* | 84.3 | 83.8 | 85.7 | 81.7 |
| SNU_Ko-ARC-Easy (5-shot) | 76.1 | 63.3 | 64.4 | 76.7* | 69.4 |
| SNU_Ko-ARC-Challenge (5-shot) | 62.4* | 44.6 | 45.7 | 57.0 | 54.5 |
| KMMLU (5-shot) | 47.6 | 40.5 | 41.1 | 45.1 | 49.6* |
| SNU_Ko-GSM8K (5-shot) | 57.3 | 34.6 | 53.1 | 56.7 | 67.3* |
| SNU_Ko-IFEval (0-shot) | 51.5 | 30.7 | 43.4 | 67.9* | 60.5 |
| KR-HumanEval (0-shot) | 56.7 | 21.9 | 42.1 | 61.0* | 28.1 |
| Average | 65.0* | 48.7 | 54.4 | 63.9 | 59.2 |

English Benchmark Performance

The best result in each row is marked with an asterisk (*).

| Benchmark | Llama-Thunder-LLM | LLaMA 3.1 8B Base | LLaMA 3.1 8B Instruct | Exaone-3.5 8B Instruct | Qwen2.5 7B Instruct |
|---|---|---|---|---|---|
| HellaSwag (0-shot) | 89.3* | 79.0 | 79.2 | 77.9 | 80.4 |
| WinoGrande (5-shot) | 89.4* | 77.0 | 78.1 | 74.4 | 74.6 |
| LAMBADA (0-shot) | 64.0* | 44.8 | 43.0 | 46.4 | 48.6 |
| ARC-Easy (5-shot) | 91.3 | 91.5 | 93.3 | 95.4 | 96.7* |
| ARC-Challenge (5-shot) | 80.3 | 79.5 | 83.2 | 85.5 | 90.3* |
| MMLU (5-shot) | 63.1 | 65.3 | 68.0 | 65.2 | 74.2* |
| GSM-8K (5-shot) | 76.5 | 57.2 | 77.2 | 73.7 | 83.1* |
| IFEval (0-shot) | 59.1 | 18.7 | 61.4 | 78.4* | 74.8 |
| HumanEval (0-shot) | 59.1 | 34.8 | 57.3 | 66.5* | 64.6 |
| Average | 74.7 | 60.9 | 71.2 | 73.7 | 76.4* |
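
If you want to check the standard English numbers yourself, one option is EleutherAI's lm-evaluation-harness, which ships tasks under names such as mmlu and gsm8k. The sketch below is an approximation, not the authors' pipeline: it assumes lm-eval v0.4+ with its default prompts, which may differ from the evaluation setup described in our paper, and the SNU_Ko-* suites are custom benchmarks that are not included in the harness.

import lm_eval

# Minimal sketch: evaluate on MMLU with 5-shot prompting, matching the
# setting in the table above. Harness defaults may not match the paper's
# exact evaluation setup, so treat the scores as approximate.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=thunder-research-group/Llama-Thunder-LLM-8B,dtype=bfloat16",
    tasks=["mmlu"],
    num_fewshot=5,
)
print(results["results"])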

How to use

Use with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "thunder-research-group/Llama-Thunder-LLM-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Move the model to a GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example prompt for text generation ("How are Korean and English different?")
prompt = "ํ•œ๊ตญ์–ด์™€ ์˜์–ด๋Š” ์–ด๋–ป๊ฒŒ ๋‹ค๋ฅธ๊ฐ€์š”?"
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Generate up to 200 new tokens and decode the result
outputs = model.generate(input_ids, max_new_tokens=200, num_return_sequences=1)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
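
Alternatively, the transformers pipeline helper wraps tokenization, generation, and decoding in a single call. A minimal sketch (device_map="auto" requires the accelerate package and places the model on a GPU when one is available):

from transformers import pipeline
import torch

# One-call text generation; device_map="auto" needs `pip install accelerate`.
pipe = pipeline(
    "text-generation",
    model="thunder-research-group/Llama-Thunder-LLM-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prompt: "How are Korean and English different?"
print(pipe("ํ•œ๊ตญ์–ด์™€ ์˜์–ด๋Š” ์–ด๋–ป๊ฒŒ ๋‹ค๋ฅธ๊ฐ€์š”?", max_new_tokens=200)[0]["generated_text"])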

Use with Thunder-LLM-Toolkit

For more advanced usage, such as fine-tuning or other model-development functionality, refer to the Thunder-LLM-Toolkit GitHub repository.

License

This repository contains original work licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Notice

In accordance with the Llama 3.1 license, this work is a derivative of the Llama Materials. As required by the Meta Llama 3.1 license policy, we provide the following notices and include a copy of the original license:

Built with Llama

Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.

Citation

If you use this model, please cite:

@article{kim2025thunder,
  title={Thunder-LLM: Efficiently Adapting LLMs to Korean with Minimal Resources},
  author={Kim, Jinpyo and Cho, Gyeongje and Park, Chanwoo and Park, Jongwon and Kim, Jongmin and So, Yeonkyoung and Lee, Jaejin},
  journal={arXiv preprint arXiv:2506.21595},
  year={2025}
}