Introduction

We introduce Llama-Thunder-LLM, a new language model from Thunder Research Group specialized in Korean and English.

Training Platform

Llama-Thunder-LLM was trained using the Thunder-LLM-Toolkit.

Details

Comprehensive details about Llama-Thunder-LLM are provided in our paper; please refer to it on arXiv for in-depth information.

Release Date

2025.06.18

Korean Benchmark Performance

The best result in each row is marked with an asterisk (*).

| Benchmark | Llama-Thunder-LLM | LLaMA 3.1 8B Base | LLaMA 3.1 8B Instruct | Exaone-3.5 8B Instruct | Qwen2.5 7B Instruct |
|---|---|---|---|---|---|
| KoBEST-HellaSwag (0-shot) | 72.4* | 58.2 | 55.8 | 60.0 | 58.2 |
| SNU_Ko-WinoGrande (5-shot) | 74.3* | 60.6 | 60.2 | 65.3 | 63.7 |
| SNU_Ko-LAMBADA (0-shot) | 86.8* | 84.3 | 83.8 | 85.7 | 81.7 |
| SNU_Ko-ARC-Easy (5-shot) | 76.1 | 63.3 | 64.4 | 76.7* | 69.4 |
| SNU_Ko-ARC-Challenge (5-shot) | 62.4* | 44.6 | 45.7 | 57.0 | 54.5 |
| KMMLU (5-shot) | 47.6 | 40.5 | 41.1 | 45.1 | 49.6* |
| SNU_Ko-GSM8K (5-shot) | 57.3 | 34.6 | 53.1 | 56.7 | 67.3* |
| SNU_Ko-IFEval (0-shot) | 51.5 | 30.7 | 43.4 | 67.9* | 60.5 |
| KR-HumanEval (0-shot) | 56.7 | 21.9 | 42.1 | 61.0* | 28.1 |
| Average | 65.0* | 48.7 | 54.4 | 63.9 | 59.2 |

English Benchmark Performance

The best result in each row is marked with an asterisk (*).

| Benchmark | Llama-Thunder-LLM | LLaMA 3.1 8B Base | LLaMA 3.1 8B Instruct | Exaone-3.5 8B Instruct | Qwen2.5 7B Instruct |
|---|---|---|---|---|---|
| HellaSwag (0-shot) | 89.3* | 79.0 | 79.2 | 77.9 | 80.4 |
| WinoGrande (5-shot) | 89.4* | 77.0 | 78.1 | 74.4 | 74.6 |
| LAMBADA (0-shot) | 64.0* | 44.8 | 43.0 | 46.4 | 48.6 |
| ARC-Easy (5-shot) | 91.3 | 91.5 | 93.3 | 95.4 | 96.7* |
| ARC-Challenge (5-shot) | 80.3 | 79.5 | 83.2 | 85.5 | 90.3* |
| MMLU (5-shot) | 63.1 | 65.3 | 68.0 | 65.2 | 74.2* |
| GSM-8K (5-shot) | 76.5 | 57.2 | 77.2 | 73.7 | 83.1* |
| IFEval (0-shot) | 59.1 | 18.7 | 61.4 | 78.4* | 74.8 |
| HumanEval (0-shot) | 59.1 | 34.8 | 57.3 | 66.5* | 64.6 |
| Average | 74.7 | 60.9 | 71.2 | 73.7 | 76.4* |
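
If you want to check the standard English numbers yourself, one option is EleutherAI's lm-evaluation-harness, which ships tasks under names such as mmlu and gsm8k. The sketch below is an approximation, not the authors' pipeline: it assumes lm-eval v0.4+ with its default prompts, which may differ from the evaluation setup described in our paper, and the SNU_Ko-* suites are custom benchmarks that are not included in the harness.

import lm_eval

# Minimal sketch: evaluate on MMLU with 5-shot prompting, matching the
# setting in the table above. Harness defaults may not match the paper's
# exact evaluation setup, so treat the scores as approximate.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=thunder-research-group/Llama-Thunder-LLM-8B,dtype=bfloat16",
    tasks=["mmlu"],
    num_fewshot=5,
)
print(results["results"])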

How to use

Use with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "thunder-research-group/Llama-Thunder-LLM-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Move the model to a GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example prompt for text generation ("How are Korean and English different?")
prompt = "ํ•œ๊ตญ์–ด์™€ ์˜์–ด๋Š” ์–ด๋–ป๊ฒŒ ๋‹ค๋ฅธ๊ฐ€์š”?"
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Generate up to 200 new tokens and decode the result
outputs = model.generate(input_ids, max_new_tokens=200, num_return_sequences=1)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
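
Alternatively, the transformers pipeline helper wraps tokenization, generation, and decoding in a single call. A minimal sketch (device_map="auto" requires the accelerate package and places the model on a GPU when one is available):

from transformers import pipeline
import torch

# One-call text generation; device_map="auto" needs `pip install accelerate`.
pipe = pipeline(
    "text-generation",
    model="thunder-research-group/Llama-Thunder-LLM-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prompt: "How are Korean and English different?"
print(pipe("ํ•œ๊ตญ์–ด์™€ ์˜์–ด๋Š” ์–ด๋–ป๊ฒŒ ๋‹ค๋ฅธ๊ฐ€์š”?", max_new_tokens=200)[0]["generated_text"])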

Use with Thunder-LLM-Toolkit

For more advanced usage, such as fine-tuning or other model-development functionality, refer to the Thunder-LLM-Toolkit GitHub repository.

License

This repository contains original work licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Notice

In accordance with the Llama 3.1 license, this work is a derivative of the Llama Materials. As required by the Meta Llama 3.1 license policy, we provide the following notices and include a copy of the original license:

Built with Llama

Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.

Citation

If you use this model, please cite:

@article{kim2025thunder,
  title={Thunder-LLM: Efficiently Adapting LLMs to Korean with Minimal Resources},
  author={Kim, Jinpyo and Cho, Gyeongje and Park, Chanwoo and Park, Jongwon and Kim, Jongmin and So, Yeonkyoung and Lee, Jaejin},
  journal={arXiv preprint arXiv:2506.21595},
  year={2025}
}