---
library_name: transformers
license: apache-2.0
---

<p align="center">
  <img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.svg?raw=true" style="width: 40%; max-width: 1100px;">
</p>




## πŸš€ Update News
- **2025-10-13**: Official release of KORMo-10B-base. Note that this is a base model, not an SFT model.
---
## πŸ’‘ About KORMo
**KORMo-10B** is a **10.8B-parameter fully open LLM** that handles both **Korean and English**.  
The model, training code, and training data are all **fully open**, so anyone can reproduce and extend them.

- **Model Size**: 10.8B parameters  
- **Languages**: Korean / English  
- **Training Data**: Synthetic data + public datasets (approximately 3T tokens)
- **License**: Apache 2.0

```md
The First Fully Open-Source LLM from a Non-English Region

KORMo was created with a public-interest mission: to make world-class language models accessible to everyone.
Our goal is to empower anyone to build and advance their own large language models at a global standard.

Key Features:

1. A 10B-parameter Korean–English reasoning model trained entirely from scratch.
2. 100% open resources β€” including all training data, code, intermediate checkpoints, and tutorials β€” allowing anyone to reproduce and extend a near-SOTA model on their own.
3. 3 trillion tokens of training data released publicly, featuring never-before-shared, high-quality full-cycle Korean datasets (for pretraining, post-training, general, reasoning, and reinforcement learning).
4. A collaborative effort by eight master’s students at the KAIST Graduate School of Culture Technology (MLP Lab), documented in a 45-page research paper.

If you’ve ever used a Korean language model that performs well on benchmarks but feels strange in real use, or if fine-tuning only made it worse, you’re not alone.

KORMo solves these problems head-on.
By releasing every intermediate model and post-training dataset, we give users the freedom to build on the base model with their own data, customizing and fine-tuning it in any direction they want.

πŸ‘‰ "If you want a great Korean language model, now you can build it yourself. It even works with free Colab GPUs!" πŸ€—
```

---

## πŸ”— Links

- πŸ“– **Technical Report**: [πŸ‘‰ arXiv](https://arxiv.org/pdf/2510.09426)  
- πŸ€— **Hugging Face**: [πŸ‘‰ Model Download](https://huggingface.co/KORMo-Team)  
- πŸ’» **GitHub Repository**: [πŸ‘‰ Training and Inference Code](https://github.com/MLP-Lab/KORMo-tutorial)
- πŸ”‰ **Tutorial**: [πŸ‘‰ Instruction Tuning on Google Colab](https://colab.research.google.com/github/MLP-Lab/KORMo-tutorial/blob/main/tutorial/02.sft_qlora.ipynb) [πŸ‘‰ YouTube Tutorial](https://www.youtube.com/@MLPLab)

---


## πŸ“ˆ Benchmark Performance

### πŸ“Š Quantitative Evaluation

| Benchmark | **KORMo-10B** | smolLM3-3B | olmo2-7B | olmo2-13B | kanana1.5-8B | qwen3-8B | llama3.1-8B | gemma3-4B | gemma3-12B |
|:-----------|---------------:|-----------:|---------:|---------:|------------:|--------:|-----------:|---------:|----------:|
| **πŸ‡ΊπŸ‡Έ English Benchmarks** |||||||||||
| arc_challenge | 58.96 | 55.55 | 59.13 | 61.01 | 56.48 | 63.82 | 54.61 | 53.58 | 63.82 |
| arc_easy | 85.48 | 83.21 | 85.06 | 86.57 | 82.74 | 87.50 | 84.01 | 82.83 | 87.37 |
| boolq | 83.46 | 82.17 | 84.50 | 86.48 | 84.53 | 87.71 | 81.87 | 80.70 | 86.61 |
| copa | 93.00 | 91.00 | 92.00 | 93.00 | 88.00 | 92.00 | 93.00 | 89.00 | 95.00 |
| gpqa_main | 30.13 | 26.79 | 26.34 | 29.24 | 29.24 | 30.13 | 23.44 | 30.13 | 35.71 |
| hellaswag | 60.25 | 56.78 | 61.52 | 65.02 | 59.93 | 59.54 | 60.96 | 57.56 | 63.67 |
| mmlu | 67.96 | 61.37 | 62.81 | 66.85 | 63.73 | 76.95 | 65.03 | 59.60 | 73.58 |
| mmlu_global | 63.44 | 57.52 | 59.88 | 63.99 | 60.21 | 75.05 | 61.30 | 57.23 | 70.23 |
| mmlu_pro | 40.18 | 34.94 | 27.29 | 32.50 | 34.93 | 56.58 | 36.23 | 27.79 | 37.07 |
| mmlu_redux | 69.00 | 62.95 | 63.53 | 68.37 | 65.88 | 78.19 | 65.86 | 60.86 | 75.25 |
| openbookqa | 39.00 | 36.40 | 39.00 | 39.60 | 36.80 | 39.20 | 39.00 | 37.00 | 40.20 |
| piqa | 81.12 | 78.45 | 80.79 | 82.64 | 80.30 | 79.05 | 80.90 | 79.49 | 82.59 |
| social_iqa | 52.81 | 50.72 | 55.89 | 57.57 | 57.01 | 56.96 | 53.12 | 51.84 | 56.45 |
| **English Avg.** | **63.45** | 59.83 | 61.36 | 64.06 | 61.52 | 67.90 | 61.49 | 59.05 | 66.73 |
| **πŸ‡°πŸ‡· Korean Benchmarks** |||||||||||
| click | 55.29 | 46.97 | 37.79 | 41.80 | 62.76 | 60.70 | 49.22 | 49.62 | 62.21 |
| csatqa | 38.00 | 26.67 | 19.33 | 24.67 | 44.67 | 52.00 | 28.67 | 28.67 | 31.33 |
| haerae | 68.29 | 55.82 | 31.62 | 37.58 | 80.75 | 67.19 | 53.25 | 60.68 | 74.34 |
| k2_eval | 84.89 | 75.23 | 49.54 | 63.43 | 84.72 | 84.72 | 76.62 | 76.39 | 85.42 |
| kobest | 75.05 | 69.13 | 57.27 | 59.02 | 81.93 | 80.05 | 70.55 | 69.33 | 77.70 |
| kobalt | 22.86 | 15.86 | 11.43 | 13.14 | 26.29 | 26.57 | 17.43 | 15.57 | 23.86 |
| kmmlu | 46.48 | 38.52 | 33.05 | 31.24 | 48.86 | 56.93 | 40.75 | 39.84 | 51.60 |
| mmlu_global (ko) | 55.16 | 44.15 | 34.00 | 36.95 | 52.65 | 61.95 | 46.34 | 46.33 | 59.68 |
| kr_clinical_qa | 77.32 | 53.97 | 48.33 | 46.22 | 65.84 | 80.00 | 63.54 | 60.00 | 77.22 |
| **Korean Avg.** | **58.15** | 47.37 | 35.82 | 39.34 | 60.94 | 63.35 | 49.60 | 49.60 | 60.37 |


### πŸ“ Qualitative Evaluation (LLM-as-a-Judge)

| Benchmark | KORMo-10B | smolLM3-3B | olmo2-7B | olmo2-13B | kanana1.5-8B | qwen3-8B | llama3.1-8B | exaone3.5-8B | gemma3-12B |
|:----------|---------:|----------:|---------:|---------:|------------:|--------:|------------:|-------------:|-----------:|
| MT-Bench (EN) | 8.32 | 7.15 | 7.32 | 7.64 | 8.45 | 8.70 | 6.32 | 8.15 | 8.70 |
| KO-MT-Bench (KO) | 8.54 | - | - | - | 8.02 | 8.16 | 4.27 | 8.13 | 8.51 |
| LogicKor (KO) | 8.96 | - | - | - | 8.94 | 8.63 | 6.45 | 9.20 | 8.46 |
| **Average** | **8.61** | - | - | - | **8.47** | **8.50** | **5.68** | **8.49** | **8.56** |

---

## πŸ“¦ Installation

```bash
git clone https://github.com/MLP-Lab/KORMo-tutorial.git
cd KORMo-tutorial
bash setup/create_uv_venv.sh
source .venv_kormo/bin/activate
```
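
If you only need inference rather than the tutorial's training scripts, a plain pip environment is likely sufficient. The package list below is an assumption based on the inference example (`accelerate` backs `device_map="auto"`), not the repository's pinned requirements:

```bash
# Assumed minimal inference-only setup; see the repo's setup scripts
# for the exact pinned versions.
python -m venv .venv_kormo && source .venv_kormo/bin/activate
pip install torch transformers accelerate
```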

---
## πŸš€ Inference Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "KORMo-Team/KORMo-10B-sft"

# Load the tokenizer and model; bfloat16 halves memory use and
# device_map="auto" places the 10.8B parameters across available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role": "user", "content": "What happens inside a black hole?"}
]

# Render the chat template to a string; thinking mode is disabled here.
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)

inputs = tokenizer(chat_prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1024,
    )

# Decode only the newly generated tokens, slicing off the prompt.
response = tokenizer.decode(output_ids[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print("Assistant:", response)
```
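
The `generate` call above uses the library defaults. For more natural chat-style output you may want sampling; the settings below are illustrative assumptions, not KORMo's documented recommendations:

```python
# Hypothetical sampling settings; tune for your use case.
with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,    # stochastic sampling instead of greedy decoding
        temperature=0.7,   # soften the next-token distribution
        top_p=0.9,         # nucleus sampling cutoff
    )
```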

## 🧠 Enabling Thinking Mode

To enable **thinking** mode, set `enable_thinking=True` when applying the chat template:

```python
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
```
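
With thinking mode enabled, the model's reasoning precedes its final answer in the generated text. Below is a minimal sketch for separating the two, continuing the inference example above and assuming the template delimits reasoning with `<think>...</think>` tags; check `tokenizer.chat_template` for the actual markers:

```python
# Assumption: reasoning is wrapped in <think>...</think> tags. If the
# tags are registered as special tokens, decode with
# skip_special_tokens=False so they survive for splitting.
full_text = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
if "</think>" in full_text:
    thinking, answer = full_text.split("</think>", 1)
    thinking = thinking.replace("<think>", "").strip()
else:
    thinking, answer = "", full_text  # no thinking block emitted
print("Reasoning:", thinking)
print("Answer:", answer.strip())
```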
---



## Contact
- KyungTae Lim, Professor at KAIST. `[email protected]`



## Acknowledgments 
- This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2025-02653113, High-Performance Research AI Computing Infrastructure Support at the 2 PFLOPS Scale).


## Citation

```text
@misc{KORMo,
  author = {Minjun Kim and Hyeonseok Lim and Hangyeol Yoo and Inho Won and Seungwoo Song and Minkyung Cho and Junghun Yuk and Changsu Choi and Dongjae Shin and Huije Lee and Hoyun Song and Alice Oh and KyungTae Lim},
  title = {KORMo: Korean Open Reasoning Model for Everyone},
  year = {2025},
  publisher = {GitHub},
  journal = {Technical Report},
  howpublished = {\url{https://arxiv.org/abs/2510.09426}},
}
```