Update README.md
README.md (changed)

@@ -33,11 +33,8 @@ Our goal is to empower anyone to build and advance their own large language mode
 Key Features:
 
 1. A 10B-parameter Korean–English reasoning model trained entirely from scratch.
-
 2. 100% open resources — including all training data, code, intermediate checkpoints, and tutorials — allowing anyone to reproduce and extend a near-SOTA model on their own.
-
 3. 3 trillion tokens of training data released publicly, featuring never-before-shared, high-quality full-cycle Korean datasets (for pretraining, post-training, general, reasoning, and reinforcement learning).
-
 4. A collaborative effort by eight undergraduate and master’s students at the KAIST Graduate School of Culture Technology (MLP Lab), documented in a 45-page research paper.
 
 If you’ve ever used a Korean language model that performs well on benchmarks but feels strange in real use, or if fine-tuning only made it worse, you’re not alone.
@@ -45,7 +42,7 @@ If you’ve ever used a Korean language model that performs well on benchmarks b
 KORMo solves these problems head-on.
 By releasing every intermediate model and post-training dataset, we give users the freedom to build on the base model with their own data, customizing and fine-tuning it in any direction they want.
 
-👉
+👉 "If you want a great Korean language model, now you can build it yourself. It even works with free Colab GPUs!" 🤗
 ```
 
 ---
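For readers who want to try the claim in the new quote, here is a minimal sketch of loading the released model in a free Colab notebook with Hugging Face Transformers. The repo id `KORMo-Team/KORMo-10B` and the 4-bit quantization settings are assumptions for illustration, not details confirmed by this commit; check the official KORMo release for the actual model name.

```python
# Minimal sketch (assumed repo id) of trying the model on a free Colab GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "KORMo-Team/KORMo-10B"  # hypothetical repo id -- check the official release

# 4-bit quantization so a ~10B-parameter model has a chance of fitting in ~16 GB of GPU memory.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "한국어로 간단히 자기소개를 해줘."  # "Briefly introduce yourself in Korean."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```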