---
license: mit
language:
- zh
- en
base_model:
- inclusionAI/Ling-lite-base-1.5
new_version: inclusionAI/Ring-lite-2507
pipeline_tag: text-generation
library_name: transformers
---
# Ring-lite
<p align="center">
<img src="https://huggingface.co/inclusionAI/Ring-lite/resolve/main/ant-bailing.png" width="100"/>
</p>
<p align="center">
🤗 <a href="https://huggingface.co/inclusionAI">Hugging Face</a>
</p>
## Introduction
Ring-lite is a lightweight, fully open-source MoE (Mixture of Experts) LLM designed for complex reasoning tasks. It is built upon the publicly available [Ling-lite-1.5](https://huggingface.co/inclusionAI/Ling-lite-1.5) model, which has 16.8B parameters, of which 2.75B are activated. We use a joint training pipeline that combines knowledge distillation with reinforcement learning, achieving performance comparable to state-of-the-art (SOTA) small reasoning models on challenging benchmarks (AIME, LiveCodeBench, and GPQA-Diamond) while activating only one-third of their parameters.
## News
[20250704] Ring-lite-0704: we updated the Ring-lite model, which now supports two distinct reasoning modes: "**thinking on**" and "**thinking off**".
## Model Downloads
<div align="center">
| **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
| Ring-lite | 16.8B | 2.75B | 128K | [🤗 HuggingFace](https://huggingface.co/inclusionAI/Ring-lite) |
</div>
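
As a rough sizing aid for the checkpoint listed above, the sketch below estimates how much memory the weights alone occupy. It assumes a 16-bit dtype (2 bytes per parameter), which this card does not state, and it ignores the KV cache, activations, and framework overhead. The function names are illustrative only.

```python
# Rough weight-memory estimate for the checkpoint in the table above.
# Assumption: weights are stored in a 16-bit dtype (2 bytes per parameter).
# KV cache, activations and framework overhead are NOT included.

TOTAL_PARAMS = 16.8e9      # "#Total Params" from the table
ACTIVATED_PARAMS = 2.75e9  # "#Activated Params" from the table


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the approximate size of the weights in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def activated_fraction(total: float, activated: float) -> float:
    """Return the share of parameters that is active for each token."""
    return activated / total


if __name__ == "__main__":
    print(f"weights (16-bit): ~{weight_memory_gb(TOTAL_PARAMS):.1f} GB")
    print(f"activated share:  ~{activated_fraction(TOTAL_PARAMS, ACTIVATED_PARAMS):.1%}")
```

Only the activated parameters take part in each forward pass, but in a standard setup without offloading the full set of experts still has to be resident in memory, so the total parameter count drives the memory requirement.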
## Evaluation
To evaluate our reasoning models comprehensively, we ran automatic benchmarks covering math, code, and science.
<p align="center">
<img src="https://huggingface.co/inclusionAI/Ring-lite/resolve/main/performance.png" width="1000"/>
</p>
To compare the performance of Ring-lite-0704 and Ring-lite-0616, we evaluate the two models on a broader range of reasoning and general-purpose benchmarks, including instruction following, function calling, and creative writing.
| **Dataset** | **Ring-lite-0616** | **Ring-lite-0704** |
| :---------: | :----------------: | :----------------: |
| AIME 2024 | 76.6 | 79.0 |
| AIME 2025 | 69.1 | 69.5 |
| LiveCodeBench | 60.7 | 61.4 |
| Codeforces (percentile) | 86.5 | 88.0 |
| GPQA Diamond | 61.1 | 63.2 |
| C-Eval | 59.0 | 65.4 |
| MMLU-Pro | 60.0 | 63.0 |
| ArenaHard | 27.8 | 62.7 |
| IF-Eval | 51.6 | 54.3 |
| BFCL_Live | 60.1 | 66.8 |
| Creative Writing | 6.7 | 60.2 |
More details are reported in our [technical report](https://arxiv.org/abs/2506.14731).
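
To make the comparison in the table easier to read, the following sketch computes the per-benchmark change from Ring-lite-0616 to Ring-lite-0704. The scores are copied from the table. The `SCORES` dictionary and the `deltas` helper are names introduced here for illustration.

```python
# Scores copied from the comparison table: (Ring-lite-0616, Ring-lite-0704).
SCORES = {
    "AIME 2024": (76.6, 79.0),
    "AIME 2025": (69.1, 69.5),
    "LiveCodeBench": (60.7, 61.4),
    "Codeforces (percentile)": (86.5, 88.0),
    "GPQA Diamond": (61.1, 63.2),
    "C-Eval": (59.0, 65.4),
    "MMLU-Pro": (60.0, 63.0),
    "ArenaHard": (27.8, 62.7),
    "IF-Eval": (51.6, 54.3),
    "BFCL_Live": (60.1, 66.8),
    "Creative Writing": (6.7, 60.2),
}


def deltas(scores):
    """Return {benchmark: new - old}, rounded to one decimal place."""
    return {name: round(new - old, 1) for name, (old, new) in scores.items()}


if __name__ == "__main__":
    # Print benchmarks ordered from the largest to the smallest gain.
    ranked = sorted(deltas(SCORES).items(), key=lambda kv: kv[1], reverse=True)
    for name, delta in ranked:
        print(f"{name:<24} {delta:+.1f}")
```

Every benchmark improves in the 0704 release. The largest gains are on the general-purpose benchmarks, Creative Writing and ArenaHard, while the math and code scores move by a few points at most.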
## Quickstart
### 🤗 Hugging Face Transformers
The newly updated **Ring-lite** model now supports two distinct reasoning modes: "**thinking on**" and "**thinking off**". These modes are controlled by the `enable_thinking` parameter in the `tokenizer.apply_chat_template()` function.
* When `enable_thinking` is set to `True` (or omitted), the model operates in "**thinking on**" mode, where it generates and outputs the internal reasoning process.
* When `enable_thinking` is explicitly set to `False`, the model runs in "**thinking off**" mode, skipping the reasoning step entirely and directly producing the final answer.
This feature allows users to choose between detailed reasoning and concise output based on their specific needs.
Here is a code snippet to show you how to use the chat model with `transformers`:
#### Thinking on
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ring-lite"

# Load the model and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Ring, an assistant created by inclusionAI"},
    {"role": "user", "content": prompt}
]

# Render the conversation as a prompt string with reasoning enabled.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
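
In "thinking on" mode the decoded response contains the reasoning as well as the final answer. The sketch below separates the two. It assumes the reasoning is delimited by `<think>` and `</think>` tags, which this card does not confirm; inspect a real response and adjust the tags if the model uses a different delimiter. `split_reasoning` is a hypothetical helper, not part of `transformers`.

```python
from typing import Tuple


def split_reasoning(
    response: str,
    open_tag: str = "<think>",    # assumed delimiter, not confirmed by this card
    close_tag: str = "</think>",  # assumed delimiter, not confirmed by this card
) -> Tuple[str, str]:
    """Split a decoded response into (reasoning, answer).

    If the closing tag is missing, the whole text is treated as the answer.
    The opening tag is optional because it may be part of the prompt
    rather than of the generated text.
    """
    head, sep, tail = response.partition(close_tag)
    if not sep:
        return "", response.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


if __name__ == "__main__":
    demo = "<think>Compare both options first.</think>\nOption A is better."
    reasoning, answer = split_reasoning(demo)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

If the delimiters are registered as special tokens, decoding with `skip_special_tokens=True` would remove them; in that case decode with `skip_special_tokens=False` before splitting.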
#### Thinking off
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ring-lite"

# Load the model and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Ring, an assistant created by inclusionAI"},
    {"role": "user", "content": prompt}
]

# Render the conversation as a prompt string with reasoning disabled.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
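
The two snippets above differ only in the value of `enable_thinking`. A minimal sketch of a helper that builds the prompt for either mode is shown below. `build_chat_text` and `SYSTEM_PROMPT` are names introduced here for illustration; the `apply_chat_template` arguments are the ones used in the snippets above.

```python
from typing import Any

# System prompt taken from the snippets above.
SYSTEM_PROMPT = "You are Ring, an assistant created by inclusionAI"


def build_chat_text(tokenizer: Any, prompt: str, thinking: bool = True) -> str:
    """Render a single-turn conversation as a prompt string.

    `tokenizer` is expected to be the tokenizer loaded for Ring-lite;
    `thinking` is forwarded as `enable_thinking` to the chat template.
    """
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=thinking,
    )
```

With the tokenizer from the snippets above, `text = build_chat_text(tokenizer, prompt, thinking=False)` replaces the `apply_chat_template` call; tokenization, generation, and decoding stay the same.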
## Dataset
The training data for Ring-lite is released at [Ring-lite-sft-data](https://huggingface.co/datasets/inclusionAI/Ring-lite-sft-data) and [Ring-lite-rl-data](https://huggingface.co/datasets/inclusionAI/Ring-lite-rl-data).
## Code
The training code will be released soon.
## Deployment
For deployment instructions, please refer to [GitHub](https://github.com/inclusionAI/Ring/blob/main/README.md).
## License
This code repository is licensed under [the MIT License](https://huggingface.co/inclusionAI/Ring-lite/blob/main/LICENSE).
## Citation
```
@misc{ringteam2025ringlitescalablereasoningc3postabilized,
title={Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs},
author={Ling Team},
year={2025},
eprint={2506.14731},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.14731},
}
```