Achieving Superior Performance over Qwen3-32B and QwQ-32B Using Only 800 Strategically Curated Samples

Model description

NTele-R1-32B-V1 is the successor to NTele-R1-32B-Preview; see that model card for more information. Starting from the same base model, we achieved substantial improvements with a much smaller mathematics and code corpus (only 800 samples: 400 mathematics and 400 code) and surpassed the advanced open models Qwen3-32B and QwQ-32B.
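As a quick-start reference, below is a minimal inference sketch using the Hugging Face transformers library. It assumes the weights are loaded from the ZTE-AIM/NTele-R1-32B-V1 repository and that the chat template inherited from the DeepSeek-R1-Distill-Qwen-32B base applies; adjust generation parameters to your needs.

```python
# Minimal inference sketch (assumes a GPU with enough memory and the
# ZTE-AIM/NTele-R1-32B-V1 repository on the Hugging Face Hub).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZTE-AIM/NTele-R1-32B-V1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "What is the sum of the first 100 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# A large max_new_tokens budget, since reasoning models emit a long chain of thought.
outputs = model.generate(inputs, max_new_tokens=4096, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```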

| Model | Release Date | AIME2024 | AIME2025 | MATH500 | GPQA-Diamond | LiveCodeBench (24.08-25.02) |
| --- | --- | --- | --- | --- | --- | --- |
| DeepSeek-R1-Distill-Qwen-32B | 2025-01-20 | 64.17 | 55.21 | 89.8 | 62.1 | 50.26 |
| QwQ-32B | 2025-03-06 | 76.25 | 67.30 | 94.6 | 63.6 | 60.94 |
| Qwen3-32B (think) | 2025-04-29 | 78.75 | 73.33 | 95.0 | 69.7 | 53.24 |
| NTele-R1-32B-V1 (ours) | 2025-05-10 | 82.5 | 74.49 | 95.2 | 67.17 | 63.69 |

Data

[🤗 Codemath400]

You can access our dataset for the full set of 800 training samples, and refer to the NTele-R1-32B-Preview model card to learn about the data synthesis and screening process.
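For reference, a minimal loading sketch with the Hugging Face datasets library is shown below. The repository ID ZTE-AIM/Codemath400 is an assumption for illustration only; substitute the actual dataset path from the link above.

```python
# Hypothetical loading sketch: the dataset repo ID is an assumption,
# not confirmed by this model card.
from datasets import load_dataset

dataset = load_dataset("ZTE-AIM/Codemath400", split="train")  # assumed repo ID
print(len(dataset))   # expected: 800 samples (400 math + 400 code)
print(dataset[0])     # inspect one record to see the actual schema
```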

Evaluation

We evaluate models with SkyThought.

Training Details

NTele-R1-32B-V1 was trained from DeepSeek-R1-Distill-Qwen-32B on 8×H800 GPUs; a hedged sketch of the corresponding configuration follows the hyperparameter list below.

Training hyperparameters

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 48
  • total_eval_batch_size: 48
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
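
The hyperparameters above follow the naming of the Hugging Face Trainer. As a rough reconstruction, a matching transformers TrainingArguments configuration might look like the sketch below; the output path and any distributed-training specifics of the 8×H800 run (e.g. DeepSpeed or FSDP settings) are assumptions, not taken from the actual training recipe.

```python
# Hedged reconstruction of the listed hyperparameters as transformers
# TrainingArguments; the output path and distributed setup are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ntele-r1-32b-v1",      # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=6,     # 1 x 8 GPUs x 6 = 48 effective batch size
    num_train_epochs=10.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,                         # released weights are in BF16
)
```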