Introduction

Tina (Tiny Reasoning Models via LoRA) models are all LoRA adapters fine-tuned on the base model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. The adapter in this repo is fine-tuned with the open-r1/OpenR1-Math-220k dataset. Please refer to our paper, Tina: Tiny Reasoning Models via LoRA, for more training details.

Example Usage

The Tina model is meant to be used in combination with the base model as a standard adapter. In particular, we release all checkpoints for each Tina model, and you can select a specific checkpoint to load by setting the subfolder argument.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
)

# Attach the Tina LoRA adapter; pick a checkpoint via `subfolder`
model = PeftModel.from_pretrained(
    base_model,
    "Tina-Yi/R1-Distill-Qwen-1.5B-OpenR1",
    subfolder="checkpoint-1500"  # checkpoint 1500 is the best
)
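
Once the adapter is loaded, inference works through the standard generate API. Below is a minimal sketch; the prompt and the generation settings (chat template usage, max_new_tokens) are illustrative assumptions, not values from the Tina paper.

# Minimal generation sketch (illustrative prompt and settings)
prompt = "What is 7 * 8 + 5?"
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))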