Introduction

Tina (Tiny Reasoning Models via LoRA) models are all LoRA adapters fine-tuned on the base model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. The adapter in this repo is fine-tuned with the open-r1/OpenR1-Math-220k dataset. Please refer to our paper, Tina: Tiny Reasoning Models via LoRA, for more training details.

Example Usage

The Tina model is meant to be used in combination with the base model as a standard adapter. In particular, we release all checkpoints for each Tina model, and you can select a specific checkpoint to load by setting the subfolder argument.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
)

# Attach the Tina LoRA adapter; pick a checkpoint via `subfolder`
model = PeftModel.from_pretrained(
    base_model,
    "Tina-Yi/R1-Distill-Qwen-1.5B-OpenR1",
    subfolder="checkpoint-1500"  # checkpoint 1500 is the best
)
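
Once the adapter is loaded, inference works through the standard generate API. Below is a minimal sketch; the prompt and the generation settings (chat template usage, max_new_tokens) are illustrative assumptions, not values from the Tina paper.

# Minimal generation sketch (illustrative prompt and settings)
prompt = "What is 7 * 8 + 5?"
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))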