RRT1-3B

A fine-tuned 3B-parameter model specialized for reasoning and chain-of-thought tasks

Model Details

This model is a fine-tuned version of unsloth/Qwen2.5-3B-Instruct-bnb-4bit, trained with the Unsloth framework using LoRA (Low-Rank Adaptation) for efficient fine-tuning. A sketch of the corresponding LoRA configuration follows the list below.

  • Developed by: theprint
  • Model type: Causal Language Model (Fine-tuned with LoRA)
  • Language: en
  • License: apache-2.0
  • Base model: unsloth/Qwen2.5-3B-Instruct-bnb-4bit
  • Fine-tuning method: LoRA with rank 128
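
For reference, the LoRA setup likely resembles the following Unsloth sketch. Only the rank (128) and the base model are documented in this card; lora_alpha, dropout, and the target module list are illustrative assumptions, not published values.

from unsloth import FastLanguageModel

# Load the 4-bit base model named in this card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-3B-Instruct-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters. Only r=128 is documented above; the
# remaining hyperparameters are common defaults, shown for illustration.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,   # assumption: often set equal to the rank
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
)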

Intended Use

Reasoning, chain-of-thought, and general instruction following

Training Details

Training Data

ShareGPT conversations with chain-of-thought reasoning examples

  • Dataset: AiCloser/sharegpt_cot_dataset
  • Format: sharegpt (an example record is sketched below)
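
For illustration, a single record in the sharegpt format is a list of alternating human/assistant turns. The field values below are invented, not taken from the dataset:

# Hypothetical sharegpt-format record (values invented for illustration).
example_record = {
    "conversations": [
        {"from": "human", "value": "A train travels 120 km in 1.5 hours. What is its average speed?"},
        {"from": "gpt", "value": "Let's reason step by step. Speed = distance / time = 120 / 1.5 = 80 km/h."},
    ]
}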

Training Procedure

  • Training epochs: 3
  • LoRA rank: 128
  • Learning rate: 0.0002
  • Batch size: 4
  • Framework: Unsloth + transformers + PEFT (a training sketch follows this list)
  • Hardware: NVIDIA RTX 5090
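
A hypothetical reconstruction of the training loop using trl's SFTTrainer, the usual companion to Unsloth (API details vary across trl versions). Only the epoch count, learning rate, LoRA rank, and batch size are documented above; everything else is an assumption:

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("AiCloser/sharegpt_cot_dataset", split="train")

# Note: the sharegpt records would first need formatting into a plain
# "text" field (e.g. via the tokenizer's chat template); omitted here.
trainer = SFTTrainer(
    model=model,            # the LoRA-wrapped model from the sketch above
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        num_train_epochs=3,
        learning_rate=2e-4,
        per_device_train_batch_size=4,
        output_dir="outputs",
    ),
)
trainer.train()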

Usage

from unsloth import FastLanguageModel
import torch

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/RRT1-3B",
    max_seq_length=4096,
    dtype=None,          # auto-detect (bfloat16 on supported GPUs)
    load_in_4bit=True,   # matches the 4-bit base model
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Example usage
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
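
Since the base model is instruction-tuned, prompts are typically wrapped in the model's chat template rather than passed raw. A minimal sketch, reusing the model and tokenizer loaded above:

# Wrap the prompt in the chat template before generating.
messages = [{"role": "user", "content": "Explain step by step: why is the sky blue?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))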

GGUF Quantized Versions

Quantized GGUF versions are available in the gguf/ directory for use with llama.cpp and compatible runtimes (a loading example follows the list):

  • RRT1-3B-q4_k_m.gguf - 4-bit quantization (recommended for most use cases)
  • RRT1-3B-q5_k_m.gguf - 5-bit quantization (higher quality)
  • RRT1-3B-q8_0.gguf - 8-bit quantization (highest quality)
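
One way to run the GGUF files from Python is the llama-cpp-python bindings. A minimal sketch, assuming the q4_k_m file has been downloaded from this repository's gguf/ directory to the working directory:

from llama_cpp import Llama

# Path assumes the file was downloaded locally from this repo.
llm = Llama(model_path="RRT1-3B-q4_k_m.gguf", n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your prompt here"}],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])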

Limitations

This model may hallucinate or provide incorrect information, and it is not suitable for critical decision making.

Citation

If you use this model, please cite:

@misc{rrt1_3b,
  title={RRT1-3B: Fine-tuned Qwen2.5-3B-Instruct-bnb-4bit},
  author={theprint},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/theprint/RRT1-3B}
}