---
license: mit
tags:
- svector
- theta-35-mini
- theta
---
# Theta-35-mini
A distilled, lightweight version of our Theta-35 main model, built on the Qwen architecture and distilled with the GRPO technique for high efficiency and strong performance in a compact footprint.
## Model Description
Theta-35-mini is a small-footprint autoregressive language model distilled from our flagship Theta-35 model. We leveraged:
- Qwen Model Architecture: Starting from the Qwen2 base, adapting its efficient transformer blocks and optimized attention kernels.
- GRPO Distillation: Group Relative Policy Optimization (GRPO) to transfer knowledge from Theta-35 to Theta-35-mini, preserving accuracy while drastically reducing parameter count.
This makes Theta-35-mini ideal for on-device inference, low-latency applications, and scenarios with tight compute or memory budgets.
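Under the usual reading of GRPO as Group Relative Policy Optimization, each sampled completion is scored against the statistics of its own group of samples rather than against a learned value baseline. A minimal sketch of the group-relative advantage computation (the function name is illustrative, not from the Theta-35 codebase):

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and standard
    deviation, as in Group Relative Policy Optimization (GRPO)."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    # eps guards against division by zero when all rewards are equal
    return [(r - mean) / (std + eps) for r in rewards]

# Rewards for a group of completions sampled from the same prompt
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print([round(a, 3) for a in advs])  # → [1.414, -1.414, 0.0, 0.0]
```

Completions that beat their group's average get positive advantages and are reinforced; below-average ones are penalized, all without a separate critic model.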
## Intended Uses
- On-device text generation (mobile apps, embedded systems)
- Real-time chatbots and conversational agents
- Edge AI applications with strict resource constraints
## Usage
Install the dependencies:

```bash
pip install transformers torch
```

Load the model and generate text:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
model = AutoModelForCausalLM.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")

# Generate text; do_sample=True is required for temperature to take effect,
# and max_new_tokens bounds the generated continuation (not the total length)
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```