---
license: mit
tags:
- svector
- theta-35-mini
- theta
---
|

# Theta-35-mini

A distilled, lightweight version of our Theta-35 main model, built on the Qwen architecture and distilled with the GRPO technique for high efficiency and strong performance in a compact footprint.
|
|
|
## Model Description

**Theta-35-mini** is a small-footprint autoregressive language model distilled from our flagship Theta-35 model. We leveraged:

- **Qwen Model Architecture**: Starting from the Qwen2 base, adapting its efficient transformer blocks and optimized attention kernels.
- **GRPO Distillation**: Group Relative Policy Optimization (GRPO) to transfer knowledge from Theta-35 to Theta-35-mini, preserving accuracy while drastically reducing parameter count.

This makes Theta-35-mini ideal for on-device inference, low-latency applications, and scenarios with tight compute or memory budgets.
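For tight memory budgets, the checkpoint can be loaded in reduced precision. The sketch below is one option, not the only supported configuration; the choice of `float16` and the lack of explicit device placement are assumptions to adapt to your hardware:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SVECTOR-CORPORATION/Theta-35-Mini"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half precision roughly halves memory versus float32. Pick float16 or
# bfloat16 depending on what your accelerator supports (assumption here:
# float16-capable hardware).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
)
```

On CUDA devices, adding `device_map="auto"` (with `accelerate` installed) additionally streams weights onto the GPU as they load.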
|
|
|
## Intended Uses

- **On-device text generation** (mobile apps, embedded systems)
- **Real-time chatbots** and conversational agents
- **Edge AI** applications with strict resource constraints
|
|
|
## Usage

Install the `transformers` library:

```bash
pip install transformers
```

Load the model and generate text:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
model = AutoModelForCausalLM.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")

# Generate text (do_sample=True is required for temperature to take effect)
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
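For the conversational use cases listed above, a chat-style exchange can be sketched with the tokenizer's chat template. `apply_chat_template` is part of the standard `transformers` API, but this sketch assumes the released tokenizer ships a chat template; the prompt content and sampling settings are illustrative only:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SVECTOR-CORPORATION/Theta-35-Mini"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Give me one tip for writing fast Python."}]

# Render the conversation with the model's chat template, appending the
# assistant turn marker so generation continues as the assistant.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=64, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
reply = tokenizer.decode(outputs[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```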