---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model: SVECTOR/Theta-35
tags:
- chat
- reasoning
library_name: transformers
---

# Theta-35

## Introduction

Theta-35 is the advanced reasoning model in the Theta series by SVECTOR. Compared with conventional instruction-tuned models, Theta-35 specializes in complex thinking and reasoning, and achieves significantly enhanced performance on downstream tasks, particularly challenging problems that require deep logical analysis and multi-step reasoning.

<p align="center">
  <img width="100%" src="images/Benchmark.png">
</p>

**This repo contains the Theta-35 model**, which has the following features:
- Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
- Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Number of Parameters: 33B
- Number of Parameters (Non-Embedding): 33B
- Number of Layers: 64
- Number of Attention Heads (GQA): 40 for Q and 8 for KV
- Context Length: Full 131,072 tokens
- Sliding Window: 32,768 tokens

**Note:** For the best experience, please review the [usage guidelines](#usage-guidelines) before deploying Theta models.

For more details, please refer to our [documentation](https://www.svector.co.in/models/theta-35).

## Requirements

Theta-35 requires a recent version of Hugging Face `transformers`; we advise version 4.43.1 or newer.

With older versions of `transformers`, you may encounter the following error:
```
KeyError: 'theta'
```
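This error means the installed `transformers` does not recognize the model type. A quick way to check before loading the model is to compare the installed version against the minimum; the sketch below is ours (the helper names `parse_version` and `supports_theta` are not part of any library):

```python
def parse_version(v: str) -> tuple:
    # Convert a version string such as "4.43.1" into (4, 43, 1) for comparison
    return tuple(int(part) for part in v.split(".")[:3])

MIN_VERSION = "4.43.1"  # minimum transformers version advised for Theta-35

def supports_theta(installed: str) -> bool:
    # True if the installed transformers version meets the minimum
    return parse_version(installed) >= parse_version(MIN_VERSION)

print(supports_theta("4.42.0"))  # False: too old, would raise KeyError: 'theta'
print(supports_theta("4.44.2"))  # True
```

In practice you would pass `transformers.__version__` to the check, and upgrade with `pip install -U transformers` if it fails.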

## Quickstart

Here is a code snippet showing how to load the tokenizer and model, and how to generate content:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer directly
model_name = "SVECTOR-CORPORATION/Theta-35"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare prompt
prompt = "How many planets are in our solar system? Explain your reasoning."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True  # automatically adds the "<reasoning>" tag
)

# Generate response
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    temperature=0.6,
    top_p=0.95,
    top_k=30
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode and print response
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

### Usage Guidelines

To achieve optimal performance with Theta-35, we recommend the following settings:

1. **Enforce Thoughtful Output**: Ensure the model's output starts with "\<reasoning\>\n" to promote step-by-step thinking, which improves output quality. If you use `apply_chat_template` with `add_generation_prompt=True`, this is handled automatically.

2. **Sampling Parameters**:
   - Use `temperature=0.6` and `top_p=0.95` instead of greedy decoding to avoid repetition.
   - Use `top_k` between 20 and 40 to filter out rare tokens while maintaining diversity.

3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking.
   - **Math Problems**: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
   - **Multiple-Choice Questions**: Add "Please show your choice in the `answer` field with only the choice letter, e.g.,`\"answer\": \"C\"`." to the prompt.

4. **Handle Long Inputs**: For inputs exceeding 32,768 tokens, enable sliding window attention to improve the model's ability to process long sequences efficiently.

For supported frameworks, you can add the following to `config.json` to enable extended context handling:
```json
{
  ...,
  "use_sliding_window": true,
  "sliding_window": 32768
}
```
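The same two keys can also be patched programmatically before writing the file back to a local checkpoint directory. The sketch below operates on the parsed contents of `config.json`; the helper name `enable_sliding_window` is ours:

```python
import json

def enable_sliding_window(config: dict, window: int = 32768) -> dict:
    # Return a copy of the config.json contents with sliding window attention enabled
    patched = dict(config)
    patched["use_sliding_window"] = True
    patched["sliding_window"] = window
    return patched

# Example: patch a minimal config parsed from JSON
cfg = json.loads('{"model_type": "theta", "max_position_embeddings": 131072}')
print(enable_sliding_window(cfg))
```

After patching, serialize the result with `json.dumps(..., indent=2)` and overwrite the checkpoint's `config.json`.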

## Evaluation & Performance

Theta-35 demonstrates exceptional performance across various reasoning tasks, including:

- Mathematical reasoning
- Logical deduction
- Multi-step problem solving
- Code understanding and generation
- Scientific reasoning

Detailed evaluation results are reported in our [documentation](https://www.svector.co.in/models/theta-35).

## Citation

If you find our work helpful, please cite us:

```bibtex
@misc{theta35,
    title  = {Theta-35: Advanced Reasoning in Large Language Models},
    url    = {https://www.svector.co.in/models/theta-35},
    author = {SVECTOR Team},
    month  = {March},
    year   = {2025}
}

@article{theta,
    title  = {Theta Technical Report},
    author = {SVECTOR Research Team},
    year   = {2025}
}
```