---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model: SVECTOR/Theta-35
tags:
- chat
- reasoning
library_name: transformers
---
# Theta-35
## Introduction
Theta-35 is the advanced reasoning model in the Theta series by SVECTOR. Unlike conventional instruction-tuned models, Theta-35 specializes in complex thinking and reasoning, and achieves significantly stronger performance on downstream tasks, particularly challenging problems that require deep logical analysis and multi-step reasoning.
<p align="center">
<img width="100%" src="figures/benchmark.png">
</p>
**This repo contains the Theta-35 model**, which has the following features:
- Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
- Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Number of Parameters: 35B
- Number of Parameters (Non-Embedding): 33.5B
- Number of Layers: 64
- Number of Attention Heads (GQA): 40 for Q and 8 for KV
- Context Length: Full 131,072 tokens
- Sliding Window: 32,768 tokens
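The GQA layout above (40 query heads but only 8 KV heads) directly shrinks the KV cache, since only key/value heads are cached. A back-of-envelope sketch of the cache cost follows; note that the head dimension is *not* stated in this card, so the value of 128 below is an assumption for illustration only:

```python
# Back-of-envelope KV-cache cost per token from the specs above.
NUM_LAYERS = 64
NUM_KV_HEADS = 8      # GQA: only the 8 KV heads are cached, not the 40 Q heads
HEAD_DIM = 128        # assumption -- not stated in the model card
BYTES_PER_VALUE = 2   # fp16/bf16

def kv_cache_bytes_per_token(layers, kv_heads, head_dim, dtype_bytes):
    # 2x for the separate key and value tensors per layer
    return 2 * layers * kv_heads * head_dim * dtype_bytes

per_token = kv_cache_bytes_per_token(
    NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE
)
full_context = per_token * 131_072  # full 131,072-token context

print(f"KV cache: {per_token} bytes/token, "
      f"{full_context / 2**30:.1f} GiB at full context")
# -> KV cache: 262144 bytes/token, 32.0 GiB at full context
```

Under these assumptions, caching all 40 heads instead (standard multi-head attention) would cost 5x as much, which is the practical payoff of GQA at long context lengths.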
**Note:** For the best experience, please review the [usage guidelines](#usage-guidelines) before deploying Theta models.
For more details, please refer to our [documentation](https://www.svector.co.in/models/theta-35).
## Requirements
Theta-35 requires a recent version of Hugging Face `transformers`; we recommend version 4.43.1 or newer.
With older versions of `transformers`, you may encounter the following error:
```
KeyError: 'theta'
```
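To fail fast with a clearer message than the `KeyError` above, you can gate on the installed version before loading the model. This is a minimal sketch; the parsing helper below is ours, not part of `transformers`:

```python
# Minimal version gate: refuse to proceed on a transformers release
# older than 4.43.1, which would raise KeyError: 'theta'.
def parse_version(v):
    """Turn '4.43.1' (ignoring suffixes like '.dev0') into (4, 43, 1)."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

MINIMUM = (4, 43, 1)

def check_transformers(installed_version):
    if parse_version(installed_version) < MINIMUM:
        raise RuntimeError(
            f"transformers {installed_version} is too old for Theta-35; "
            "upgrade with: pip install -U 'transformers>=4.43.1'"
        )

# Usage: check_transformers(transformers.__version__)
```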
## Quickstart
Here is a code snippet showing how to load the tokenizer and model, and how to generate content:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer directly
model_name = "SVECTOR-CORPORATION/Theta-35"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare prompt
prompt = "How many planets are in our solar system? Explain your reasoning."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True  # Automatically adds the "<reasoning>" tag
)

# Generate response
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,  # Required for temperature/top_p/top_k to take effect
    temperature=0.6,
    top_p=0.95,
    top_k=30
)
# Strip the prompt tokens so only newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode and print response
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
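Since generation is primed with a `<reasoning>` block, you may want to separate the chain of thought from the final answer. The card only documents the opening tag, so the `</reasoning>` closing tag below is an assumption; treat this as an illustrative post-processing sketch:

```python
# Split a response into (reasoning, final_answer). The "</reasoning>"
# closing tag is assumed -- the card documents only the opening tag.
def split_reasoning(response, close_tag="</reasoning>"):
    head, sep, tail = response.partition(close_tag)
    if sep:  # closing tag found: head is the reasoning, tail the answer
        return head.strip(), tail.strip()
    return response.strip(), ""  # no tag: treat everything as reasoning

demo = "Counting the bodies orbiting the Sun...</reasoning> There are eight planets."
reasoning, answer = split_reasoning(demo)
print(answer)  # -> There are eight planets.
```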
### Usage Guidelines
To achieve optimal performance with Theta-35, we recommend the following settings:
1. **Enforce Thoughtful Output**: Ensure the model's output starts with "\<reasoning\>\n" to promote step-by-step thinking, which improves output quality. If you use `apply_chat_template` with `add_generation_prompt=True`, this is handled automatically.
2. **Sampling Parameters**:
- Use Temperature=0.6 and TopP=0.95 instead of Greedy decoding to avoid repetitions.
- Use TopK between 20 and 40 to filter out rare token occurrences while maintaining diversity.
3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking.
- **Math Problems**: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
- **Multiple-Choice Questions**: Add "Please show your choice in the `answer` field with only the choice letter, e.g.,`\"answer\": \"C\"`." to the prompt.
4. **Handle Long Inputs**: For inputs exceeding 32,768 tokens, enable sliding window attention to improve the model's ability to process long sequences efficiently.
For supported frameworks, you could add the following to `config.json` to enable extended context handling:
```json
{
...,
"use_sliding_window": true,
"sliding_window": 32768
}
```
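If you work from a locally downloaded checkpoint, the `config.json` edit above can be applied programmatically. This is a small convenience sketch; the checkpoint path in the usage comment is a placeholder for wherever your snapshot lives:

```python
import json
from pathlib import Path

# Flip on sliding-window attention in a locally downloaded checkpoint
# by rewriting its config.json in place.
def enable_sliding_window(config_path, window=32768):
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["use_sliding_window"] = True
    config["sliding_window"] = window
    path.write_text(json.dumps(config, indent=2))
    return config

# Usage (placeholder path):
# enable_sliding_window("./Theta-35/config.json")
```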
## Evaluation & Performance
Theta-35 demonstrates exceptional performance across various reasoning tasks, including:
- Mathematical reasoning
- Logical deduction
- Multi-step problem solving
- Code understanding and generation
- Scientific reasoning
Detailed evaluation results are reported in our [documentation](https://www.svector.co.in/models/theta-35).
## Citation
If you find our work helpful, please cite it as follows.
```bibtex
@misc{theta35,
title = {Theta-35: Advanced Reasoning in Large Language Models},
url = {https://www.svector.co.in/models/theta-35},
author = {SVECTOR Team},
month = {March},
year = {2025}
}
@article{theta,
title={Theta Technical Report},
author={SVECTOR Research Team},
year={2025}
}
```