---
library_name: transformers
tags:
- reasoning
license: apache-2.0
datasets:
- attn-signs/gromov-0
language:
- ru
base_model:
- yandex/YandexGPT-5-Lite-8B-pretrain
---
# GPT Reasoner (Base model)
- [EN]
A reasoning model adapted for Russian text generation.
**Based on YandexGPT-5-Lite-8B-pretrain**
- [RU]
Модель рассуждений, адаптированная для генерации русскоязычного текста.
**Построена на YandexGPT-5-Lite-8B-pretrain**
## Model Details / Детализация модели
- [EN]
**Cold-start SFT version** intended to elicit general reasoning behaviour under a specific system prompt.
This model **IS ONLY USED** as a starting point for further GRPO optimization; in this iteration it cannot yet generate coherent Russian text.
- [RU]
**Версия cold-start SFT обучения** для возможностей размышления и глубокого понимания запроса.
Эта модель **ИСПОЛЬЗУЕТСЯ ТОЛЬКО** для дальнейших стадий обучения с GRPO.
Модель не может генерировать когерентный текст русского языка на этой итерации.
### Model Description / Описание модели
- **Developed by:** Reisen Raumberg (Attention Signs team)
- **Language(s) (NLP):** Russian, English
- **SFT from model:** yandex/YandexGPT-5-Lite-8B-pretrain
Training used the HuggingFace Accelerate library.
**GPU hours**: ~3 hours on an NVIDIA A100
Для обучения использовался HuggingFace Accelerator
**GPU часы**: ~3 часа NVIDIA A100
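
For reference, below is a minimal, illustrative sketch of how a HuggingFace Accelerate training loop is typically wired up with the same precision and accumulation settings as in the config further down. It is not the MyLLM training code; the tiny model and random data are placeholders.

```python
# Illustrative Accelerate loop (NOT the actual MyLLM training code);
# gradient_accumulation_steps, bf16 and the learning rate mirror the config below.
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

accelerator = Accelerator(gradient_accumulation_steps=8, mixed_precision="bf16")

# Toy stand-ins for the actual 8B model and the SFT dataset
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=9e-6)
dataset = TensorDataset(torch.randn(64, 16), torch.randn(64, 16))
loader = DataLoader(dataset, batch_size=1)  # per_device_train_batch_size = 1

model, optimizer, loader = accelerator.prepare(model, optimizer, loader)
loss_fn = torch.nn.MSELoss()

model.train()
for x, y in loader:
    with accelerator.accumulate(model):   # accumulate gradients over 8 micro-batches
        loss = loss_fn(model(x), y)
        accelerator.backward(loss)        # handles mixed-precision backward
        optimizer.step()
        optimizer.zero_grad()
```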
### Training Framework
**GPTR was trained using the MyLLM framework (by Attention Signs):**
--==[MyLLM](https://github.com/Raumberg/myllm)==--
### Model configuration (MyLLM Framework)
Full SFT fine-tuning:
```toml
[model]
model_name_or_path = "yandex/YandexGPT-5-Lite-8B-pretrain"

[datasets]
dataset = "attn-signs/gromov-0"
conversation_field = "conversation"
generate_eval_examples = false
evaluation_strategy = "steps"
eval_steps = 100
dataloader_num_workers = 2
remove_unused_columns = true
test_size = 0.05

[run]
save_strategy = "steps"
save_steps = 300
save_total_limit = 3
run_name = "sft-gptr-8-run2"
report_to = "wandb"
logging_first_step = true
logging_steps = 1
output_dir = "models/attn-signs-gptr-8-run2"
project_name = "sft-gptr"

[training]
train_only_on_completions = true  # compute loss only on assistant completions
per_device_train_batch_size = 1
per_device_eval_batch_size = 1
num_train_epochs = 3
learning_rate = 0.000009  # 9e-6
max_seq_length = 8192
gradient_accumulation_steps = 8
gradient_checkpointing = true
warmup_steps = 10
bf16 = true
seed = 42
use_peft = false

[fusion]
attn_implementation = "flash_attention_2"

[tokenizer]
assistant_message_template = "assistant\n"
eos_token = ""
pad_token = ""
chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'' + message['role'] + '\n' + message['content'] + '' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ 'assistant\n' }}{% endif %}"
force_chat_template = true
added_special_tokens = [
    "",
    ""
]
system_prompt = """
[MODE: Reflection]
"""
```
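With `per_device_train_batch_size = 1` and `gradient_accumulation_steps = 8`, the effective batch size is 8 sequences per optimizer step (per device). To inspect the data this config points at, here is a small sketch; it assumes the dataset exposes a default `train` split and the `conversation` column named above.

```python
# Peek at the SFT dataset referenced in [datasets] above (sketch; the split and
# column names are taken from the config and may differ on the Hub).
from datasets import load_dataset

ds = load_dataset("attn-signs/gromov-0", split="train")
splits = ds.train_test_split(test_size=0.05, seed=42)  # test_size and seed as in the config

print(splits["train"].column_names)
print(splits["train"][0]["conversation"])
```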
### Using the model / Как запустить?
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = 'attn-signs/GPTR-8-base'

# Load the model and tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

# Move the model to GPU if one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)
user_prompt = '''
У уравнений x**2 + 2019ax + b = 0 и x**2 + 2019bx + a = 0 есть один общий корень. Чему может быть равен этот корень, если известно, что a != b?
'''
# Reasoning-mode system prompt used during cold-start SFT
system_prompt = "[MODE: Reflection]"
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
**model_inputs,
max_new_tokens=4096
)
# Keep only the newly generated tokens (drop the echoed prompt)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```