---
library_name: transformers
tags:
- reasoning
license: apache-2.0
datasets:
- attn-signs/gromov-0
language:
- ru
base_model:
- yandex/YandexGPT-5-Lite-8B-pretrain
---

# GPT Reasoner (Base model)

- [EN] Reasoning model adapted for Russian text generation. **Based on YandexGPT-pretrain**
- [RU] Модель рассуждений, адаптированная для генерации русскоязычного текста. **Построена на YandexGPT-pretrain**

## Model Details / Детализация модели

- [EN] **Cold-start SFT version** that invokes general reasoning behaviour via a specific system prompt. This model **IS ONLY USED** as a starting point for further GRPO optimization; in this iteration it cannot yet generate coherent Russian text.
- [RU] **Версия cold-start SFT обучения** для возможностей размышления и глубокого понимания запроса. Эта модель **ИСПОЛЬЗУЕТСЯ ТОЛЬКО** для дальнейших стадий обучения с GRPO. Модель не может генерировать когерентный текст русского языка на этой итерации.

### Model Description / Описание модели

- **Developed by:** Reisen Raumberg (Attention Signs team)
- **Language(s) (NLP):** RU/EN
- **SFT from model:** YandexGPT-5-Lite-8B-pretrain

Training used HuggingFace Accelerate.
**GPU hours:** ~3 h on an NVIDIA A100

Для обучения использовался HuggingFace Accelerate.
**GPU часы:** ~3 часа NVIDIA A100

### Training Framework

**GPTR was trained using the MyLLM framework (by Attention Signs):**

--==[MyLLM](https://github.com/Raumberg/myllm)==--

### Model configuration (MyLLM Framework)

Full SFT fine-tuning:

```toml
[model]
model_name_or_path = "yandex/YandexGPT-5-Lite-8B-pretrain"

[datasets]
dataset = "attn-signs/gromov-0"
conversation_field = "conversation"
generate_eval_examples = false
evaluation_strategy = "steps"
eval_steps = 100
dataloader_num_workers = 2
remove_unused_columns = true
test_size = 0.05

[run]
save_strategy = "steps"
save_steps = 300
save_total_limit = 3
run_name = "sft-gptr-8-run2"
report_to = "wandb"
logging_first_step = true
logging_steps = 1
output_dir = "models/attn-signs-gptr-8-run2"
project_name = "sft-gptr"

[training]
train_only_on_completions = true
per_device_train_batch_size = 1
per_device_eval_batch_size = 1
num_train_epochs = 3
learning_rate = 0.000009
max_seq_length = 8192
gradient_accumulation_steps = 8
gradient_checkpointing = true
warmup_steps = 10
bf16 = true
seed = 42
use_peft = false

[fusion]
attn_implementation = "flash_attention_2"

[tokenizer]
assistant_message_template = "assistant\n"
eos_token = ""
pad_token = ""
chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'' + message['role'] + '\n' + message['content'] + '' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ 'assistant\n' }}{% endif %}"
force_chat_template = true
added_special_tokens = ["", ""]
system_prompt = """
[MODE: Reflection]
"""
```

### Using the model / Как запустить?

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = 'attn-signs/GPTR-8-base'

model = AutoModelForCausalLM.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)

user_prompt = '''
У уравнений x**2 + 2019ax + b = 0 и x**2 + 2019bx + a = 0 есть один общий корень.
Чему может быть равен этот корень, если известно, что a != b?
'''
# The model is trained to reason under this exact system prompt.
system_prompt = "[MODE: Reflection]"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)
# Drop the prompt tokens, keeping only the newly generated completion.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
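The snippet above loads the checkpoint in full precision on a single device. Since the model was trained in bf16 with FlashAttention-2 (see the `[training]` and `[fusion]` sections of the config), a lower-memory loading path is usually possible. The sketch below is not part of the original card: `attn_implementation="flash_attention_2"` assumes the `flash-attn` package and a compatible GPU, and `device_map="auto"` assumes `accelerate` is installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = 'attn-signs/GPTR-8-base'

# Load weights in bfloat16 (the training dtype) and let HF place them automatically.
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # optional; requires flash-attn
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(repo)
```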