---
license: mit
datasets:
- mizinovmv/ru_example_DeepSeek-R1-Distill-Qwen-32B
- lightblue/reasoning-multilingual-R1-Llama-70B-train
- Pinkstack/thinking-multilingual-30-23-small-690
- Vikhrmodels/reasoning-0.01-ru
- Vikhrmodels/russian_math
- kristaller486/Nebo-T1-Russian
language:
- ru
- en
base_model:
- yandex/YandexGPT-5-Lite-8B-pretrain
pipeline_tag: text-generation
library_name: peft
tags:
- r1
- reasoning
- think
- thinking
- reflection
- russian
- general
---

# Russian r1 / YandexGPT-5-Lite-8B-pretrain

A LoRA adapter for the [YandexGPT-5-Lite-8B-pretrain](https://huggingface.co/yandex/YandexGPT-5-Lite-8B-pretrain) model, trained on a mix of datasets that implement the r1 (reasoning) approach. The resulting model can imitate step-by-step logical reasoning in Russian, similar to `r1` by `DeepSeek` or `o1` by `OpenAI`.

W&B report: https://api.wandb.ai/links/evilfreelancer/zj6s02v4

Training was performed with the [impruver](https://github.com/EvilFreelancer/impruver) toolkit using the [YandexGPT/8B_lora_r1](https://github.com/EvilFreelancer/impruver/blob/main/recipes/configs/YandexGPT/8B_lora_r1.yaml) configuration.

Altogether it took about 18 hours on an RTX 4090 and required 23.5 GB of VRAM. The effective context length is 1400 tokens, since a longer context did not fit into 24 GB of VRAM.

```yaml
output_dir: ./models/YandexGPT-5-Lite_7B_lora_thinking
train_path: ./train.YandexGPT-5-Lite_7B_lora_thinking.jsonl
val_path: ./val.YandexGPT-5-Lite_7B_lora_thinking.jsonl

datasets:
  - name: mizinovmv/ru_example_DeepSeek-R1-Distill-Qwen-32B
    converter: impruver.instruction_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: ru_query
      output: response
  - name: lightblue/reasoning-multilingual-R1-Llama-70B-train
    converter: impruver.instruction_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: translated_prompt
      output: response
  - name: Pinkstack/thinking-multilingual-30-23-full-690
    converter: impruver.instruction_to_messages
    add_global_bos: false
    add_global_eos: false
  - name: Vikhrmodels/reasoning-0.01-ru
    converter: impruver.reasoning_to_messages
    add_global_bos: false
    add_global_eos: false
  - name: Vikhrmodels/russian_math
    converter: impruver.reasoning_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: task
      reasoning: solution
      output: short answer
  - name: kristaller486/Nebo-T1-Russian
    converter: impruver.reasoning_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: prompt
      reasoning: think
      output: answer

model:
  class: transformers.AutoModelForCausalLM
  name: yandex/YandexGPT-5-Lite-8B-pretrain
  load_in_4bit: true
  load_in_8bit: false
  dtype: bf16

lora:
  r: 8               # higher increases accuracy and memory
  lora_alpha: 16      # usually alpha=2*rank
  lora_dropout: 0
  bias: none
  target_modules: [ 'q_proj', 'v_proj', 'output_proj' ]
  task_type: CAUSAL_LM

tokenizer:
  class: transformers.AutoTokenizer
  name: yandex/YandexGPT-5-Lite-8B-pretrain
  max_tokens_count: 1400
  special_tokens:
    pad_token_id: 1
    pad_token:

trainer:
  eval_strategy: steps
  save_strategy: steps
  eval_steps: 1000
  save_steps: 1000
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 1
  gradient_accumulation_steps: 8
  logging_steps: 10
  learning_rate: 0.000005
  num_train_epochs: 2
  lr_scheduler_type: cosine
  warmup_steps: 100
  optim: adamw_torch_4bit
  metric_for_best_model: eval_loss
  load_best_model_at_end: true
  save_total_limit: 2
  seed: 42
  remove_unused_columns: false
  max_grad_norm: 1.0
  weight_decay: 0.01
  torch_compile: false
```
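For reference, the `model` and `lora` sections of the config above map roughly onto the following `transformers` / `peft` objects. This is only an illustrative sketch of the equivalent Hugging Face calls, not impruver's actual implementation.

```python
# Illustrative mapping of the YAML `model` / `lora` sections onto HF objects;
# impruver builds these internally, this is only a sketch of the equivalent calls.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(load_in_4bit=True)  # model.load_in_4bit: true

base = AutoModelForCausalLM.from_pretrained(
    "yandex/YandexGPT-5-Lite-8B-pretrain",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,  # dtype: bf16
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=8,                # lora.r
    lora_alpha=16,      # lora.lora_alpha (usually 2 * r)
    lora_dropout=0.0,   # lora.lora_dropout
    bias="none",        # lora.bias
    target_modules=["q_proj", "v_proj", "output_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```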
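A minimal inference sketch with `transformers` and `peft` is shown below. The adapter id is a placeholder (use this repository's id or a local path), and the plain-text prompt is only an assumption: training samples were converted to a messages format, so the exact template the adapter expects may differ.

```python
# Minimal inference sketch. Assumptions: `adapter_id` is a placeholder for this
# adapter repository, and the plain-text prompt format is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "yandex/YandexGPT-5-Lite-8B-pretrain"
adapter_id = "path/or/repo-id-of-this-adapter"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

prompt = "Сколько будет 17 * 23? Распиши рассуждение по шагам."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```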
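If you want a standalone checkpoint without a `peft` dependency at inference time, the LoRA weights can be merged into the base model. A sketch (adapter id and output path are placeholders):

```python
# Merge the LoRA deltas into the base weights and save a standalone checkpoint.
# `adapter_id` and the output directory are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "yandex/YandexGPT-5-Lite-8B-pretrain"
adapter_id = "path/or/repo-id-of-this-adapter"  # placeholder

model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)
merged = model.merge_and_unload()  # folds the LoRA weights into the base model

out_dir = "./YandexGPT-5-Lite-8B-russian-r1-merged"
merged.save_pretrained(out_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
```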