---
license: apache-2.0
datasets:
- RussianNLP/rucola
language:
- ru
base_model:
- ai-forever/ruRoberta-large
pipeline_tag: text-classification
---

**ruRoberta-large-rucola**

This model is a fine-tuned version of ai-forever/ruRoberta-large on the RuCoLa (Russian Corpus of Linguistic Acceptability) dataset. It predicts whether a given Russian sentence is linguistically acceptable or contains errors.

**Key Features**

- Task: binary classification (acceptable vs. unacceptable)
- Training data: RuCoLa (~10k labeled sentences)
- Max sequence length: 512 tokens
- Fine-tuning framework: PyTorch + Hugging Face Transformers

**Hyperparameters**

| Parameter | Value |
| --- | --- |
| Batch size | 32 |
| Learning rate | 1e-5 |
| Epochs | 64 |
| Warmup steps | 100 |
| Optimizer | adamw_bnb_8bit |
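**Usage**

A minimal inference sketch. The Hub repo id `ruRoberta-large-rucola` and the label mapping (1 = acceptable) below are illustrative assumptions, not confirmed by this card; check the model's `config.json` for the actual `id2label`.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "ruRoberta-large-rucola"  # hypothetical repo path -- replace with the real one

ID2LABEL = {0: "unacceptable", 1: "acceptable"}  # assumed label order


def pick_label(logits: torch.Tensor) -> int:
    # Binary decision: index of the higher logit.
    return int(logits.argmax(dim=-1).item())


def classify(sentence: str) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    # Mirror the training setup: truncate to the 512-token maximum.
    inputs = tokenizer(sentence, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return ID2LABEL[pick_label(logits)]


# classify("Мама мыла раму.")
```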
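**Reproducing the fine-tuning configuration**

The hyperparameters in the table map directly onto `transformers.TrainingArguments`; the sketch below is a config fragment under that assumption (`output_dir` and evaluation settings are placeholders, not from this card).

```python
from transformers import TrainingArguments

# Config fragment mirroring the hyperparameter table above.
args = TrainingArguments(
    output_dir="ruRoberta-large-rucola",  # placeholder
    per_device_train_batch_size=32,
    learning_rate=1e-5,
    num_train_epochs=64,
    warmup_steps=100,
    optim="adamw_bnb_8bit",  # 8-bit AdamW; requires the bitsandbytes package
)
```

`adamw_bnb_8bit` keeps optimizer state in 8-bit precision, which substantially reduces GPU memory for a large model like ruRoberta-large.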