---
license: mit
language:
- ru
- en
base_model:
- cointegrated/rubert-tiny
pipeline_tag: text-classification
---
# bad-good-text-classifier-ru-en

## Description  
This is a small, simple neural network based on `cointegrated/rubert-tiny` that classifies words as positive ("good") or negative ("bad") in both Russian and English.
It is suitable for filtering chats, comments, reviews, and other texts to detect toxicity or negative content. The model is not perfect, however, and can misclassify some inputs.

## Features  
- Bilingual model: Russian and English, with a focus on Russian
- Fast and accurate classification  
- Easy integration into Python projects
- Trained on a custom dataset with "good" and "bad" labels  

## Installation  
Make sure you have Python 3.7+ installed, then install the Hugging Face `transformers` package together with PyTorch:
```bash
pip install transformers torch
```

## Usage

The example below splits a text on whitespace and classifies each word separately:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "akaruineko/bad-good-classifier-ru_en"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

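def classify_word(word):
    """Return the "good" and "bad" probabilities for a single word."""
    inputs = tokenizer(word, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    # Index 0 is "bad" (LABEL_0), index 1 is "good" (LABEL_1)
    return {"good": probs[0][1].item(), "bad": probs[0][0].item()}

def classify_text_by_words(text):
    """Score every whitespace-separated word; repeated words share one entry."""
    words = text.split()
    results = {}
    for w in words:
        results[w] = classify_word(w)
    return results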

if __name__ == "__main__":
    sample_text = "Example text for classification"
    results = classify_text_by_words(sample_text)
    for word, scores in results.items():
        print(f"Word: '{word}' - Good: {scores['good']:.4f}, Bad: {scores['bad']:.4f}")
```
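
The per-word scores can be turned into a simple filter decision. The sketch below is one possible approach and not part of the model itself: it collects the words whose "bad" probability reaches a cutoff. The helper name `flag_text` and the `0.5` default threshold are assumptions for illustration, and the input has the same shape as the dictionary returned by `classify_text_by_words` above.

```python
def flag_text(word_scores, threshold=0.5):
    """Return the words whose "bad" probability is at or above `threshold`.

    `word_scores` maps each word to {"good": float, "bad": float}, the same
    shape that classify_text_by_words returns. The 0.5 default is an
    illustrative assumption, not a value recommended by the model author.
    """
    return [w for w, s in word_scores.items() if s["bad"] >= threshold]


# Hand-written placeholder scores standing in for real model output
example_scores = {
    "word_a": {"good": 0.98, "bad": 0.02},
    "word_b": {"good": 0.10, "bad": 0.90},
}
flagged = flag_text(example_scores)
print("Flagged words:", flagged)
```

A text can then be treated as acceptable when the returned list is empty. The right threshold depends on how strict the filter should be and is best tuned on your own data.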

Label mapping: `LABEL_0` = bad, `LABEL_1` = good.
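
The same mapping matters when the model is called through the `transformers` `pipeline` helper, which returns label names instead of class indices. Below is a minimal sketch, assuming the model config keeps the `LABEL_0`/`LABEL_1` names listed above. `LABEL_NAMES` and `readable_label` are hypothetical helper names, and the pipeline call is shown in a comment so the snippet runs without downloading the model.

```python
# Mapping taken from this model card: LABEL_0 = bad, LABEL_1 = good
LABEL_NAMES = {"LABEL_0": "bad", "LABEL_1": "good"}


def readable_label(prediction):
    """Convert one pipeline prediction, e.g. {"label": "LABEL_1", "score": 0.97},
    into a (name, score) tuple. Unknown labels are passed through unchanged."""
    label = prediction["label"]
    return LABEL_NAMES.get(label, label), prediction["score"]


# With the real model (downloads the weights on first use):
#   from transformers import pipeline
#   clf = pipeline("text-classification", model="akaruineko/bad-good-classifier-ru_en")
#   print([readable_label(p) for p in clf(["Example", "text"])])

# Hand-written prediction in the pipeline's output format, for illustration only
print(readable_label({"label": "LABEL_1", "score": 0.97}))
```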

## Training Data

The model is trained on two datasets labeled "good" and "bad".
The data is manually prepared and includes texts in Russian and English.

## Training Results

* Epochs: 12
* Minimum loss: ~0.03
* High accuracy on the test dataset

## License

MIT License.

## Contact

Questions or suggestions? Write to: [workhf@akaruineko.space](mailto:workhf@akaruineko.space)

---

Thanks for using this classifier!
Feel free to share feedback and improvement ideas.