---
license: mit
language:
- ru
- en
base_model:
- cointegrated/rubert-tiny
pipeline_tag: text-classification
---

# bad-good-text-classifier-ru-en

## Description

A simple neural network that classifies words as positive ("good") or negative ("bad") in Russian and English. It is suited to filtering chats, comments, reviews, and other texts for toxic or negative content. The model is not perfect, however, and can make mistakes.

## Features

- Bilingual: Russian (the main focus) and English
- Fast and accurate classification
- Easy integration into Python projects
- Trained on a custom dataset with "good" and "bad" labels

## Installation

Make sure you have Python 3.7+, then install the Hugging Face `transformers` package and PyTorch:

```bash
pip install transformers torch
```

## Usage

Example of classifying a text word by word:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "akaruineko/bad-good-classifier-ru_en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout


def classify_word(word):
    """Return the "good" and "bad" probabilities for a single word."""
    inputs = tokenizer(word, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():  # no gradients are needed for inference
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    # Index 0 is LABEL_0 (bad), index 1 is LABEL_1 (good)
    return {"good": probs[0][1].item(), "bad": probs[0][0].item()}


def classify_text_by_words(text):
    """Split the text on whitespace and classify each word separately."""
    # A word that occurs more than once keeps a single entry in the result
    return {w: classify_word(w) for w in text.split()}


if __name__ == "__main__":
    sample_text = "Example text for classification"
    results = classify_text_by_words(sample_text)
    for word, scores in results.items():
        print(f"Word: '{word}' - Good: {scores['good']:.4f}, Bad: {scores['bad']:.4f}")
```

Label mapping: `LABEL_0` = bad, `LABEL_1` = good.

## Training Data

The model is trained on two datasets, one labeled "good" and one labeled "bad". The data was prepared manually and includes texts in Russian and English.

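
## Filtering by Threshold

The per-word scores returned by `classify_text_by_words` in the Usage section can be reduced to a list of flagged words. The following is a minimal sketch, not part of the model's API: `flag_bad_words` and the default `threshold` of 0.5 are hypothetical choices made for illustration, and the scores in the example are hand-written placeholders rather than real model output.

```python
def flag_bad_words(scores_by_word, threshold=0.5):
    """Return the words whose "bad" probability is at or above the threshold.

    scores_by_word maps each word to a dict with "good" and "bad" keys,
    the same shape that classify_text_by_words returns.
    """
    return [
        word
        for word, scores in scores_by_word.items()
        if scores["bad"] >= threshold
    ]


# Placeholder scores for illustration only (not produced by the model)
example_scores = {
    "hello": {"good": 0.97, "bad": 0.03},
    "awful": {"good": 0.08, "bad": 0.92},
}

print(flag_bad_words(example_scores))  # prints ['awful']
```

In a real filter, pass the result of `classify_text_by_words(text)` instead of the placeholder dictionary, and tune `threshold` on your own data.
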
## Training Results

- Epochs: 12
- Minimum loss: ~0.03
- High accuracy on the test dataset

## License

MIT License.

## Contact

Questions or suggestions? Write to: [workhf@akaruineko.space](mailto:workhf@akaruineko.space)

---

Thanks for using this classifier! Feel free to share feedback and improvement ideas.