MISO-BR Misogyny Classifier

This model classifies Brazilian Portuguese text as misogynistic or non-misogynistic. It was trained on the MISO-BR dataset.

Model Details

  • Model Type: TF-IDF + RandomForest classifier
  • Language: Portuguese (Brazil)
  • Task: Binary classification (misogynistic vs non-misogynistic content)
  • Framework: scikit-learn
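
As a rough illustration of what a TF-IDF + RandomForest classifier can look like in scikit-learn, the sketch below chains `TfidfVectorizer` and `RandomForestClassifier` in a `Pipeline`. The helper name `build_pipeline`, the hyperparameters, and the toy texts and labels are all assumptions made for this example; the released model's actual configuration is not documented in this card.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline


def build_pipeline(random_state=0):
    """Hypothetical helper: TF-IDF features feeding a random forest."""
    return Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
    ])


# Toy, already-preprocessed placeholder strings (not taken from MISO-BR).
train_texts = ["exemplo positivo um", "exemplo negativo dois",
               "exemplo positivo tres", "exemplo negativo quatro"]
train_labels = [1, 0, 1, 0]

toy_model = build_pipeline().fit(train_texts, train_labels)

# A fitted pipeline accepts strings directly and returns one probability per class.
toy_proba = toy_model.predict_proba(["exemplo positivo"])[0]
```

Bundling the vectorizer and the classifier in one object means a single `joblib` file can carry both, which would be consistent with the way the model is loaded and called in the Usage section.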

Performance

The model was evaluated on a test set and achieved:

  • F1 Score (macro): 0.6758
  • Accuracy: 0.6778
  • AUC: 0.7314
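
For readers unfamiliar with these metrics, here is a dependency-free sketch of how accuracy, macro F1, and AUC are defined for a binary task. The labels and scores are invented toy values, not MISO-BR results, and the function names are illustrative; in practice `sklearn.metrics` provides `accuracy_score`, `f1_score(average="macro")`, and `roc_auc_score` for the same purpose.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: true labels and predicted probabilities for class 1.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred)
toy_auc = roc_auc(y_true, scores)
```

Macro averaging weights both classes equally regardless of how many examples each has, while AUC is computed from the predicted probabilities rather than the thresholded labels.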

Requirements

This project requires the following libraries:

  • scikit-learn==1.7.0
  • spacy==3.7.2
  • joblib>=1.3.0
  • pt_core_news_sm (spaCy's small Portuguese pipeline, installable with python -m spacy download pt_core_news_sm)

Install the dependencies using the requirements.txt file:

pip install -r requirements.txt

Usage

from huggingface_hub import hf_hub_download
import joblib
import spacy

# Download the model from Hugging Face Hub
model_path = hf_hub_download(repo_id="fabiopassos/miso-br-classifier", 
                             filename="models/miso_br_rf_classifier.joblib")

# Load the model
model = joblib.load(model_path)

# Load spaCy for Portuguese
nlp = spacy.load("pt_core_news_sm")

# Preprocess function
def preprocess_text(text):
    doc = nlp(text)
    tokens = [token.lemma_.lower() for token in doc 
              if not token.is_stop and not token.is_punct and token.is_alpha]
    return " ".join(tokens)

# Example text (placeholder; the model expects Brazilian Portuguese input)
text = "Seu texto para classificar aqui"
processed_text = preprocess_text(text)

# Predict the label and the probability of class 1 (misogynistic)
prediction = model.predict([processed_text])[0]
probability = model.predict_proba([processed_text])[0][1]

print(f"Text: {text}")
print(f"Misogynistic: {'Yes' if prediction == 1 else 'No'}")
print(f"Probability: {probability:.4f}")
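
To score several texts at once, the same calls can be wrapped in a small helper. This is only a sketch: `classify_batch`, its `threshold` parameter, and `_StubModel` are hypothetical names introduced here, and the stub stands in for the real classifier so that the example runs on its own. With the real model, one would pass the loaded `model` and the `preprocess_text` function from the example above.

```python
def classify_batch(model, texts, preprocess, threshold=0.5):
    """Hypothetical helper: return (text, probability, flagged) for each input text."""
    processed = [preprocess(t) for t in texts]
    # Column 1 is taken as the misogynistic class, mirroring the single-text example.
    probabilities = [row[1] for row in model.predict_proba(processed)]
    return [(t, float(p), bool(p >= threshold)) for t, p in zip(texts, probabilities)]


class _StubModel:
    """Stand-in with a fixed predict_proba so the sketch runs without the real model."""

    def predict_proba(self, texts):
        return [[0.8, 0.2] if len(t) < 10 else [0.3, 0.7] for t in texts]


# With the real model: classify_batch(model, texts, preprocess_text)
results = classify_batch(_StubModel(), ["curto", "um texto mais longo"], str.lower)
```

An explicit threshold makes it possible to trade precision against recall instead of relying on the default decision made by `predict`.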