🧠 MindPadi: Hybrid Classifier Suite

This repository contains auxiliary models for intent and emotion classification used in the MindPadi mental health assistant. The suite combines rule-based, classical machine learning, and deep learning classifiers that detect emotional states, user intent, and conversational cues.

📦 Files

File	Description
`intent_clf.joblib`	scikit-learn pipeline for intent classification (TF-IDF)
`intent_sentence_classifier.pkl`	Sentence-level intent classifier (pickle)
`lstm_tfidf.h5`	LSTM model trained on TF-IDF vectors
`lstm_bert.h5`	LSTM model trained on BERT embeddings
`tfidf_vectorizer.pkl`	TF-IDF vectorizer for preprocessing text
`tfidf_embeddings.pkl`	Cached TF-IDF embeddings for faster lookup
`bert_embeddings.npy`	Precomputed BERT embeddings used in training/testing
`lstm_accuracy_tfidf.png`	Evaluation plot (TF-IDF model)
`lstm_accuracy_bert.png`	Evaluation plot (BERT model)
`model_configs/`	JSON configs for training and architecture

🎯 Tasks Supported

Intent Classification: Understand what the user is trying to communicate.
Emotion Detection: Identify the emotional tone (e.g., sad, angry).
Embedding Generation: Provide vectors for similarity search or hybrid routing.
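
As a rough illustration of the vector-similarity use case, the sketch below ranks stored embeddings by cosine similarity to a query vector. The function name `top_k_similar` and the toy 3-dimensional vectors are assumptions for illustration; in practice the matrix might be loaded from `bert_embeddings.npy`, and this is not necessarily how the MindPadi code performs the search.

```python
import numpy as np

def top_k_similar(query, matrix, k=3):
    """Return (indices, scores) of the k rows of `matrix` closest to `query`."""
    query = np.asarray(query, dtype=float)
    matrix = np.asarray(matrix, dtype=float)
    # Normalize so that dot products equal cosine similarities.
    # A small epsilon guards against division by zero for all-zero vectors.
    q = query / (np.linalg.norm(query) + 1e-12)
    m = matrix / (np.linalg.norm(matrix, axis=1, keepdims=True) + 1e-12)
    scores = m @ q
    # argsort is ascending, so negate the scores to rank the highest first.
    order = np.argsort(-scores)[:k]
    return order.tolist(), scores[order].tolist()

# Toy stand-in for precomputed embeddings (hypothetical values).
stored = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.7, 0.7, 0.0]])
indices, scores = top_k_similar([1.0, 0.1, 0.0], stored, k=2)
print(indices)
```

Normalizing both sides first keeps the ranking independent of vector length, which matters when embeddings come from different text lengths.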

🔬 Model Overview

Model Type	Framework	Notes
LSTM + TF-IDF	Keras	Traditional pipeline with good generalization
LSTM + BERT	Keras	Handles contextual sentence meanings
TF-IDF + SVM	scikit-learn	Lightweight and interpretable intent routing
Sentence Classifier	scikit-learn	Fast rule-based or decision-tree model for sentence-level labels
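
One way the rule-based and learned classifiers in this table could be combined is a simple two-stage router: keyword rules answer first, and a model handles everything else. The sketch below is an assumption about such a design, not the MindPadi implementation; the keyword table, the intent names, and the `fallback` callable are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical keyword rules; the real rule set is not described in this README.
KEYWORD_RULES: Dict[str, str] = {
    "hello": "greeting",
    "goodbye": "farewell",
}

def route_intent(text: str, fallback: Callable[[str], str]) -> Tuple[str, str]:
    """Return (intent, source), trying keyword rules before the model."""
    tokens = text.lower().split()
    for keyword, intent in KEYWORD_RULES.items():
        if keyword in tokens:
            return intent, "rule"
    # No rule matched, so defer to the learned classifier.
    return fallback(text), "model"

# Stand-in for a trained model, e.g. lambda t: clf.predict([t])[0]
def dummy_model(text: str) -> str:
    return "anxiety_support"

print(route_intent("Hello there", dummy_model))
print(route_intent("I feel really anxious today", dummy_model))
```

Returning the source alongside the intent makes it easy to log how often the rules fire versus the model.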

🛠️ Usage Example

Intent Prediction (Joblib)

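from joblib import load

# The pipeline bundles TF-IDF vectorization, so it accepts raw strings.
clf = load("intent_clf.joblib")
text = ["I feel really anxious today"]  # predict() expects a list of texts
pred = clf.predict(text)

print("Intent:", pred[0])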

LSTM Emotion Prediction

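from tensorflow.keras.models import load_model
import numpy as np

model = load_model("lstm_bert.h5")
# Precomputed embeddings; assumed to be aligned with the test set.
embeddings = np.load("bert_embeddings.npy")
output = model.predict(embeddings)  # per-class scores, one row per sample

# argmax over the class axis gives one predicted class index per sample
print("Predicted emotion class:", output.argmax(axis=1))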
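
The argmax output above is an array of integer class indices. A small helper like the one below could map those indices back to readable labels together with a confidence score. The `EMOTION_LABELS` list is hypothetical; the actual label order depends on how the model was trained and is not documented here.

```python
import numpy as np

# Hypothetical label order; replace with the order used at training time.
EMOTION_LABELS = ["happy", "sad", "angry", "anxious"]

def decode_predictions(probs, labels=EMOTION_LABELS):
    """Map each row of class scores to a (label, confidence) pair."""
    probs = np.asarray(probs, dtype=float)
    if probs.ndim != 2 or probs.shape[1] != len(labels):
        raise ValueError("expected shape (n_samples, %d)" % len(labels))
    best = probs.argmax(axis=1)
    return [(labels[i], float(probs[row, i])) for row, i in enumerate(best)]

# Toy scores standing in for the output of model.predict(...).
sample = [[0.1, 0.7, 0.1, 0.1],
          [0.05, 0.05, 0.1, 0.8]]
decoded = decode_predictions(sample)
print(decoded)
```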

📊 Evaluation

Model	Accuracy	Dataset Size	Notes
`lstm_bert.h5`	~88%	10,000+	Best for nuanced emotional input
`lstm_tfidf.h5`	~83%	10,000+	Lighter, faster
`intent_clf.joblib`	~90%	8,000+	Works well with short queries
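
For reference, accuracy figures like those above are the fraction of predictions that match the true labels. The sketch below computes overall and per-class accuracy on toy data; the labels and values are made up for illustration and are not drawn from the MindPadi evaluation sets.

```python
from collections import defaultdict

def accuracy_report(y_true, y_pred):
    """Return overall accuracy and a dict of per-class accuracy."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    overall = sum(correct.values()) / len(y_true)
    # Per-class accuracy: share of each true class that was predicted correctly.
    per_class = {label: correct[label] / total[label] for label in total}
    return overall, per_class

y_true = ["sad", "sad", "angry", "happy"]
y_pred = ["sad", "angry", "angry", "happy"]
overall, per_class = accuracy_report(y_true, y_pred)
print(overall, per_class)
```

Per-class numbers are worth checking alongside the headline figure, since a single overall accuracy can hide weak performance on rare emotions.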

Evaluation visualizations: `lstm_accuracy_tfidf.png` (TF-IDF model) and `lstm_accuracy_bert.png` (BERT model).

⚠️ Limitations

English only
May misclassify ambiguous or sarcastic phrases
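LSTM models require inputs from the matching feature source: `lstm_tfidf.h5` expects TF-IDF vectors (see `tfidf_vectorizer.pkl`), and `lstm_bert.h5` expects BERT embeddings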
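
Because of the last point, a mismatch between the features and the model often surfaces as a shape error at prediction time. A lightweight pre-check, sketched below, can give a clearer message. The function name and the example width of 768 are assumptions; with a Keras model, the expected width could typically be read from `model.input_shape[-1]`.

```python
import numpy as np

def check_feature_width(features, expected_width):
    """Raise a descriptive error if the last axis does not match the model."""
    features = np.asarray(features)
    actual = features.shape[-1]
    if actual != expected_width:
        raise ValueError(
            f"feature width {actual} does not match model input {expected_width}; "
            "was a different vectorizer or embedding model used?"
        )
    return features

# 768 is the hidden size of BERT-base, used here only as an example.
batch = np.zeros((2, 768))
checked = check_feature_width(batch, expected_width=768)
print(checked.shape)
```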

🧩 Integration

These models are invoked in:

app/chatbot/intent_classifier.py
app/chatbot/emotion.py
app/utils/embedding_search.py

🧠 Intended Use

Mental health journaling feedback
Chatbot-based emotion understanding
Offline fallback for heavy transformer models

📄 License

MIT License – free for commercial and research use.

Last updated: May 6, 2025