SMS Spam Detection: Combined Model Card

Models

1. Multinomial Naive Bayes

  • Type: MultinomialNB
  • Library: scikit-learn
  • Description: A Naive Bayes classifier for multinomially distributed data, commonly used for text classification tasks.
  • Training Data: SMS Spam Collection dataset (train.csv), preprocessed and vectorized using CountVectorizer.
  • Features: Bag-of-words (unigrams), stopwords removed.
  • Target: label (0: ham, 1: spam)
  • Accuracy: {{ accuracy_score(tahmin, y_test) }}
  • Date Trained: {{ datetime.now().strftime("%Y-%m-%d") }}
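
The configuration listed above can be sketched roughly as follows. This is a minimal illustration, not the card's actual training script: the four toy messages stand in for train.csv, and `stop_words="english"` is an assumption, since the card does not say which stopword list was used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy stand-in for train.csv (hypothetical messages; 0 = ham, 1 = spam).
texts = [
    "free prize claim your free cash now",
    "win cash prize call now",
    "are we still meeting for lunch today",
    "see you at home tonight",
]
labels = [1, 1, 0, 0]

# Unigram bag-of-words; the English stopword list is an assumption.
vectorizer = CountVectorizer(ngram_range=(1, 1), stop_words="english")
X = vectorizer.fit_transform(texts)

nb_model = MultinomialNB()
nb_model.fit(X, labels)

# New messages must pass through the same fitted vectorizer.
new_X = vectorizer.transform(["claim your free prize", "lunch at home today"])
nb_predictions = nb_model.predict(new_X)
```

Calling `transform` (not `fit_transform`) on new messages keeps the vocabulary identical to the one the classifier was trained on.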

2. Decision Tree Classifier

  • Type: DecisionTreeClassifier
  • Library: scikit-learn
  • Description: A decision tree classifier for binary classification of SMS messages.
  • Training Data: SMS Spam Collection dataset (train.csv), preprocessed and vectorized using CountVectorizer.
  • Features: Bag-of-words (unigrams), stopwords removed.
  • Target: label (0: ham, 1: spam)
  • Accuracy: {{ accuracy_score(tahmin3, y_test) }}
  • Date Trained: {{ datetime.now().strftime("%Y-%m-%d") }}
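
A decision tree can be fitted on the same kind of count matrix. The sketch below assumes default hyperparameters apart from a fixed `random_state` (added here only for reproducibility); the card does not state the settings actually used, and the toy messages are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for train.csv (hypothetical messages; 0 = ham, 1 = spam).
texts = [
    "free prize claim your free cash now",
    "win cash prize call now",
    "are we still meeting for lunch today",
    "see you at home tonight",
]
labels = [1, 1, 0, 0]

# Same bag-of-words representation as for the Naive Bayes model.
vectorizer = CountVectorizer(stop_words="english")  # stopword list is an assumption
X = vectorizer.fit_transform(texts)

# random_state is fixed so that tie-breaking between equally good splits is repeatable.
tree_model = DecisionTreeClassifier(random_state=0)
tree_model.fit(X, labels)

# An unconstrained tree fits separable training data perfectly.
train_predictions = tree_model.predict(X)
```

Because an unconstrained tree can memorize its training set, training accuracy says little about generalization; only the held-out test accuracy is informative.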

Preprocessing

  • Lowercasing all text
  • Removing punctuation, digits, and newlines
  • Stopwords removed during vectorization
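
One way the cleaning steps above might look in code is sketched below. The function name `clean_text` is hypothetical, and two details are assumptions the card does not specify: newlines are turned into spaces rather than deleted, and leftover whitespace runs are collapsed.

```python
import re
import string


def clean_text(text: str) -> str:
    """Lowercase, then drop punctuation, digits, and newlines."""
    text = text.lower()
    # Replace newlines with spaces so words on adjacent lines stay separate.
    text = re.sub(r"[\r\n]+", " ", text)
    # Delete ASCII punctuation and digits.
    table = str.maketrans("", "", string.punctuation + string.digits)
    text = text.translate(table)
    # Collapse the whitespace runs left behind by the removals (an addition).
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text("WINNER!! Call 0800-123\nto claim your prize.")
# cleaned == "winner call to claim your prize"
```

Stopword removal is deliberately left out of this function, since the card states it happens later, during vectorization.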

Evaluation Metric

  • Accuracy on test set
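
As a small worked example of the metric, accuracy is the fraction of test messages whose predicted label matches the true label. The names `y_test` and `tahmin` (Turkish for "prediction") are borrowed from the card's accuracy placeholders; the label values below are made up for illustration.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true labels and predictions (0 = ham, 1 = spam).
y_test = [0, 0, 1, 1, 0]
tahmin = [0, 0, 1, 0, 0]

# scikit-learn's documented argument order is (y_true, y_pred);
# accuracy itself is symmetric, so swapping them gives the same value.
accuracy = accuracy_score(y_test, tahmin)  # 4 of 5 correct
```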

Notes

  • Models saved using joblib.
  • Accuracy alone can be misleading when ham messages greatly outnumber spam. For further evaluation, consider precision, recall, and F1-score on the spam class.
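
The two notes above can be illustrated together in a short sketch. Several details are assumptions, not taken from the card: the file name `spam_nb.joblib` is hypothetical, the vectorizer is saved alongside the model so that new text can be transformed consistently, and the evaluation labels are made up.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import precision_recall_fscore_support
from sklearn.naive_bayes import MultinomialNB

# Toy stand-in for train.csv (hypothetical messages; 0 = ham, 1 = spam).
texts = [
    "free prize claim your free cash now",
    "win cash prize call now",
    "are we still meeting for lunch today",
    "see you at home tonight",
]
labels = [1, 1, 0, 0]
vectorizer = CountVectorizer(stop_words="english")  # stopword list is an assumption
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

# Save and reload; a temporary directory keeps the example side-effect free.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "spam_nb.joblib")  # hypothetical file name
    joblib.dump({"vectorizer": vectorizer, "model": model}, path)
    restored = joblib.load(path)

restored_prediction = restored["model"].predict(
    restored["vectorizer"].transform(["claim your free prize"])
)

# Hypothetical test labels and predictions (0 = ham, 1 = spam).
y_test = [0, 0, 1, 1, 0]
tahmin = [0, 0, 1, 0, 0]
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, tahmin, average="binary", pos_label=1
)
```

In this made-up example four of five predictions are correct, yet recall on the spam class is only 0.5, which is the kind of gap that accuracy alone would hide.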