SMS Spam Detection: Combined Model Card
Models
1. Multinomial Naive Bayes
- Type: MultinomialNB
- Library: scikit-learn
- Description: A Naive Bayes classifier for multinomially distributed data, commonly used for text classification tasks.
- Training Data: SMS Spam Collection dataset (train.csv), preprocessed and vectorized using CountVectorizer.
- Features: Bag-of-words (unigrams), stopwords removed.
- Target: label (0: ham, 1: spam)
- Accuracy: {{ accuracy_score(tahmin, y_test) }}
- Date Trained: {{ datetime.now().strftime("%Y-%m-%d") }}
2. Decision Tree Classifier
- Type: DecisionTreeClassifier
- Library: scikit-learn
- Description: A decision tree classifier for binary classification of SMS messages.
- Training Data: SMS Spam Collection dataset (train.csv), preprocessed and vectorized using CountVectorizer.
- Features: Bag-of-words (unigrams), stopwords removed.
- Target: label (0: ham, 1: spam)
- Accuracy: {{ accuracy_score(tahmin3, y_test) }}
- Date Trained: {{ datetime.now().strftime("%Y-%m-%d") }}
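The two model entries above share one setup, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the original training script: the toy messages stand in for train.csv, the built-in English stop-word list and `random_state=0` are guesses, and `accuracies` is a name introduced here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for train.csv (hypothetical messages; 0 = ham, 1 = spam).
train_texts = [
    "win a free prize now",
    "free cash claim your prize",
    "urgent call now to win cash",
    "are we meeting for lunch today",
    "see you at home tonight",
    "can you call me after lunch",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["claim your free cash prize", "lunch at home today"]
y_test = [1, 0]

# Bag-of-words over unigrams; stop words are dropped by the vectorizer.
# Assumption: the card does not say which stop-word list was used.
vectorizer = CountVectorizer(ngram_range=(1, 1), stop_words="english")
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)  # reuse the fitted vocabulary

models = {
    "MultinomialNB": MultinomialNB(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the same features and score it on the same test split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, train_labels)
    predictions = model.predict(X_test)
    accuracies[name] = accuracy_score(y_test, predictions)

print(accuracies)
```

Fitting the vectorizer on the training messages only, then calling `transform` on the test messages, keeps test vocabulary from leaking into the features.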
Preprocessing
- Lowercasing all text
- Removing punctuation, digits, and newlines
- Removing stopwords during vectorization
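The cleaning steps above could look roughly like this. The function name `clean_text` is hypothetical, and two details are assumptions: punctuation and digits are deleted outright, while newlines become spaces so adjacent words stay separated. Stopword removal is left to the vectorizer, as the list states.

```python
import string

# Translation table that deletes ASCII punctuation and digits.
_REMOVE = str.maketrans("", "", string.punctuation + string.digits)


def clean_text(text: str) -> str:
    """Lowercase a message and strip punctuation, digits, and newlines."""
    text = text.lower().replace("\n", " ")  # newline -> space (assumption)
    text = text.translate(_REMOVE)          # drop punctuation and digits
    return " ".join(text.split())           # collapse leftover whitespace


cleaned = clean_text("FREE entry!! Call 0800-123 now,\nwin 2 tickets.")
print(cleaned)  # free entry call now win tickets
```

`string.punctuation` covers ASCII punctuation only, so symbols such as "£" would pass through unchanged; a regex-based cleaner would be one alternative if those need to go as well.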
Evaluation Metric
- Accuracy on the held-out test set, computed with accuracy_score against y_test.
Notes
- Models saved using joblib.
- For further evaluation, consider precision, recall, and F1-score.
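Both notes can be illustrated together: a fitted model is written with joblib, reloaded, and scored with precision, recall, and F1. This is a sketch under assumptions, not the original code. The toy data, the file name `sms_spam_nb.joblib`, and the choice to save the vectorizer and classifier together as one pipeline are all hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import precision_recall_fscore_support
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in data (hypothetical; 0 = ham, 1 = spam).
train_texts = [
    "win a free prize now",
    "free cash claim your prize",
    "urgent call now to win cash",
    "are we meeting for lunch today",
    "see you at home tonight",
    "can you call me after lunch",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["claim your free cash prize", "lunch at home today"]
y_test = [1, 0]

# Bundling vectorizer and classifier means one file restores both.
pipeline = make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB())
pipeline.fit(train_texts, train_labels)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "sms_spam_nb.joblib"  # hypothetical name
    joblib.dump(pipeline, model_path)
    reloaded = joblib.load(model_path)

# Score the reloaded model on the spam class (label 1).
tahmin = reloaded.predict(test_texts)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, tahmin, average="binary", pos_label=1, zero_division=0
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

On a spam dataset with far more ham than spam, these per-class scores expose missed spam and false alarms that a single accuracy figure can hide.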