Commit 40b76ee · verified · 1 Parent(s): 3e06e07

Upload 2 files

Files changed (2)
  1. README.md +48 -1
  2. cipher_classifier.pkl +3 -0
README.md CHANGED
@@ -1,3 +1,50 @@
  ---
- license: mit
  ---
+ # 🔐 Encrypted Text Classifier – 20 Newsgroups Cipher Challenge
+
+ This project is built for the [Kaggle Ciphertext Challenge](https://www.kaggle.com/competitions/20-newsgroups-ciphertext-challenge), where the goal is to classify encrypted text documents into 20 different newsgroup categories.
+
+ 🎯 Even without decrypting the text, we trained a character-level machine learning model that achieves over **63% accuracy**.
+
  ---
+
+ ## 📂 Project Structure
+ cipher-classifier/
+ ├── app.py # Streamlit app
+ ├── cipher_classifier.pkl # Pickled model + vectorizer
+ ├── train.csv # Kaggle training data
+ ├── requirements.txt # Libraries for deployment
+ └── README.md
+
+
  ---
+
+ ## 🧠 Model Overview
+
+ - **Input:** Ciphertext strings (unreadable encrypted text)
+ - **Vectorization:** `CountVectorizer` with char-level n-grams (1 to 3)
+ - **Model:** Logistic Regression (sklearn)
+ - **Accuracy:** ~63% (without decryption)
+
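The vectorization and model choices listed above can be sketched as a scikit-learn pipeline. This is a minimal illustration, not the training code behind this commit: the toy strings and labels are made up, and `analyzer="char"`, `lowercase=False`, and `max_iter=1000` are assumptions (the README only states char-level n-grams of length 1 to 3 and Logistic Regression).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the ciphertext column of train.csv.
texts = ["aXq#aXq#aXq#", "aXq#aXq#bb", "zz9!zz9!zz9!", "zz9!zz9!kk"]
labels = [0, 0, 1, 1]

pipeline = make_pipeline(
    # Count character 1-, 2-, and 3-grams; keeping case is an assumption,
    # made here because upper/lower case may carry signal in ciphertext.
    CountVectorizer(analyzer="char", ngram_range=(1, 3), lowercase=False),
    # max_iter is raised from the default as a precaution against
    # convergence warnings on sparse count features (assumed setting).
    LogisticRegression(max_iter=1000),
)
pipeline.fit(texts, labels)

# Classify unseen strings without any decryption step.
predictions = pipeline.predict(["aXq#aXq#", "zz9!zz9!"])
print(predictions)
```

Wrapping both steps in one pipeline keeps the vectorizer and classifier fitted together, so the same n-gram vocabulary is applied at prediction time.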
+ ---
+
+ ## Example Output
+
+ | Input (Ciphertext) | Predicted Label |
+ | --- | --- |
+ | `['W')(7x1zay7Hb3...` | 15 |
+ | `Tx4a8M\HNsyp;HM...` | 8 |
+
+
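The project structure describes `cipher_classifier.pkl` as a pickled model plus vectorizer, but the commit does not show how the two objects are stored. The sketch below assumes a dict with the hypothetical keys `vectorizer` and `model`, and it builds a stand-in bundle in memory instead of reading the real file. It then maps ciphertext strings to labels in the same shape as the example output above. The toy strings and the labels 3 and 7 are illustrative only.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for the trained objects (not the real model).
texts = ["aXq#aXq#aXq#", "aXq#aXq#bb", "zz9!zz9!zz9!", "zz9!zz9!kk"]
labels = [3, 3, 7, 7]
vectorizer = CountVectorizer(analyzer="char", ngram_range=(1, 3), lowercase=False)
model = LogisticRegression(max_iter=1000).fit(vectorizer.fit_transform(texts), labels)

# Assumed bundle layout: one dict holding both fitted objects.
blob = pickle.dumps({"vectorizer": vectorizer, "model": model})


def predict_labels(pickled_bundle, ciphertexts):
    """Unpickle the bundle and return one predicted label per ciphertext."""
    # Unpickling can execute arbitrary code, so only load trusted data.
    bundle = pickle.loads(pickled_bundle)
    features = bundle["vectorizer"].transform(ciphertexts)
    return bundle["model"].predict(features).tolist()


result = predict_labels(blob, ["aXq#aXq#", "zz9!zz9!"])
print(result)
```

If the real file uses a different layout (for example a tuple or a single `Pipeline` object), only the two lookups inside `predict_labels` would need to change.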
+ ## 📦 Deployment
+
+ This app is designed to run on:
+
+ - 🟢 Hugging Face Spaces
+ - 🟢 Streamlit Cloud
+ - 🔵 GitHub
+
+ ## 📌 Kaggle Link
+
+ You can download the dataset from the official competition:
+
+ 👉 [Kaggle – 20 Newsgroups Ciphertext Challenge](https://www.kaggle.com/competitions/20-newsgroups-ciphertext-challenge)
+
cipher_classifier.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f7d71b1fcda5760dbc2df3c4bb7de6ba87832e480b6a0ae337b9bd9bebc8aaf5
+ size 185929