sangambhamare committed on
Commit
63f1f11
·
verified ·
1 Parent(s): 768c4a8

Update README.md

  1. README.md +102 -3
README.md CHANGED
@@ -1,3 +1,102 @@
- ---
- license: mit
- ---
+ # Truth Detection from Audio Stories
+
+ This model classifies a short audio story as truthful or deceptive. It extracts MFCC features from the recording and passes them to a Random Forest classifier.
+
+ ## Model Details
+
+ * **Type:** Random Forest Classifier
+ * **Features:** 13-dimensional MFCC (Mel-Frequency Cepstral Coefficients)
+ * **Training Framework:** scikit-learn (`joblib` serialization)
+ * **Input:** WAV audio file
+ * **Output:** Predicted label: `True Story` or `Deceptive Story`
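
The 13 features correspond to one value per MFCC coefficient. A minimal sketch of that pooling step, assuming the per-frame MFCC matrix is averaged over time (as `np.mean(mfcc, axis=1)` does in the usage code of this README). The matrix here is synthetic, not real audio, and `pool_mfcc` is a hypothetical helper name:

```python
from statistics import fmean

def pool_mfcc(mfcc_rows):
    """Average each coefficient's row over all frames: one value per coefficient."""
    return [fmean(row) for row in mfcc_rows]

# Synthetic stand-in for a (13, n_frames) MFCC matrix: 13 coefficients x 4 frames.
fake_mfcc = [[float(c + f) for f in range(4)] for c in range(13)]
features = pool_mfcc(fake_mfcc)
print(len(features))  # one feature per coefficient
```

Pooling this way discards timing information, so recordings of any length map to a vector of the same size.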
+
+ ## Intended Uses & Limitations
+
+ **Intended Uses:**
+
+ * Detecting potential deception in short, spoken stories or statements.
+ * Research experiments on vocal biomarkers of deception.
+ * Educational demonstrations on audio feature extraction and classification.
+
+ **Limitations & Risks:**
+
+ * The model was trained on a limited dataset; performance may degrade on languages, audio quality, or speaking styles that differ from the training data.
+ * Predictions are probabilistic and should not be used as sole evidence in high-stakes scenarios (e.g., legal or security decisions).
+ * Cultural, linguistic, or demographic biases in the training data can lead to unfair predictions.
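
Because predictions are probabilistic, one option is to report the class probability alongside the label and abstain when it is low. The sketch below assumes the loaded model exposes scikit-learn's `predict_proba` and `classes_` (as `RandomForestClassifier` does), that the task is binary, and that label `1` means `True Story`, as in the usage code of this README. The helper name `predict_with_confidence` and the 0.8 threshold are illustrative choices, not part of this repository:

```python
def predict_with_confidence(model, features, threshold=0.8):
    """Return (label, probability); the label is 'Uncertain' below the threshold."""
    proba = model.predict_proba(features)[0]    # class probabilities for one sample
    true_col = list(model.classes_).index(1)    # column for class 1 (True Story)
    p_true = float(proba[true_col])
    if p_true >= threshold:
        return "True Story", p_true
    if 1.0 - p_true >= threshold:
        return "Deceptive Story", 1.0 - p_true
    # Neither class is confident enough: abstain rather than guess.
    return "Uncertain", max(p_true, 1.0 - p_true)
```

With a loaded model and a `(1, 13)` feature array, a call would look like `label, p = predict_with_confidence(model, features)`.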
+
+ ## Evaluation Metrics
+
+ * **Accuracy:** 91%
+ * **Languages in Training Data:** 15+ spoken languages
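
Accuracy here means the fraction of stories whose predicted label matches the reference label. A minimal sketch of that arithmetic; the label lists are made up for illustration and are not the model's evaluation data:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Toy labels (1 = True Story, 0 = Deceptive Story); 4 of 5 match.
print(accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0]))
```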
+
+ ## Training Data
+
+ * **Source:** Curated dataset of narrated stories labeled as truthful or deceptive.
+ * **Preprocessing:** Audio loaded at its original sampling rate (no resampling), trimmed to 30 seconds, then converted to MFCC features.
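
The 30-second trim amounts to keeping at most `30 * sr` samples of each waveform. A minimal sketch, assuming the trim keeps the start of the recording (this README does not say which part is kept). `trim_to_seconds` is a hypothetical helper that works on any sliceable sequence, including the NumPy array returned by `librosa.load`:

```python
def trim_to_seconds(samples, sr, max_seconds=30):
    """Keep at most max_seconds of audio from the start of the waveform."""
    max_len = int(max_seconds * sr)
    return samples[:max_len]

# Toy waveform at a sampling rate of 4 Hz, so 30 seconds is 120 samples.
clip = list(range(200))
print(len(trim_to_seconds(clip, sr=4)))       # longer clips are cut to 120 samples
print(len(trim_to_seconds(clip[:50], sr=4)))  # shorter clips are left unchanged
```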
+
+ ## How to Use
+
+ ### Installation
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### Loading the Model in Python
+
+ ```python
+ import joblib
+ from huggingface_hub import hf_hub_download
+
+ # Download the serialized classifier from the Hub (cached locally after the first call).
+ repo_id = "sangambhamare/TruthDetection"
+ model_file = hf_hub_download(repo_id=repo_id, filename="model.joblib")
+ model = joblib.load(model_file)
+ ```
+
+ ### Making Predictions
+
+ ```python
+ import librosa
+ import numpy as np
+
+ def extract_mfcc(file_path):
+     # Load at the file's native sampling rate (sr=None disables resampling).
+     y, sr = librosa.load(file_path, sr=None)
+     mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
+     # Average over time frames to get one 13-dimensional vector per file.
+     return np.mean(mfcc, axis=1)
+
+ # `model` is the classifier loaded in the previous section.
+ features = extract_mfcc("path/to/audio.wav").reshape(1, -1)
+ prediction = model.predict(features)[0]
+ label = "True Story" if prediction == 1 else "Deceptive Story"
+ print(label)
+ ```
+
+ ## Gradio Demo
+
+ A live demo of this model is available via a Gradio interface. To launch locally:
+
+ ```bash
+ python app.py
+ ```
+
+ This will start a web app where you can upload a WAV file and see the prediction.
+
+ ---
+
+ ## Citation
+
+ If you use this model in your research, please cite:
+
+ ```bibtex
+ @misc{bhamare2025truthdetection,
+   title={Truth Detection from Audio Stories},
+   author={Sangam Sanjay Bhamare},
+   year={2025},
+   howpublished={\url{https://huggingface.co/sangambhamare/TruthDetection}}
+ }
+ ```
+
+ ## License
+
+ This model is released under the MIT License.