# Truth Detection from Audio Stories

This model predicts whether a short audio story is truthful or deceptive using MFCC feature extraction and a Random Forest classifier.

## Model Details

* **Type:** Random Forest Classifier
* **Features:** 13-dimensional MFCC (Mel-Frequency Cepstral Coefficients)
* **Training Framework:** scikit-learn (`joblib` serialization)
* **Input:** WAV audio file
* **Output:** Predicted label: `True Story` or `Deceptive Story`

## Intended Uses & Limitations

**Intended Uses:**

* Detecting potential deception in short, spoken stories or statements.
* Research experiments on vocal biomarkers of deception.
* Educational demonstrations on audio feature extraction and classification.

**Limitations & Risks:**

* The model was trained on a limited dataset; performance may degrade on different languages, audio quality, or speaking styles.
* Predictions are probabilistic and should not be used as sole evidence in high-stakes scenarios (e.g., legal or security decisions).
* Cultural, linguistic, or demographic biases in the training data can lead to unfair predictions.
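Because predictions are probabilistic, one way to surface that uncertainty is to report the class probability and abstain when the model is not confident. The sketch below is illustrative only: the helper name `label_with_confidence` and the `threshold` value of 0.7 are assumptions, not part of this repository. It relies on scikit-learn's `predict_proba` and `classes_`, and assumes class `1` means `True Story`, as in the prediction example in this card.

```python
def label_with_confidence(model, features, threshold=0.7):
    """Return (label, probability), abstaining when the model is unsure.

    Assumes a binary classifier where class 1 means "True Story".
    `threshold` is an illustrative value, not a tuned one.
    """
    # predict_proba returns one row per sample and one column per class
    probs = model.predict_proba(features)[0]
    classes = list(model.classes_)
    p_true = float(probs[classes.index(1)])

    if p_true >= threshold:
        return "True Story", p_true
    if p_true <= 1.0 - threshold:
        return "Deceptive Story", 1.0 - p_true
    # Neither class is confident enough: say so instead of guessing
    return "Uncertain", max(p_true, 1.0 - p_true)
```

Returning an explicit `Uncertain` label keeps borderline cases visible to whoever reviews the output, rather than forcing every recording into one of the two classes.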

## Evaluation Metrics

* **Accuracy:** 91%
* **Languages in Training Data:** 15+ spoken languages

## Training Data

* **Source:** Curated dataset of narrated stories labeled as truthful or deceptive.
* **Preprocessing:** Audio kept at each file's original sampling rate (no resampling), trimmed to 30 seconds, then converted to MFCC features.
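The preprocessing steps listed above can be sketched roughly as follows. This is not the original training script, which is not included in this card. The function names `trim_to_duration` and `preprocess` are hypothetical, and the sketch assumes audio is kept at its native sampling rate (as `sr=None` does in the usage example) and that trimming keeps the first 30 seconds of each file.

```python
def trim_to_duration(y, sr, max_seconds=30.0):
    """Keep at most the first `max_seconds` of a 1-D signal sampled at `sr` Hz."""
    max_samples = int(max_seconds * sr)
    return y[:max_samples]


def preprocess(file_path, max_seconds=30.0, n_mfcc=13):
    """Load, trim, and summarize one file as a mean-MFCC vector."""
    # Imported here so trim_to_duration can be used without these packages
    import librosa
    import numpy as np

    y, sr = librosa.load(file_path, sr=None)  # sr=None keeps the native rate
    y = trim_to_duration(y, sr, max_seconds)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, n_frames)
    return np.mean(mfcc, axis=1)  # one n_mfcc-dimensional vector per file
```

Files shorter than 30 seconds pass through unchanged, since slicing past the end of the signal simply returns the whole signal.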

## How to Use

### Installation

```bash
pip install -r requirements.txt
```

### Loading the Model in Python

```python
import joblib
from huggingface_hub import hf_hub_download

repo_id = "sangambhamare/TruthDetection"
model_file = hf_hub_download(repo_id=repo_id, filename="model.joblib")
model = joblib.load(model_file)
```

### Making Predictions

```python
import librosa
import numpy as np

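def extract_mfcc(file_path):
    # sr=None keeps the file's native sampling rate instead of librosa's 22,050 Hz default
    y, sr = librosa.load(file_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, n_frames)
    # Average over time to get one fixed-length 13-dimensional vector per file
    return np.mean(mfcc, axis=1)

# `model` is the classifier loaded in the previous section.
# scikit-learn expects a 2D array of shape (n_samples, n_features).
features = extract_mfcc("path/to/audio.wav").reshape(1, -1)
prediction = model.predict(features)[0]
label = "True Story" if prediction == 1 else "Deceptive Story"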
print(label)
```
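To score several recordings at once, the same steps can be batched into a single `model.predict` call. The helper below is a hypothetical sketch (`predict_files` is not part of this repository); it takes the feature extractor as an argument and assumes the same label mapping as above, with `1` meaning `True Story`.

```python
def predict_files(model, paths, extract):
    """Map each path to a label using one batched model.predict call."""
    if not paths:
        return {}
    # One feature vector per file; scikit-learn accepts a list of equal-length rows
    rows = [extract(p) for p in paths]
    predictions = model.predict(rows)
    return {
        path: "True Story" if pred == 1 else "Deceptive Story"
        for path, pred in zip(paths, predictions)
    }
```

For example, `predict_files(model, ["a.wav", "b.wav"], extract_mfcc)` would return a dictionary keyed by file path.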

## Gradio Demo

A live demo of this model is available via a Gradio interface. To launch locally:

```bash
python app.py
```

This will start a web app where you can upload a WAV file and see the prediction.
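The contents of `app.py` are not reproduced in this card. As a rough idea of what such an app might look like, the sketch below wires a prediction function into a Gradio `Interface`. The helper names `make_predict_fn` and `build_demo` and the widget labels are assumptions made for illustration, not the repository's actual code.

```python
def make_predict_fn(model, extract):
    """Wrap the model so it takes an uploaded file path and returns a label."""
    def predict(audio_path):
        if audio_path is None:  # nothing uploaded yet
            return "Please upload a WAV file."
        features = [extract(audio_path)]  # single-row batch
        prediction = model.predict(features)[0]
        return "True Story" if prediction == 1 else "Deceptive Story"
    return predict


def build_demo(predict_fn):
    """Build (but do not launch) a Gradio interface around predict_fn."""
    import gradio as gr  # imported lazily; only needed for the web UI

    return gr.Interface(
        fn=predict_fn,
        inputs=gr.Audio(type="filepath", label="Upload WAV file"),
        outputs=gr.Textbox(label="Prediction"),
        title="Truth Detection from Audio Stories",
    )
```

With the model and `extract_mfcc` from the sections above, `build_demo(make_predict_fn(model, extract_mfcc)).launch()` would start the local web app.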

---

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{bhamare2025truthdetection,
  title={Truth Detection from Audio Stories},
  author={Sangam Sanjay Bhamare},
  year={2025},
  howpublished={\url{https://huggingface.co/sangambhamare/TruthDetection}}
}
```

## License

This model is released under the MIT License.