Truth Detection from Audio Stories

This model predicts whether a short audio story is truthful or deceptive using MFCC feature extraction and a Random Forest classifier.

Model Details

Type: Random Forest Classifier
Features: 13-dimensional MFCC (Mel-Frequency Cepstral Coefficients)
Training Framework: scikit-learn (joblib serialization)
Input: WAV audio file
Output: Predicted label, either "True Story" or "Deceptive Story"

Intended Uses & Limitations

Intended Uses:

Detecting potential deception in short, spoken stories or statements.
Research experiments on vocal biomarkers of deception.
Educational demonstrations on audio feature extraction and classification.

Limitations & Risks:

The model was trained on a limited dataset; performance may degrade on languages, recording conditions, or speaking styles that are not well represented in it.
Predictions are probabilistic and should not be used as sole evidence in high-stakes scenarios (e.g., legal or security decisions).
Cultural, linguistic, or demographic biases in the training data can lead to unfair predictions.
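
Because predictions are probabilistic, it can help to report the class probability next to the label and to abstain when the model is unsure. The following is a minimal sketch, not part of the released code: the function name predict_with_confidence, the "Uncertain" label, and the 0.7 threshold are illustrative assumptions, and the mapping of class 1 to "True Story" follows the prediction snippet under How to Use. Random Forest probabilities are averaged over the trees and are not necessarily well calibrated, so the threshold should be treated as a rough filter only.

```python
import numpy as np


def predict_with_confidence(model, features, min_confidence=0.7):
    """Return (label, confidence); the label is "Uncertain" below the threshold.

    `model` is any fitted scikit-learn classifier exposing `predict_proba`
    and `classes_`. `min_confidence` is a hypothetical cut-off, not a value
    taken from the model card.
    """
    proba = model.predict_proba(features)[0]  # one probability per class
    best = int(np.argmax(proba))              # index of the most likely class
    confidence = float(proba[best])
    if confidence < min_confidence:
        return "Uncertain", confidence
    # Assumption: class 1 means truthful, as in the How to Use snippet.
    label = "True Story" if model.classes_[best] == 1 else "Deceptive Story"
    return label, confidence
```

With the model and features from the How to Use section, a call would look like label, confidence = predict_with_confidence(model, features).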

Evaluation Metrics

Accuracy: 91%
Languages in Training Data: 15+ spoken languages

Training Data

Source: Curated dataset of narrated stories labeled as truthful or deceptive.
Preprocessing: Audio loaded at its original sampling rate (no resampling), trimmed to 30 seconds, then converted to MFCC features.
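
The trimming step can be sketched as follows so that inference audio matches the training clips. This is an illustration under assumptions: the model card does not say which 30 seconds were kept, so the sketch keeps the first 30 seconds, and the names MAX_SECONDS and trim_to_max_duration are hypothetical.

```python
import numpy as np

MAX_SECONDS = 30.0  # clip length stated under Training Data


def trim_to_max_duration(y, sr, max_seconds=MAX_SECONDS):
    """Keep at most the first `max_seconds` of a mono waveform.

    Assumption: the leading portion of the recording is the part to keep.
    Shorter clips are returned unchanged (no padding).
    """
    max_samples = int(round(max_seconds * sr))
    return y[:max_samples]


# Example with synthetic audio: a 45-second clip at 16 kHz is cut to 30 seconds.
sr = 16000
long_clip = np.zeros(45 * sr, dtype=np.float32)
trimmed = trim_to_max_duration(long_clip, sr)
```

When loading with librosa, passing duration=30.0 to librosa.load gives the same result at load time.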

How to Use

Installation

pip install -r requirements.txt

Loading the Model in Python

import joblib
from huggingface_hub import hf_hub_download

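# Download the serialized classifier from the Hugging Face Hub (cached after the first call)
repo_id = "sangambhamare/TruthDetection"
model_file = hf_hub_download(repo_id=repo_id, filename="model.joblib")
# Deserialize the scikit-learn Random Forest
model = joblib.load(model_file)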

Making Predictions

import librosa
import numpy as np

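def extract_mfcc(file_path):
    """Return a 13-dimensional feature vector: each MFCC averaged over time."""
    # sr=None keeps the file's original sampling rate instead of resampling
    y, sr = librosa.load(file_path, sr=None)
    # mfcc has shape (13, n_frames)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.mean(mfcc, axis=1)

# scikit-learn expects a 2-D array of shape (n_samples, n_features)
features = extract_mfcc("path/to/audio.wav").reshape(1, -1)
prediction = model.predict(features)[0]
# Class 1 is treated as truthful; any other class as deceptive
label = "True Story" if prediction == 1 else "Deceptive Story"
print(label)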
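
The averaging step above is what turns clips of any length into a fixed-size input. The sketch below illustrates the shapes involved using random arrays as stand-ins for real MFCC matrices; the helper name pool_mfcc and the frame counts are made up for the example.

```python
import numpy as np


def pool_mfcc(mfcc):
    """Average an (n_mfcc, n_frames) matrix over time and add a batch axis."""
    clip_vector = np.mean(mfcc, axis=1)  # shape (n_mfcc,)
    return clip_vector.reshape(1, -1)    # shape (1, n_mfcc), one sample


rng = np.random.default_rng(0)
short_clip = rng.normal(size=(13, 120))  # stand-in for a short recording
long_clip = rng.normal(size=(13, 900))   # stand-in for a longer recording

# Both clips collapse to the same (1, 13) shape the classifier expects.
short_features = pool_mfcc(short_clip)
long_features = pool_mfcc(long_clip)
```

One consequence of this design is that the order of frames is discarded: the classifier sees only the average spectral shape of the clip, not how it changes over time.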

Gradio Demo

A live demo of this model is available via a Gradio interface. To launch locally:

python app.py

This will start a web app where you can upload a WAV file and see the prediction.
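
The contents of app.py are not shown here, so the following is only a guess at what a minimal equivalent could look like. The helper names format_label, make_predict_fn, and build_demo are hypothetical; gradio is imported lazily so the prediction logic can be used without it installed.

```python
import numpy as np


def format_label(prediction):
    """Map the classifier output to a label (class 1 assumed to mean truthful)."""
    return "True Story" if prediction == 1 else "Deceptive Story"


def make_predict_fn(model, extract_features):
    """Wrap a fitted model and a feature extractor into a Gradio callback."""
    def predict(audio_path):
        if audio_path is None:  # nothing uploaded yet
            return "Please upload a WAV file."
        features = np.asarray(extract_features(audio_path)).reshape(1, -1)
        return format_label(int(model.predict(features)[0]))
    return predict


def build_demo(predict_fn):
    """Build a simple upload-and-predict interface."""
    import gradio as gr  # lazy import: only needed when the demo is built

    return gr.Interface(
        fn=predict_fn,
        inputs=gr.Audio(type="filepath", label="WAV file"),
        outputs=gr.Textbox(label="Prediction"),
        title="Truth Detection from Audio Stories",
    )
```

With model and extract_mfcc defined as in the How to Use section, build_demo(make_predict_fn(model, extract_mfcc)).launch() would start the web app.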

Citation

If you use this model in your research, please cite:

@misc{bhamare2025truthdetection,
  title={Truth Detection from Audio Stories},
  author={Sangam Sanjay Bhamare},
  year={2025},
  howpublished={\url{https://huggingface.co/sangambhamare/TruthDetection}}
}

License

This model is released under the MIT License.
