
Truth Detection from Audio Stories

This model predicts whether a short audio story is truthful or deceptive using MFCC feature extraction and a Random Forest classifier.

Model Details

  • Type: Random Forest Classifier
  • Features: 13-dimensional MFCC (Mel-Frequency Cepstral Coefficients)
  • Training Framework: scikit-learn (joblib serialization)
  • Input: WAV audio file
  • Output: Predicted label, either True Story or Deceptive Story

Intended Uses & Limitations

Intended Uses:

  • Detecting potential deception in short, spoken stories or statements.
  • Research experiments on vocal biomarkers of deception.
  • Educational demonstrations on audio feature extraction and classification.

Limitations & Risks:

  • The model was trained on a limited dataset; performance may degrade on languages, recording conditions, or speaking styles that the training data does not cover.
  • Predictions are probabilistic and should not be used as sole evidence in high-stakes scenarios (e.g., legal or security decisions).
  • Cultural, linguistic, or demographic biases in the training data can lead to unfair predictions.

Evaluation Metrics

  • Accuracy: 91%
  • Languages in Training Data: 15+ spoken languages

Training Data

  • Source: Curated dataset of narrated stories labeled as truthful or deceptive.
  • Preprocessing: Audio loaded at each file's native sampling rate (no resampling), trimmed to 30 seconds, then MFCC extraction.
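
The 30-second trimming step can be sketched as below. This is a minimal illustration, not the original training code: the helper name `trim_to_duration` is hypothetical, and it assumes the first 30 seconds of each clip are the part that is kept.

```python
def trim_to_duration(samples, sample_rate, max_seconds=30.0):
    """Keep at most `max_seconds` of audio from the start of a clip.

    `samples` is a 1-D sequence of audio samples (a list or a NumPy array),
    and `sample_rate` is the number of samples per second.
    Assumption: trimming keeps the beginning of the clip.
    """
    max_samples = int(sample_rate * max_seconds)
    # Slicing past the end is safe, so shorter clips are returned unchanged
    return samples[:max_samples]
```

With librosa, passing `duration=30.0` to `librosa.load` should give the same result at load time.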

How to Use

Installation

pip install -r requirements.txt

Loading the Model in Python

import joblib
from huggingface_hub import hf_hub_download

repo_id = "sangambhamare/TruthDetection"
model_file = hf_hub_download(repo_id=repo_id, filename="model.joblib")
model = joblib.load(model_file)

Making Predictions

import librosa
import numpy as np

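def extract_mfcc(file_path):
    # sr=None keeps the file's native sampling rate instead of librosa's 22,050 Hz default
    y, sr = librosa.load(file_path, sr=None)
    # Shape (13, n_frames): 13 coefficients per analysis frame
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    # Average over time to get one fixed-length 13-dimensional vector per file
    return np.mean(mfcc, axis=1)

# scikit-learn expects a 2-D array of shape (n_samples, n_features)
features = extract_mfcc("path/to/audio.wav").reshape(1, -1)
prediction = model.predict(features)[0]
# Label 1 is treated as truthful, any other label as deceptive
label = "True Story" if prediction == 1 else "Deceptive Story"
print(label)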
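
Because predictions are probabilistic, it can help to report a confidence alongside the label and to abstain when the model is unsure. The sketch below assumes the loaded model exposes `predict_proba` and `classes_`, as scikit-learn's Random Forest classifier does. The helper name `label_with_confidence`, the "Uncertain" label, and the 0.7 threshold are illustrative choices, not part of the released model.

```python
def label_with_confidence(proba_row, classes, threshold=0.7):
    """Turn one row of class probabilities into a (label, confidence) pair.

    proba_row: probabilities for one sample, e.g. model.predict_proba(features)[0]
    classes:   class labels in the same order, e.g. model.classes_
    threshold: hypothetical cut-off below which no label is committed to
    """
    # Index of the most probable class
    best = max(range(len(proba_row)), key=lambda i: proba_row[i])
    confidence = float(proba_row[best])
    if confidence < threshold:
        return "Uncertain", confidence
    # Same mapping as above: label 1 is truthful, anything else is deceptive
    label = "True Story" if classes[best] == 1 else "Deceptive Story"
    return label, confidence

# Usage with the objects defined earlier:
#   proba = model.predict_proba(features)[0]
#   print(label_with_confidence(proba, model.classes_))
```

The threshold trades coverage for reliability: a higher value leaves more clips unlabeled and commits only to the more confident predictions.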

Gradio Demo

A live demo of this model is available via a Gradio interface. To launch locally:

python app.py

This will start a web app where you can upload a WAV file and see the prediction.
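
The contents of app.py are not reproduced here. As a rough sketch of how such an app could be wired together, assuming the `extract_mfcc` function and loaded `model` from the sections above (the names `classify_story` and `build_demo` are hypothetical):

```python
def classify_story(audio_path, extract, model):
    """Return a label string for one uploaded audio file path."""
    if audio_path is None:
        # Gradio passes None when nothing was uploaded
        return "No audio provided"
    # Wrap the feature vector in a list: predict expects one row per sample
    features = [list(extract(audio_path))]
    prediction = model.predict(features)[0]
    return "True Story" if prediction == 1 else "Deceptive Story"

def build_demo(extract, model):
    """Build a Gradio interface around classify_story (not launched here)."""
    import gradio as gr  # imported lazily so classify_story works without Gradio

    return gr.Interface(
        fn=lambda path: classify_story(path, extract, model),
        inputs=gr.Audio(type="filepath", label="WAV file"),
        outputs=gr.Textbox(label="Prediction"),
        title="Truth Detection from Audio Stories",
    )
```

Calling `build_demo(extract_mfcc, model).launch()` would then start the local web app.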



Citation

If you use this model in your research, please cite:

@misc{bhamare2025truthdetection,
  title={Truth Detection from Audio Stories},
  author={Sangam Sanjay Bhamare},
  year={2025},
  howpublished={\url{https://huggingface.co/sangambhamare/TruthDetection}}
}

License

This model is released under the MIT License.
