Madverse Music: AI Audio Classifier - Usage Guide
Quick Start
Option 1: Hugging Face Space (Recommended)
Use our deployed model on Hugging Face Spaces:
Web Interface:
- Go to the Hugging Face Space URL
- Upload your audio file
- Click "Analyze Audio"
- Get instant results
API Access:
# Health check
curl https://your-space-name.hf.space/health
# Analyze audio file
curl -X POST "https://your-space-name.hf.space/analyze" \
  -F "file=@your_song.mp3"
Option 2: Local Setup
# Install dependencies
pip install -r requirements.txt
# Start the API server
python api.py
# Or start web interface
streamlit run app.py
Supported Audio Formats
- WAV (.wav)
- MP3 (.mp3)
- FLAC (.flac)
- M4A (.m4a)
- OGG (.ogg)
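A client can reject unsupported files before uploading by checking the extension against the list above. The following is a minimal sketch: the names `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical and not part of this project, and the check looks only at the file name, not at the actual audio encoding.

```python
from pathlib import Path

# Extensions copied from the list above; the names here are illustrative only.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file name ends in a supported extension (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_audio("my_song.WAV"))
print(is_supported_audio("notes.txt"))
```

Lower-casing the suffix means names such as `track.MP3` are accepted as well.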
API Usage
Hugging Face Space API
Health Check
GET /health
Analyze Audio
POST /analyze
Request: Upload the audio file as multipart/form-data, using the field name "file"
Response Format:
{
"classification": "Real",
"confidence": 0.85,
"probability": 0.15,
"raw_score": -1.73,
"duration": 30.5,
"message": "Detected as real music"
}
Usage Examples
Python
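import requests

# Upload file to HF Space
with open('your_song.mp3', 'rb') as f:
    response = requests.post('https://your-space-name.hf.space/analyze',
                             files={'file': f})

# Stop early on HTTP errors instead of trying to parse an error page as JSON
response.raise_for_status()
result = response.json()
print(result)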
JavaScript
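// Assumes fileInput is an <input type="file"> element and that this code
// runs inside an async function or an ES module (for top-level await)
const formData = new FormData();
formData.append('file', fileInput.files[0]);
const response = await fetch('https://your-space-name.hf.space/analyze', {
  method: 'POST',
  body: formData
});
const result = await response.json();
console.log(result);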
Understanding Results
The classifier returns one of two labels:
- "Real" = Human-created music
- "Fake" = AI-generated music (from Suno, Udio, etc.)
API Response Format:
{
"classification": "Real",
"confidence": 0.85,
"probability": 0.15,
"raw_score": -1.73,
"duration": 30.5,
"message": "Detected as real music"
}
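To work with the response in Python, the JSON can be loaded into a small typed structure. This is a sketch only: the `AnalysisResult` class and the `parse_result` and `summarize` helpers are hypothetical names, the field list is taken from the sample response above, and `duration` is assumed to be in seconds.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class AnalysisResult:
    """Mirrors the fields of the sample response (illustrative, not an official schema)."""
    classification: str
    confidence: float
    probability: float
    raw_score: float
    duration: float
    message: str


def parse_result(payload: str) -> AnalysisResult:
    """Parse a JSON response body, keeping only the known fields."""
    data = json.loads(payload)
    names = [f.name for f in fields(AnalysisResult)]
    return AnalysisResult(**{name: data[name] for name in names})


def summarize(result: AnalysisResult) -> str:
    """Build a one-line, human-readable summary."""
    return (f"{result.classification} "
            f"({result.confidence:.0%} confidence, {result.duration:.1f} s of audio)")


sample = '''{"classification": "Real", "confidence": 0.85, "probability": 0.15,
             "raw_score": -1.73, "duration": 30.5, "message": "Detected as real music"}'''
print(summarize(parse_result(sample)))
```

Selecting fields by name keeps the parser from failing if the service later adds extra keys to the response.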
Command Line Output:
Analyzing: my_song.wav
Result: Fake (AI-generated music)
Confidence: 0.96 | Raw output: 3.786
Model Specifications
- Model: SpecTTTra-α (120 seconds)
- Sample Rate: 16kHz
- Performance: 97% F1 score, 96% sensitivity, 99% specificity
- Max Duration: 120 seconds (2 minutes)
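At 16 kHz, the 120-second limit corresponds to 120 × 16,000 = 1,920,000 samples. The sketch below works through that arithmetic and shows one way a client could trim longer audio before sending it. How the service itself handles longer input (truncating, rejecting, or something else) is not stated here, so the trimming step is an assumption, and the helper names are hypothetical.

```python
from typing import List, Sequence

SAMPLE_RATE_HZ = 16_000  # from the specifications above
MAX_DURATION_S = 120     # from the specifications above


def max_samples(sample_rate_hz: int = SAMPLE_RATE_HZ,
                max_duration_s: int = MAX_DURATION_S) -> int:
    """Number of samples that fit in the maximum duration."""
    return sample_rate_hz * max_duration_s


def clip_to_max_duration(samples: Sequence[float]) -> List[float]:
    """Keep only the first 120 seconds of a mono 16 kHz signal (client-side assumption)."""
    return list(samples[:max_samples()])


print(max_samples())  # 1920000
```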
Technical Details
How It Works:
- Audio is loaded and resampled to 16kHz
- Converted to mel-spectrograms
- Processed by the SpecTTTra transformer model
- Output logit is converted to probability using sigmoid
- Classification: prob < 0.5 = Real, prob ≥ 0.5 = Fake
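The last two steps above (sigmoid, then thresholding at 0.5) can be sketched in a few lines. This is not the project's actual code: `classify_logit` is a hypothetical helper, and defining confidence as the probability of the predicted class is an assumption chosen because it matches the sample API response (probability 0.15, confidence 0.85 for "Real").

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def classify_logit(raw_score: float, threshold: float = 0.5) -> dict:
    """Map a raw model output (logit) to a label using the rule described above."""
    prob_fake = sigmoid(raw_score)
    label = "Fake" if prob_fake >= threshold else "Real"
    # Assumption: confidence is the probability assigned to the predicted label.
    confidence = prob_fake if label == "Fake" else 1.0 - prob_fake
    return {"classification": label,
            "probability": prob_fake,
            "confidence": confidence}


result = classify_logit(-1.73)
print(result["classification"], round(result["probability"], 2))
```

A raw score of exactly 0 gives a probability of 0.5, which falls on the "Fake" side of the rule.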
Testing Your Music
- Get AI-generated samples: Download from Suno, Udio, or other AI music platforms
- Get real music samples: Use traditional human-created songs
- Run the classifier: Compare results to see how well it detects AI vs human music
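Once a set of labeled samples has been classified, the comparison can be summarized with the same kinds of metrics quoted under Model Specifications. The sketch below is illustrative: the `evaluate` helper is hypothetical, treating "Fake" as the positive class is an assumption, and the example pairs are made-up placeholders rather than real results.

```python
from typing import Dict, Iterable, Tuple


def evaluate(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and F1 from (true_label, predicted_label) pairs.

    Labels are "Real" or "Fake"; "Fake" is treated as the positive class (assumption).
    """
    tp = fp = tn = fn = 0
    for truth, predicted in pairs:
        if truth == "Fake":
            if predicted == "Fake":
                tp += 1
            else:
                fn += 1
        else:
            if predicted == "Fake":
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Placeholder data: (true label, label returned by the classifier)
example_pairs = [("Fake", "Fake"), ("Fake", "Fake"), ("Fake", "Real"),
                 ("Real", "Real"), ("Real", "Real"), ("Real", "Fake")]
print(evaluate(example_pairs))
```

A small personal test set will give noisy estimates, so differences from the reported figures are to be expected.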
Expected Performance
- High accuracy in detecting modern AI-generated music (see the reported metrics under Model Specifications)
- Works best with full songs (up to 120 seconds)
- Optimized for music from platforms like Suno and Udio
Note: This model was trained specifically for detecting AI-generated songs, not just AI vocals over real instrumentals. It analyzes the entire musical composition.