Commit · 33715d3
1 Parent(s): 34bd12a
Readme
README.md
CHANGED
@@ -1,157 +1,14 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-## Models Used
-
-This project uses pre-trained models hosted on Hugging Face Hub:
-
-### Speech Recognition Model
-**Wav2Vec 2.0 - English**
-- **Model:** `facebook/wav2vec2-large-960h-lv60-self`
-- **Link:** [https://huggingface.co/facebook/wav2vec2-large-960h-lv60-self](https://huggingface.co/facebook/wav2vec2-large-960h-lv60-self)
-- **Description:** Large Wav2Vec 2.0 model trained on 960 hours of English LibriSpeech data
-- **Use:** Audio-to-text transcription
-
-### Sentiment Analysis Model
-**BERT - Multilingual Sentiment**
-- **Model:** `nlptown/bert-base-multilingual-uncased-sentiment`
-- **Link:** [https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment)
-- **Description:** Multilingual BERT model fine-tuned for sentiment analysis (1-5 stars)
-- **Use:** Text sentiment classification
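The removed README describes the sentiment model as producing a 1-5 star rating, while its example output reports `POSITIVE`/`NEGATIVE`/`NEUTRAL` and a satisfaction label. A minimal sketch of one way such a mapping could look follows. The function name `stars_to_sentiment` and the thresholds are assumptions for illustration, not taken from `voice_sentiment.py`; the label format (`"1 star"` to `"5 stars"`) is what the `nlptown` model emits.

```python
# Hypothetical helper: the thresholds and the function name are assumptions,
# not the project's actual logic.

def stars_to_sentiment(label: str) -> tuple[str, str]:
    """Map a star label such as '4 stars' to (sentiment, satisfaction)."""
    stars = int(label.split()[0])  # '1 star' -> 1, '5 stars' -> 5
    if not 1 <= stars <= 5:
        raise ValueError(f"unexpected star rating: {label!r}")
    if stars >= 4:
        return "POSITIVE", "Satisfied"
    if stars <= 2:
        return "NEGATIVE", "Dissatisfied"
    return "NEUTRAL", "Neutral"


print(stars_to_sentiment("5 stars"))
```

Treating 3 stars as neutral is a design choice; a stricter deployment might count it as dissatisfied.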
-
-
-## Project Structure
-
-```
-voice-sentiment-project/
-├── requirements.txt # Dependencies
-├── voice_sentiment.py # Core analyzer class
-├── api.py # REST API Server
-├── app.py # Gradio web interface
-├── main.py # CLI interface
-├── utils.py # Utility functions and CSS styling
-├── audios/ # Your audio files
-│   ├── call1.wav
-│   ├── call2.mp3
-│   └── ...
-└── analysis_results.csv # Generated results
-```
-## Language Support
-
-### Current Model: English Only
-This system is currently configured with an English-only Wav2Vec 2.0 model (`facebook/wav2vec2-large-960h-lv60-self`) for optimal English speech recognition performance.
-
-### For Other Languages
-To use this system with other languages, you need to change the Wav2Vec 2.0 model in `voice_sentiment.py`.
-
-## Quick Installation
-
-```bash
-pip install -r requirements.txt
-```
-
-## Usage
-
-### 1. Web Interface (Recommended)
-
-```bash
-python app.py
-```
-
-Opens a web browser interface at `http://localhost:7860`
-
-### 2. Command Line Interface
-
-```bash
-python main.py
-```
-
-### 3. Direct Code Usage
-
-```python
-from voice_sentiment import VoiceSentimentAnalyzer
-
-# Initialize
-analyzer = VoiceSentimentAnalyzer()
-
-# Analyze one call
-result = analyzer.analyze_call("call1.wav")
-print(result)
-
-# Analyze multiple calls
-results = analyzer.analyze_batch("audios/")
-```
-
-## Example Output
-
-```python
-{
-    'file': 'call1.wav',
-    'transcription': 'Hello I am very satisfied with your service',
-    'sentiment': 'POSITIVE',
-    'score': 0.89,
-    'satisfaction': 'Satisfied'
-}
-```
-
-## Simple Workflow
-
-```
-Audio File → Transcription (Wav2Vec2) → Sentiment (BERT) → Classification
-```
-
-Perfect for analyzing customer call sentiment quickly and easily!
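The workflow in the removed README (audio file, transcription, sentiment, classification) can be sketched as a small function that takes the two model stages as callables. This is an illustration only: `run_workflow`, `fake_transcribe` and `fake_classify` are hypothetical names, and the stand-in functions return canned values instead of calling Wav2Vec2 or BERT.

```python
from typing import Callable

# Hypothetical sketch of the workflow; not the project's VoiceSentimentAnalyzer.
SATISFACTION = {"POSITIVE": "Satisfied", "NEGATIVE": "Dissatisfied"}


def run_workflow(
    path: str,
    transcribe: Callable[[str], str],
    classify: Callable[[str], tuple[str, float]],
) -> dict:
    """Chain the stages: audio file -> text -> sentiment -> classification."""
    text = transcribe(path)
    sentiment, score = classify(text)
    return {
        "file": path,
        "transcription": text,
        "sentiment": sentiment,
        "score": score,
        "satisfaction": SATISFACTION.get(sentiment, "Neutral"),
    }


# Stand-ins so the sketch runs without downloading any model.
def fake_transcribe(path: str) -> str:
    return "Hello I am very satisfied with your service"


def fake_classify(text: str) -> tuple[str, float]:
    return "POSITIVE", 0.89


result = run_workflow("call1.wav", fake_transcribe, fake_classify)
print(result)
```

Passing the stages in as arguments keeps the chaining logic testable without loading either model.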
-
-## Supported Audio Formats
-
-### **Fully Supported**
-- **WAV** (.wav) - *Recommended for best quality*
-- **MP3** (.mp3) - *Most common format*
-- **M4A** (.m4a) - *Apple audio format*
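A batch run over a folder would typically skip anything outside the three listed formats. The sketch below shows one way to filter by extension with `pathlib`; the names `SUPPORTED_EXTENSIONS`, `is_supported` and `find_supported_audio` are assumptions for illustration and may not match what `analyze_batch` does internally.

```python
from pathlib import Path

# Extensions listed as fully supported in the removed README.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def find_supported_audio(folder: str) -> list[Path]:
    """List supported audio files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and is_supported(p.name)
    )
```

The check looks only at the extension, so a mislabeled file would still pass; decoding errors need separate handling.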
-
-### **Audio Specifications**
-- **Sample Rate**: Automatically converted to 16kHz
-- **Channels**: Mono or Stereo (converted to mono)
-- **Duration**: 5 seconds to 10 minutes (optimal: 30 seconds - 2 minutes)
-- **Quality**: Clear speech, minimal background noise recommended
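The specifications above can be checked before a file is sent to the models. The following sketch uses the standard-library `wave` module, so it only handles uncompressed PCM WAV files; `inspect_wav` and the constant names are hypothetical, and the project may perform its conversion differently.

```python
import wave

TARGET_RATE = 16_000                  # README: audio is converted to 16 kHz
MIN_SECONDS, MAX_SECONDS = 5, 600     # README: 5 seconds to 10 minutes


def inspect_wav(path: str) -> dict:
    """Report how a PCM WAV file compares with the listed specifications."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": duration,
        "needs_resample": rate != TARGET_RATE,   # would be converted to 16 kHz
        "needs_downmix": channels > 1,           # stereo would become mono
        "duration_ok": MIN_SECONDS <= duration <= MAX_SECONDS,
    }
```

MP3 and M4A inputs need a decoder library, which is outside the scope of this sketch.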
-
-### **Not Supported**
-- Video files (MP4, AVI, MOV, etc.)
-- Other audio formats (FLAC, OGG, etc.) - *may work but not guaranteed*
-- Extremely low quality or heavily distorted audio
-- Files with encryption or DRM protection
-
-### **Audio Quality Tips**
-- Use WAV format for highest accuracy
-- Ensure clear speech recording
-- Minimize background noise
-- Optimal recording: 16kHz, 16-bit, mono
-- Test with short samples first
-
-## CSV Output & Results
-
-### **Automatic CSV Generation**
-When using batch analysis (multiple files), the system automatically generates a detailed CSV file with all results.
-
-**File**: `analysis_results.csv`
-
-**Location**: Same folder as the project
-
-### **CSV Contents**
-```csv
-File,Transcription,Sentiment,Score,Satisfaction
-call1.wav,"Hello I am very satisfied with your service",POSITIVE,0.89,Satisfied
-call2.wav,"This is unacceptable I want a refund",NEGATIVE,0.92,Dissatisfied
-call3.wav,"Can you tell me about your pricing",NEUTRAL,0.65,Neutral
-```
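Once a results file with these columns exists, it can be summarized using only the standard library. The sketch below parses CSV text shaped like the sample above and counts calls per sentiment. The `summarize` function is a hypothetical helper, not part of the project, and the inline `sample` repeats the README's illustrative rows.

```python
import csv
import io
from collections import Counter


def summarize(csv_text: str) -> tuple[Counter, float]:
    """Count rows per sentiment and compute the mean score."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row["Sentiment"] for row in rows)
    mean_score = sum(float(row["Score"]) for row in rows) / len(rows) if rows else 0.0
    return counts, mean_score


# Same illustrative rows as the README's CSV example.
sample = """File,Transcription,Sentiment,Score,Satisfaction
call1.wav,"Hello I am very satisfied with your service",POSITIVE,0.89,Satisfied
call2.wav,"This is unacceptable I want a refund",NEGATIVE,0.92,Dissatisfied
call3.wav,"Can you tell me about your pricing",NEUTRAL,0.65,Neutral
"""

counts, mean_score = summarize(sample)
print(dict(counts), round(mean_score, 2))
```

To read a file on disk instead, pass the contents of `analysis_results.csv` (for example via `Path("analysis_results.csv").read_text()`) to `summarize`.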
+---
+title: Voice Sentiment Analysis
+emoji: π₯
+colorFrom: red
+colorTo: gray
+sdk: gradio
+sdk_version: 5.37.0
+app_file: app.py
+pinned: false
+license: mit
+short_description: This project is an automated solution for analyzing customer
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference