jhj0517 committed
Commit 2109221 · 1 parent: d74c8ff

Add citation in README

Files changed:
- README.md +2 -3
- modules/uvr/music_separator.py +0 -1
    	
README.md CHANGED

    @@ -25,6 +25,7 @@ If you wish to try this on Colab, you can do it in [here](https://colab.research
       - Translate subtitle files using Facebook NLLB models
       - Translate subtitle files using DeepL API
     - Pre-processing audio input with [Silero VAD](https://github.com/snakers4/silero-vad).
    +- Pre-processing audio input to separate BGM with [UVR](https://github.com/Anjok07/ultimatevocalremovergui), [UVR-api](https://github.com/NextAudioGen/ultimatevocalremover_api).
     - Post-processing with speaker diarization using the [pyannote](https://huggingface.co/pyannote/speaker-diarization-3.1) model.
        - To download the pyannote model, you need to have a Huggingface token and manually accept their terms in the pages below.
           1. https://huggingface.co/pyannote/speaker-diarization-3.1
    @@ -109,8 +110,6 @@ This is Whisper's original VRAM usage table for models.
     - [x] Integrate with faster-whisper
     - [x] Integrate with insanely-fast-whisper
     - [x] Integrate with whisperX ( Only speaker diarization part )
    -- [
    +- [x] Add background music separation pre-processing with [UVR](https://github.com/Anjok07/ultimatevocalremovergui)
     - [ ] Add fast api script
     - [ ] Support real-time transcription for microphone
    -
    -
    	
modules/uvr/music_separator.py CHANGED

    @@ -1,4 +1,3 @@
    -# Credit to Team UVR : https://github.com/Anjok07/ultimatevocalremovergui
     from typing import Optional, Union
     import numpy as np
     import torchaudio