# whisper-afrikaans-whisper_training_1756041540

This is a LoRA (Low-Rank Adaptation) adapter for `openai/whisper-large-v3`, fine-tuned on Afrikaans speech data.
## Model Details
- Language: Afrikaans (af)
- Base Model: openai/whisper-large-v3
- Training Method: LoRA (Low-Rank Adaptation)
- Training Steps: 1000
- Hardware: gpu-t4
- Training Time: N/A
- LoRA Rank: 8
- LoRA Alpha: 32
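
For reference, the adapter hyperparameters above correspond to a `peft` `LoraConfig` along these lines. This is a hedged sketch: `target_modules` and `lora_dropout` are assumptions (`q_proj`/`v_proj` are a common choice for Whisper LoRA); the adapter's own `adapter_config.json` is authoritative.

```python
from peft import LoraConfig

# Sketch of a LoRA config matching the hyperparameters listed above.
# NOTE: target_modules and lora_dropout are assumptions, not values
# read from this adapter's adapter_config.json.
lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=32,                        # LoRA alpha (scaling)
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    lora_dropout=0.05,                    # assumed dropout
    bias="none",
)
```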
## Usage

This model requires the `peft` library (`pip install peft`) to load the LoRA adapter weights:
```python
import librosa
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration
from peft import PeftModel

# Load the base model and processor
base_model_name = "openai/whisper-large-v3"
processor = WhisperProcessor.from_pretrained(base_model_name)
base_model = WhisperForConditionalGeneration.from_pretrained(base_model_name)

# Attach the LoRA adapter weights
model = PeftModel.from_pretrained(base_model, "WernL/whisper-afrikaans-whisper_training_1756041540")

# Load audio at Whisper's expected 16 kHz sampling rate
audio, sr = librosa.load("path_to_audio.wav", sr=16000)

# Extract log-mel features and transcribe
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription[0])
```
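
If you run many transcriptions, you can optionally fold the adapter into the base weights with `peft`'s `merge_and_unload()`, which removes the LoRA indirection at inference time:

```python
# Optional: merge the LoRA weights into the base model for faster inference.
# The result behaves like a plain WhisperForConditionalGeneration.
merged_model = model.merge_and_unload()
predicted_ids = merged_model.generate(input_features)
```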
### Alternative: Direct Loading (if supported)
```python
from transformers import pipeline

# This may work if the adapter repository is configured for direct loading
pipe = pipeline("automatic-speech-recognition", model="WernL/whisper-afrikaans-whisper_training_1756041540")
result = pipe("path_to_audio.wav")
print(result["text"])
```
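
If the pipeline does load the adapter, Whisper generation options such as the target language can be forwarded via `generate_kwargs` (a sketch; exact support depends on your `transformers` version):

```python
# Forward language/task hints to Whisper's generate() through the pipeline
result = pipe(
    "path_to_audio.wav",
    generate_kwargs={"language": "af", "task": "transcribe"},
)
print(result["text"])
```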
## Training Configuration
- Dataset: common_voice_af_v1
- Batch Size: 16
- Learning Rate: 1e-05
- Max Steps: 1000
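
For context, these values map onto `transformers` training arguments roughly as sketched below. Everything not listed on this card (`output_dir`, `warmup_steps`, `fp16`) is an illustrative placeholder, not the actual training setup.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch: hyperparameters from the card; other fields are placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-afrikaans-lora",  # placeholder
    per_device_train_batch_size=16,
    learning_rate=1e-5,
    max_steps=1000,
    warmup_steps=50,                        # placeholder
    fp16=True,                              # placeholder
)
```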
## Performance

Final training metrics:
- WER (word error rate): 0.089
- Loss: 0.177
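
To sanity-check WER on your own audio, the `evaluate` library computes it from paired transcripts. A minimal sketch (the example strings are placeholders, not drawn from the training set):

```python
import evaluate

# WER = (substitutions + insertions + deletions) / reference word count
wer_metric = evaluate.load("wer")
wer = wer_metric.compute(
    predictions=["dit is n toets"],    # model output (placeholder)
    references=["dit is 'n toets"],    # ground truth (placeholder)
)
print(f"WER: {wer:.3f}")
```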
This model was trained using the Whisper Training App.