Arabic Wav2Vec2 Model (SafeTensor Format)
This model is a SafeTensor conversion of kmfoda/wav2vec2-large-xlsr-arabic.
Model Description
The underlying model is Facebook's Wav2Vec2 large model fine-tuned on Arabic speech data. This repository contains the same model converted to SafeTensor format, which loads faster and, unlike pickle-based checkpoints, does not execute arbitrary code when loaded.
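To illustrate why the format is considered safer, here is a minimal sketch of the on-disk layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The helper names and the tensor name `toy.weight` are hypothetical and used only for illustration; in practice the `safetensors` library (used internally by `transformers`) handles this.

```python
import json
import struct


def build_toy_safetensors(name, values):
    """Build a tiny in-memory file in the safetensors layout (illustration only)."""
    data = struct.pack(f"<{len(values)}f", *values)  # raw little-endian float32 bytes
    header = {
        name: {
            "dtype": "F32",
            "shape": [len(values)],
            "data_offsets": [0, len(data)],  # byte range within the data section
        }
    }
    header_bytes = json.dumps(header).encode("utf-8")
    # Layout: 8-byte header length, JSON header, tensor data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def read_safetensors_header(raw):
    """Parse only the JSON header; no code from the file is ever executed."""
    (header_len,) = struct.unpack("<Q", raw[:8])
    return json.loads(raw[8:8 + header_len].decode("utf-8"))


# "toy.weight" is a hypothetical tensor name, not one from this model.
toy_file = build_toy_safetensors("toy.weight", [1.0, 2.0, 3.0])
header = read_safetensors_header(toy_file)
print(header["toy.weight"])
```

Because the header is plain JSON and the payload is raw bytes, reading a file never requires unpickling Python objects.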
Usage
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import torch
import librosa
# Load model and processor
processor = Wav2Vec2Processor.from_pretrained("ABDALLALSWAITI/wav2vec2-large-xlsr-arabic")
model = Wav2Vec2ForCTC.from_pretrained("ABDALLALSWAITI/wav2vec2-large-xlsr-arabic")
# Load audio, resampled to the 16 kHz rate the model expects
audio_input, sample_rate = librosa.load("path_to_audio.wav", sr=16000)
# Process audio
input_values = processor(audio_input, sampling_rate=16000, return_tensors="pt").input_values
# Get predictions
with torch.no_grad():
    logits = model(input_values).logits
# Decode predictions
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.decode(predicted_ids[0])
print(transcription)
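The decode step above relies on greedy CTC decoding: take the most likely token per audio frame, collapse consecutive repeats, then drop the blank token. The following is a minimal pure-Python sketch of that idea, not the processor's actual implementation. The four-entry vocabulary, the blank id of 0, and the function name `greedy_ctc_decode` are assumptions made for illustration; the real model has its own vocabulary.

```python
def greedy_ctc_decode(frame_scores, id_to_char, blank_id=0):
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    chars = []
    previous = None
    for row in frame_scores:
        # Index of the highest score in this frame.
        token_id = max(range(len(row)), key=row.__getitem__)
        # Emit only when the token changes and is not the blank.
        if token_id != previous and token_id != blank_id:
            chars.append(id_to_char[token_id])
        previous = token_id
    return "".join(chars)


# Hypothetical toy vocabulary; id 0 plays the role of the CTC blank.
toy_vocab = {0: "<pad>", 1: "a", 2: "b", 3: " "}

# One row per audio frame, one column per vocabulary entry.
toy_scores = [
    [0.10, 0.80, 0.05, 0.05],  # "a"
    [0.10, 0.80, 0.05, 0.05],  # "a" again, collapsed into the previous one
    [0.90, 0.05, 0.03, 0.02],  # blank, separates the repeated letters
    [0.10, 0.80, 0.05, 0.05],  # "a", kept because a blank came before it
    [0.10, 0.10, 0.70, 0.10],  # "b"
]

print(greedy_ctc_decode(toy_scores, toy_vocab))
```

The blank between the second and third "a" frames is what lets CTC output a doubled letter; without it, the repeats would collapse into one.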
Training Data
Please refer to the original model kmfoda/wav2vec2-large-xlsr-arabic for training data information.
Limitations and Bias
Please refer to the original model for information about limitations and potential biases.
Original Model
- Original Model: kmfoda/wav2vec2-large-xlsr-arabic
- Conversion: Converted to SafeTensor format for faster loading and safer deserialization