Arabic Wav2Vec2 Model (Safetensors Format)

This model is a safetensors conversion of kmfoda/wav2vec2-large-xlsr-arabic.

Model Description

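This is a fine-tuned version of Facebook's Wav2Vec2 large XLSR model, trained on Arabic speech data for automatic speech recognition. The weights have been converted to the safetensors format, which loads quickly and, unlike pickle-based PyTorch checkpoints, cannot execute arbitrary code when loaded.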
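To illustrate why the format is considered safe to load, here is a minimal sketch that reads the header of a safetensors file using only the Python standard library. It relies on the published layout: an 8-byte little-endian header length, a JSON header describing each tensor, then raw tensor bytes. The helper name read_safetensors_header, the tensor name toy.weight, and the toy file are assumptions made for illustration; they are not part of this repository.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


# Build a toy file by hand (hypothetical tensor name) to show the layout.
payload = struct.pack("<2f", 1.0, 2.0)  # two float32 values, 8 bytes
toy_header = {
    "toy.weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(payload)]}
}
header_bytes = json.dumps(toy_header).encode("utf-8")

toy_path = Path(tempfile.mkdtemp()) / "toy.safetensors"
toy_path.write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)

header = read_safetensors_header(toy_path)
for name, info in header.items():
    if name != "__metadata__":  # optional free-form metadata entry
        print(name, info["dtype"], info["shape"])
```

Pointing the same helper at a downloaded model.safetensors file should list each tensor's name, dtype, and shape without executing any code stored in the file.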

Usage

from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import torch
import librosa

# Load model and processor
processor = Wav2Vec2Processor.from_pretrained("ABDALLALSWAITI/wav2vec2-large-xlsr-arabic")
model = Wav2Vec2ForCTC.from_pretrained("ABDALLALSWAITI/wav2vec2-large-xlsr-arabic")

# Load audio as mono and resample it to the 16 kHz rate the model expects
audio_input, sample_rate = librosa.load("path_to_audio.wav", sr=16000)

# Convert the waveform into model input tensors
input_values = processor(audio_input, sampling_rate=sample_rate, return_tensors="pt").input_values

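# Run inference without tracking gradients
with torch.no_grad():
    logits = model(input_values).logits

# Greedy CTC decoding: pick the most likely token per frame, then let the
# processor collapse repeated tokens and remove blanks
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.decode(predicted_ids[0])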

print(transcription)
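
The decoding step above is greedy CTC decoding. The following is a rough, dependency-free sketch of what that step does conceptually: collapse consecutive repeated ids, drop the blank token, and map the remaining ids to characters. The function greedy_ctc_decode, the toy vocabulary, and its ids are hypothetical; the real vocabulary lives in the processor's tokenizer, and this sketch assumes the usual Wav2Vec2 convention of <pad> as the blank and | as the word delimiter.

```python
def greedy_ctc_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse repeated ids, drop blanks, and map the remaining ids to text."""
    tokens = []
    previous = None
    for idx in frame_ids:
        # Emit a token only when it differs from the previous frame and is not blank.
        if idx != previous and idx != blank_id:
            tokens.append(id_to_token[idx])
        previous = idx
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary (hypothetical ids); id 0 plays the role of the CTC blank.
toy_vocab = {0: "<pad>", 1: "|", 2: "س", 3: "ل", 4: "ا", 5: "م"}
# Per-frame argmax ids, standing in for torch.argmax(logits, dim=-1) on one utterance.
toy_frame_ids = [0, 2, 2, 0, 3, 4, 4, 0, 5, 5, 1, 0]
decoded = greedy_ctc_decode(toy_frame_ids, toy_vocab)
print(decoded)
```

A blank between two identical ids keeps both characters, which is how CTC represents genuinely doubled letters.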

Training Data

Please refer to the original model kmfoda/wav2vec2-large-xlsr-arabic for training data information.

Limitations and Bias

Please refer to the original model for information about limitations and potential biases.

Original Model

kmfoda/wav2vec2-large-xlsr-arabic
