Training Details
Training Hyperparameters
```yaml
defaults:
  - _self_

dataset:
  name: "StofEzz/dataset_c_voice0.2"
  audio_sampling_rate: 16000
  num_proc_preprocessing: 4
  num_proc_dataset_map: 2
  train: 80
  test: 20

model:
  name: "openai/whisper-tiny"
  language: "french"
  task: "transcribe"

text_preprocessing:
  chars_to_ignore_regex: "[\\,\\?\\.\\!\\-\\;\\:\\ğ\\ź\\…\\ø\\ắ\\î\\´\\ŏ\\ę\\ź\\&\\'\\v\\ï\\ū\\ė\\ō\\ń\\ø\\…\\σ\\$\\ă\\ß\\ž\\ṯ\\ý\\ℵ\\đ\\ł\\ś\\ň\\ạ\\=\\_\\»\\ċ\\の\\\"\\ぬ\\ễ\\ż\\ć\\ů\\ʿ\\ș\\ı\\ñ\\(\\ò\\ř\\ä\\–\\ş\\«\\š\\ጠ\\°\\ℤ\\~\\\"\\ī\\ț\\č\\ả\\—\\)\\ā\\/\\½\"]"

training_args:
  _target_: transformers.Seq2SeqTrainingArguments
  output_dir: ./models
  per_device_train_batch_size: 16
  gradient_accumulation_steps: 1
  learning_rate: 1e-5
  warmup_steps: 500
  max_steps: 6250
  gradient_checkpointing: true
  fp16: true
  evaluation_strategy: "steps"
  per_device_eval_batch_size: 8
  predict_with_generate: true
  generation_max_length: 225
  save_steps: 2000
  eval_steps: 100
  logging_steps: 25
  load_best_model_at_end: true
  metric_for_best_model: "wer"
  greater_is_better: false
  push_to_hub: false
```
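The `text_preprocessing.chars_to_ignore_regex` entry above implies a transcript-normalization step before tokenization. A minimal sketch of that step is below; the `normalize_transcript` helper is illustrative (not from the training code), and a shortened character class stands in for the full pattern in the config, which works the same way.

```python
import re

# Shortened stand-in for the chars_to_ignore_regex from the config above;
# the full character class behaves identically.
chars_to_ignore_regex = r'[,?.!;:"-]'

def normalize_transcript(text: str) -> str:
    """Strip ignored punctuation and lowercase, as is common for Whisper fine-tuning."""
    return re.sub(chars_to_ignore_regex, "", text).lower()

print(normalize_transcript("Bonjour, le monde!"))  # bonjour le monde
```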
Metrics
WER: 0.46 (word error rate; lower is better)
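For context, WER is the word-level edit distance between reference and hypothesis divided by the number of reference words, WER = (S + D + I) / N. A minimal self-contained implementation is sketched below; in practice the `evaluate` or `jiwer` packages are typically used instead.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance (dynamic programming)."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("le chat dort", "le chien dort"))  # one substitution in three words: ~0.33
```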
Model tree for smeoni/whisper-tiny-fr
Base model: openai/whisper-tiny