metadata
library_name: peft
datasets:
  - mozilla-foundation/common_voice_17_0
language:
  - bn
base_model:
  - openai/whisper-base
license: apache-2.0
metrics:
  - wer
pipeline_tag: automatic-speech-recognition
model-index:
  - name: Whisper Base Bn LoRA Adapter - BanglaBridge
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 17.0
          type: mozilla-foundation/common_voice_17_0
          config: bn
          split: test
          args: 'config: bn, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 22.56397

Whisper Base Bn LoRA Adapter - by BanglaBridge

This model is a LoRA adapter for openai/whisper-base, trained with PEFT on the Bengali (bn) configuration of the Common Voice 17.0 dataset. It achieves the following results on the test split:

  • WER: 44.93734
  • Normalized WER: 22.56397
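
The two figures differ presumably because the normalized score is computed after cleaning both the reference and the predicted text. The card does not say which normalizer was used, so the following is only a minimal sketch: `normalize` is a hypothetical stand-in that lowercases and strips punctuation, and `wer` is a plain word-level edit distance divided by the number of reference words.

```python
import unicodedata


def normalize(text: str) -> str:
    """Hypothetical normalizer: lowercase, replace punctuation with spaces,
    and collapse whitespace. The normalizer actually used for this card
    is not stated and may differ."""
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return " ".join(cleaned.lower().split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "আমি ভাত খাই।"   # ends with the danda punctuation mark
hypothesis = "আমি ভাত খাই"

raw_score = wer(reference, hypothesis)
normalized_score = wer(normalize(reference), normalize(hypothesis))
print(f"WER: {raw_score:.3f}, normalized WER: {normalized_score:.3f}")
```

In this toy pair the only mismatch is the trailing danda, so the raw score counts one substituted word out of three while the normalized score is zero.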

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-03
  • train_batch_size: 32
  • eval_batch_size: 32
  • warmup_steps: 500
  • training_steps: 20000
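
The card lists the peak learning rate and warmup length but not the scheduler. As a rough sketch, assuming the linear warmup followed by linear decay that the Transformers `Trainer` uses by default, the learning rate at a given step would look like this. The function name `lr_at` is hypothetical, and the sample count assumes a single device with no gradient accumulation, neither of which the card states.

```python
LEARNING_RATE = 1e-3
WARMUP_STEPS = 500
TRAINING_STEPS = 20_000
TRAIN_BATCH_SIZE = 32


def lr_at(step: int) -> float:
    """Learning rate at `step` under linear warmup and linear decay.

    Assumption: the scheduler type is not given in the card; this mirrors
    the default linear schedule rather than a confirmed setting.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


# Assumes one device and no gradient accumulation.
samples_seen = TRAINING_STEPS * TRAIN_BATCH_SIZE

for step in (0, 250, 500, 10_250, 20_000):
    print(f"step {step:>6}: lr = {lr_at(step):.6f}")
print(f"samples seen: {samples_seen}")
```

Under these assumptions the rate peaks at 1e-03 at step 500 and reaches zero at step 20000.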

LoraConfig:

  • r: 32
  • lora_alpha: 64
  • target_modules: ["q_proj", "v_proj"]
  • lora_dropout: 0.005
  • bias: none
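
To give a sense of the adapter's size, the arithmetic below estimates the number of trainable LoRA parameters. The architecture figures (hidden size 512, 6 encoder layers, 6 decoder layers) are taken from the published openai/whisper-base configuration, not from this card, and the count assumes that `q_proj` and `v_proj` are matched in every attention block, including decoder cross-attention.

```python
# Architecture of openai/whisper-base (assumed from its public config).
D_MODEL = 512
ENCODER_LAYERS = 6
DECODER_LAYERS = 6

# LoRA settings listed in the card.
R = 32
LORA_ALPHA = 64
TARGET_MODULES = ["q_proj", "v_proj"]

# Each encoder layer has one attention block (self-attention); each decoder
# layer has two (self-attention and cross-attention).
attention_blocks = ENCODER_LAYERS * 1 + DECODER_LAYERS * 2

# Every targeted projection is a D_MODEL x D_MODEL linear layer.
adapted_layers = attention_blocks * len(TARGET_MODULES)

# LoRA adds A (R x in_features) and B (out_features x R) per adapted layer.
# With bias set to "none", no bias terms are trained.
params_per_layer = R * (D_MODEL + D_MODEL)
trainable_params = adapted_layers * params_per_layer

# The low-rank update is scaled by lora_alpha / r.
scaling = LORA_ALPHA / R

print(f"adapted layers:   {adapted_layers}")
print(f"trainable params: {trainable_params:,}")
print(f"scaling factor:   {scaling}")
```

With these numbers the adapter trains roughly 1.18 million parameters, a small fraction of the base model's weights.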

Training results

| Step  | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1000  | 0.240200      | 0.251211        |
| 2000  | 0.178700      | 0.210411        |
| 3000  | 0.150000      | 0.193197        |
| 4000  | 0.122500      | 0.184060        |
| 5000  | 0.122300      | 0.177079        |
| 6000  | 0.097100      | 0.181073        |
| 7000  | 0.095800      | 0.175566        |
| 8000  | 0.071400      | 0.173997        |
| 9000  | 0.082600      | 0.175677        |
| 10000 | 0.064400      | 0.178262        |
| 11000 | 0.064700      | 0.177943        |
| 12000 | 0.046900      | 0.185763        |
| 13000 | 0.047200      | 0.186843        |
| 14000 | 0.037500      | 0.193575        |
| 15000 | 0.036000      | 0.199084        |
| 16000 | 0.027500      | 0.208745        |
| 17000 | 0.025200      | 0.215685        |
| 18000 | 0.017400      | 0.227938        |
| 19000 | 0.016500      | 0.236160        |
| 20000 | 0.013000      | 0.240447        |
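
The logged losses can be scanned for the checkpoint with the lowest validation loss. The snippet below is a small sketch over the values reported above. The card does not say which checkpoint the published adapter corresponds to, so this only describes the log.

```python
# (step, training loss, validation loss) as reported in the training results.
LOSS_LOG = [
    (1000, 0.240200, 0.251211), (2000, 0.178700, 0.210411),
    (3000, 0.150000, 0.193197), (4000, 0.122500, 0.184060),
    (5000, 0.122300, 0.177079), (6000, 0.097100, 0.181073),
    (7000, 0.095800, 0.175566), (8000, 0.071400, 0.173997),
    (9000, 0.082600, 0.175677), (10000, 0.064400, 0.178262),
    (11000, 0.064700, 0.177943), (12000, 0.046900, 0.185763),
    (13000, 0.047200, 0.186843), (14000, 0.037500, 0.193575),
    (15000, 0.036000, 0.199084), (16000, 0.027500, 0.208745),
    (17000, 0.025200, 0.215685), (18000, 0.017400, 0.227938),
    (19000, 0.016500, 0.236160), (20000, 0.013000, 0.240447),
]

# Checkpoint with the lowest validation loss.
best_step, best_train_loss, best_val_loss = min(LOSS_LOG, key=lambda row: row[2])

# Gap between validation and training loss at the final step.
final_step, final_train_loss, final_val_loss = LOSS_LOG[-1]
final_gap = final_val_loss - final_train_loss

print(f"lowest validation loss: {best_val_loss} at step {best_step}")
print(f"final gap (validation - training): {final_gap:.6f}")
```

Validation loss bottoms out at step 8000 and climbs afterwards while training loss keeps falling, a pattern that is usually read as overfitting in the later steps.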

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.19.1
  • PEFT 0.10.0