fznx92/openai-whisper-large-v2-ja-transcribe-colab


fznx92 committed on Dec 30, 2023

Commit 179a858 · 1 Parent(s): 2fc3736

Update README.md

Files changed (1)

README.md +36 -2

README.md CHANGED

@@ -1,6 +1,12 @@
 ---
 library_name: peft
 base_model: openai/whisper-large-v2
+datasets:
+- mozilla-foundation/common_voice_16_0
+language:
+- ja
+metrics:
+- wer
 ---
 # Model Card for Model ID
@@ -25,10 +31,38 @@ openai-whisper-large-v2-LORA-ja
 ## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
+import torch
+from transformers import (
+    AutomaticSpeechRecognitionPipeline,
+    WhisperForConditionalGeneration,
+    WhisperTokenizer,
+    WhisperProcessor,
+)
+from peft import PeftModel, PeftConfig
+peft_model_id = "fznx92/openai-whisper-large-v2-ja-transcribe-colab"
+sample = "insert mp3 file location here"
+language = "japanese"
+task = "transcribe"
+peft_config = PeftConfig.from_pretrained(peft_model_id)
+model = WhisperForConditionalGeneration.from_pretrained(
+    peft_config.base_model_name_or_path,
+)
+model = PeftModel.from_pretrained(model, peft_model_id)
+model.to("cuda").half()
+processor = WhisperProcessor.from_pretrained(peft_config.base_model_name_or_path, language=language, task=task)
+pipe = AutomaticSpeechRecognitionPipeline(model=model, tokenizer=processor.tokenizer, feature_extractor=processor.feature_extractor, batch_size=8, torch_dtype=torch.float16, device="cuda:0")
+def transcribe(audio, return_timestamps=False):
+    text = pipe(audio, chunk_length_s=30, return_timestamps=return_timestamps, generate_kwargs={"language": language, "task": task})["text"]
+    return text
+transcript = transcribe(sample)
+print(transcript)
 ### Training Data
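
The metadata added in this commit lists `wer` as the metric, but the diff does not show how it was computed. As a rough illustration only, the sketch below computes an edit-distance-based error rate in plain Python. The function names `edit_distance` and `error_rate` and the `unit` parameter are hypothetical and not part of this repository. Because Japanese text is usually written without spaces, a whitespace-based word split may treat a whole sentence as one token, so the sketch also offers a character-level mode; whether the author tokenized words or characters is not stated in the commit.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Edit distance divided by reference length (hypothetical helper).

    unit="word" splits on whitespace (classic WER);
    unit="char" compares non-space characters (an assumption for Japanese).
    """
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    elif unit == "char":
        ref = [c for c in reference if not c.isspace()]
        hyp = [c for c in hypothesis if not c.isspace()]
    else:
        raise ValueError("unit must be 'word' or 'char'")
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

A transcript returned by `transcribe(sample)` could be passed as `hypothesis` alongside a known reference string, for example `error_rate(reference_text, transcript, unit="char")`.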