Purfview committed (verified)
Commit d5d2dfb · Parent: 63662a7

Upload 6 files

Files changed (6)
  1. README.md +79 -0
  2. config.json +280 -0
  3. model.bin +3 -0
  4. preprocessor_config.json +14 -0
  5. tokenizer.json +0 -0
  6. vocabulary.json +0 -0
README.md ADDED
@@ -0,0 +1,79 @@
+ ---
+ language:
+ - en
+ tags:
+ - audio
+ - automatic-speech-recognition
+ license: mit
+ library_name: ctranslate2
+ ---
+
+ # Distil-Whisper: Distil-Large-v3.5 for CTranslate2
+
+ This repository contains the model weights for [distil-large-v3.5](https://huggingface.co/distil-whisper/distil-large-v3.5)
+ converted to [CTranslate2](https://github.com/OpenNMT/CTranslate2) format. CTranslate2 is a fast inference engine for
+ Transformer models and is the supported backend for the [Faster-Whisper](https://github.com/systran/faster-whisper) package.
+
+ ## Usage
+
+ To use the model in Faster-Whisper, first install the PyPI package according to the [official instructions](https://github.com/SYSTRAN/faster-whisper#installation).
+
+ For this example, we'll also install 🤗 Datasets to load a toy audio dataset from the Hugging Face Hub:
+
+ ```bash
+ pip install --upgrade pip
+ pip install --upgrade git+https://github.com/SYSTRAN/faster-whisper datasets[audio]
+ ```
+
+ The following code snippet loads the distil-large-v3.5 model and runs inference on an example file from the LibriSpeech ASR
+ dataset:
+
+ ```python
+ import torch
+ from faster_whisper import WhisperModel
+ from datasets import load_dataset
+
+ # define our torch configuration
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ compute_type = "float16" if torch.cuda.is_available() else "float32"
+
+ # load model on GPU if available, else CPU
+ model = WhisperModel("distil-whisper/distil-large-v3.5-ct2", device=device, compute_type=compute_type)
+
+ # load toy dataset for example
+ dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
+ sample = dataset[1]["audio"]["path"]
+
+ segments, info = model.transcribe(sample, beam_size=5, language="en")
+
+ for segment in segments:
+     print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
+ ```
+
+ To transcribe a local audio file, simply pass the path to the audio file as the `audio` argument to `transcribe`:
+
+ ```python
+ segments, info = model.transcribe("audio.mp3", beam_size=5, language="en")
+ ```
+
+ ## Model Details
+
+ For more information about the Distil-Large-v3.5 model, refer to the original [model card](https://huggingface.co/distil-whisper/distil-large-v3.5).
+
+ ## License
+
+ Distil-Whisper inherits the [MIT license](https://github.com/huggingface/distil-whisper/blob/main/LICENSE) from OpenAI's Whisper model.
+
+ ## Citation
+
+ If you use this model, please consider citing the [Distil-Whisper paper](https://arxiv.org/abs/2311.00430):
+ ```
+ @misc{gandhi2023distilwhisper,
+       title={Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling},
+       author={Sanchit Gandhi and Patrick von Platen and Alexander M. Rush},
+       year={2023},
+       eprint={2311.00430},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL}
+ }
+ ```
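Beyond plain transcription, Faster-Whisper can return word-level timestamps, which make use of the `alignment_heads` listed in the config.json below. A minimal sketch, reusing the `model` object loaded in the README snippet above (`word_timestamps` is a standard Faster-Whisper option):

```python
# Minimal sketch: word-level timestamps with faster-whisper.
# Assumes `model` is the WhisperModel loaded as in the README example above.
segments, info = model.transcribe("audio.mp3", beam_size=5, language="en", word_timestamps=True)

for segment in segments:
    for word in segment.words:
        print("[%.2fs -> %.2fs]%s" % (word.start, word.end, word.word))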
config.json ADDED
@@ -0,0 +1,280 @@
+ {
+   "alignment_heads": [
+     [
+       1,
+       0
+     ],
+     [
+       1,
+       1
+     ],
+     [
+       1,
+       2
+     ],
+     [
+       1,
+       3
+     ],
+     [
+       1,
+       4
+     ],
+     [
+       1,
+       5
+     ],
+     [
+       1,
+       6
+     ],
+     [
+       1,
+       7
+     ],
+     [
+       1,
+       8
+     ],
+     [
+       1,
+       9
+     ],
+     [
+       1,
+       10
+     ],
+     [
+       1,
+       11
+     ],
+     [
+       1,
+       12
+     ],
+     [
+       1,
+       13
+     ],
+     [
+       1,
+       14
+     ],
+     [
+       1,
+       15
+     ],
+     [
+       1,
+       16
+     ],
+     [
+       1,
+       17
+     ],
+     [
+       1,
+       18
+     ],
+     [
+       1,
+       19
+     ]
+   ],
+   "lang_ids": [
+     50259,
+     50260,
+     50261,
+     50262,
+     50263,
+     50264,
+     50265,
+     50266,
+     50267,
+     50268,
+     50269,
+     50270,
+     50271,
+     50272,
+     50273,
+     50274,
+     50275,
+     50276,
+     50277,
+     50278,
+     50279,
+     50280,
+     50281,
+     50282,
+     50283,
+     50284,
+     50285,
+     50286,
+     50287,
+     50288,
+     50289,
+     50290,
+     50291,
+     50292,
+     50293,
+     50294,
+     50295,
+     50296,
+     50297,
+     50298,
+     50299,
+     50300,
+     50301,
+     50302,
+     50303,
+     50304,
+     50305,
+     50306,
+     50307,
+     50308,
+     50309,
+     50310,
+     50311,
+     50312,
+     50313,
+     50314,
+     50315,
+     50316,
+     50317,
+     50318,
+     50319,
+     50320,
+     50321,
+     50322,
+     50323,
+     50324,
+     50325,
+     50326,
+     50327,
+     50328,
+     50329,
+     50330,
+     50331,
+     50332,
+     50333,
+     50334,
+     50335,
+     50336,
+     50337,
+     50338,
+     50339,
+     50340,
+     50341,
+     50342,
+     50343,
+     50344,
+     50345,
+     50346,
+     50347,
+     50348,
+     50349,
+     50350,
+     50351,
+     50352,
+     50353,
+     50354,
+     50355,
+     50356,
+     50357,
+     50358
+   ],
+   "suppress_ids": [
+     1,
+     2,
+     7,
+     8,
+     9,
+     10,
+     14,
+     25,
+     26,
+     27,
+     28,
+     29,
+     31,
+     58,
+     59,
+     60,
+     61,
+     62,
+     63,
+     90,
+     91,
+     92,
+     93,
+     359,
+     503,
+     522,
+     542,
+     873,
+     893,
+     902,
+     918,
+     922,
+     931,
+     1350,
+     1853,
+     1982,
+     2460,
+     2627,
+     3246,
+     3253,
+     3268,
+     3536,
+     3846,
+     3961,
+     4183,
+     4667,
+     6585,
+     6647,
+     7273,
+     9061,
+     9383,
+     10428,
+     10929,
+     11938,
+     12033,
+     12331,
+     12562,
+     13793,
+     14157,
+     14635,
+     15265,
+     15618,
+     16553,
+     16604,
+     18362,
+     18956,
+     20075,
+     21675,
+     22520,
+     26130,
+     26161,
+     26435,
+     28279,
+     29464,
+     31650,
+     32302,
+     32470,
+     36865,
+     42863,
+     47425,
+     49870,
+     50254,
+     50258,
+     50359,
+     50360,
+     50361,
+     50362,
+     50363
+   ],
+   "suppress_ids_begin": [
+     220,
+     50257
+   ]
+ }
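A quick sanity check on the structure above: the 20 `alignment_heads` are `[layer, head]` pairs that all point at decoder layer 1 (the final layer of the distilled two-layer decoder), `lang_ids` spans Whisper's 100 language tokens (50259-50358), and `suppress_ids` / `suppress_ids_begin` are the tokens blocked at every decoding step versus only at the first step. A minimal sketch, assuming `config.json` has been downloaded to the working directory:

```python
import json

# Minimal sketch: inspect the converted model's config.json.
with open("config.json") as f:
    cfg = json.load(f)

# All alignment heads sit in decoder layer 1 (heads 0-19).
print({layer for layer, head in cfg["alignment_heads"]})              # {1}

# 100 language tokens, one per Whisper language.
print(len(cfg["lang_ids"]), cfg["lang_ids"][0], cfg["lang_ids"][-1])  # 100 50259 50358

# Tokens suppressed at every step vs. only at the start of decoding.
print(len(cfg["suppress_ids"]), cfg["suppress_ids_begin"])            # 88 [220, 50257]
```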
model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c58b88b8585ffcd2135fddaaf421ce72cb223b32edea70d156aed1dea319a119
+ size 1512927867
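Note that the commit stores a Git LFS pointer here, not the weights themselves: the `oid` is the SHA-256 of the real file and `size` says it is about 1.5 GB (1,512,927,867 bytes). Faster-Whisper resolves this automatically when given a repo id, but as a minimal sketch you can also fetch the files yourself with `huggingface_hub` (the repo id below is the one used in the README example above; adjust it if this repository is hosted under a different name):

```python
from huggingface_hub import snapshot_download

# Minimal sketch: download the converted model, resolving the LFS
# pointer above into the real ~1.5 GB model.bin.
local_dir = snapshot_download("distil-whisper/distil-large-v3.5-ct2")
print(local_dir)  # directory containing model.bin, config.json, tokenizer.json, ...
```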
preprocessor_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "chunk_length": 30,
+   "feature_extractor_type": "WhisperFeatureExtractor",
+   "feature_size": 128,
+   "hop_length": 160,
+   "n_fft": 400,
+   "n_samples": 480000,
+   "nb_max_frames": 3000,
+   "padding_side": "right",
+   "padding_value": 0.0,
+   "processor_class": "WhisperProcessor",
+   "return_attention_mask": false,
+   "sampling_rate": 16000
+ }
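The derived values here follow directly from the others: 30-second chunks at 16 kHz give `n_samples = 480000`, and with a 160-sample hop (10 ms) that yields `nb_max_frames = 3000` mel frames per chunk, each with 128 mel bins per `feature_size`. A minimal sketch of the arithmetic:

```python
# Minimal sketch: the derived preprocessor values are consistent.
chunk_length = 30        # seconds of audio per chunk
sampling_rate = 16000    # Hz
hop_length = 160         # samples between STFT frames (10 ms)

n_samples = chunk_length * sampling_rate   # samples per chunk
nb_max_frames = n_samples // hop_length    # mel frames per chunk

assert n_samples == 480000
assert nb_max_frames == 3000
```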
tokenizer.json ADDED
The diff for this file is too large to render.

vocabulary.json ADDED
The diff for this file is too large to render.