---
license: apache-2.0
quantized_by: Pomni
language:
- en
- zh
- de
- es
- ru
- ko
- fr
- ja
- pt
- tr
- pl
- ca
- nl
- ar
- sv
- it
- id
- hi
- fi
- vi
- he
- uk
- el
- ms
- cs
- ro
- da
- hu
- ta
- 'no'
- th
- ur
- hr
- bg
- lt
- la
- mi
- ml
- cy
- sk
- te
- fa
- lv
- bn
- sr
- az
- sl
- kn
- et
- mk
- br
- eu
- is
- hy
- ne
- mn
- bs
- kk
- sq
- sw
- gl
- mr
- pa
- si
- km
- sn
- yo
- so
- af
- oc
- ka
- be
- tg
- sd
- gu
- am
- yi
- lo
- uz
- fo
- ht
- ps
- tk
- nn
- mt
- sa
- lb
- my
- bo
- tl
- mg
- as
- tt
- haw
- ln
- ha
- ba
- jw
- su
base_model:
- openai/whisper-large-v2
pipeline_tag: automatic-speech-recognition
tags:
- whisper.cpp
- ggml
- whisper
- audio
- speech
- voice
new_version: Pomni/whisper-large-v3-ggml-allquants
---
# Whisper-Large-v2 quants
This is a repository of **GGML quants for [whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)**, for use with [whisper.cpp](https://github.com/ggml-org/whisper.cpp).
If you are looking for a program to run this model with, then I would recommend [EasyWhisper UI](https://github.com/mehtabmahir/easy-whisper-ui), as it is user-friendly, has a GUI, and will automate a lot of the hard stuff for you.
## List of Quants
Clicking on a link will download the corresponding quant instantly.
| Link | Quant | Size | Notes |
|:-----|:-----|--------:|:------|
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-f32.bin) | F32 | 6.17 GB | Likely overkill. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-f16.bin) | F16 | 3.09 GB | Performs better than Q8_0 for noisy audio and music. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q8_0.bin) | Q8_0 | 1.66 GB | Sweet spot; minor quality loss at nearly double the speed. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q6_k.bin) | Q6_K | 1.28 GB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q5_k.bin) | Q5_K | 1.08 GB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q5_1.bin) | Q5_1 | 1.18 GB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q5_0.bin) | Q5_0 | 1.08 GB | Last "good" quant; anything below loses quality rapidly. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q4_k.bin) | Q4_K | 889 MB | *Might* not have lost too much quality, but I'm not sure. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q4_1.bin) | Q4_1 | 985 MB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q4_0.bin) | Q4_0 | 889 MB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q3_k.bin) | Q3_K | 685 MB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q2_k.bin) | Q2_K | 529 MB | Completely nonsensical outputs. |
The F16 quant was taken from [ggerganov/whisper.cpp/ggml-large-v2.bin](https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-large-v2.bin).
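If you would rather script the download than click a link, the sketch below builds the same URLs that appear in the table. The helper names `quant_url` and `download_quant` and the default `models` directory are my own illustrative choices, not part of whisper.cpp or this repository.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Repository and quant names copied from the table above.
REPO = "Pomni/whisper-large-v2-ggml-allquants"
QUANTS = ("f32", "f16", "q8_0", "q6_k", "q5_k", "q5_1",
          "q5_0", "q4_k", "q4_1", "q4_0", "q3_k", "q2_k")


def quant_url(quant: str) -> str:
    """Return the direct download URL for a quant such as "q8_0" (hypothetical helper)."""
    name = quant.lower()
    if name not in QUANTS:
        raise ValueError(f"unknown quant {quant!r}; expected one of {QUANTS}")
    return f"https://huggingface.co/{REPO}/resolve/main/ggml-large-v2-{name}.bin"


def download_quant(quant: str, dest_dir: str = "models") -> Path:
    """Download one quant into dest_dir and return the local file path (hypothetical helper)."""
    url = quant_url(quant)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / url.rsplit("/", 1)[-1]
    urlretrieve(url, target)  # large files: this can take a while
    return target
```

Calling `download_quant("q8_0")` should save `ggml-large-v2-q8_0.bin` under `models/`, assuming the files keep the names listed in the table.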
## Questions you may have
### Why do the "K-quants" not work for me?
My guess is that your GPU is too old to support them; I get the same error on my GTX 1080. If you would like to run them anyway, try switching to CPU inference.
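As a rough sketch of CPU inference, the snippet below assembles a whisper.cpp command line with the GPU disabled. It assumes your build ships the `whisper-cli` binary and accepts the `--no-gpu` flag (check `whisper-cli --help` to confirm); the function names and file paths are placeholders of mine.

```python
import subprocess


def cpu_command(model_path: str, audio_path: str, binary: str = "whisper-cli") -> list[str]:
    """Build a whisper.cpp command that skips the GPU (binary name and flag are assumptions)."""
    return [binary, "-m", model_path, "-f", audio_path, "--no-gpu"]


def transcribe_on_cpu(model_path: str, audio_path: str, binary: str = "whisper-cli") -> None:
    """Run the command above and raise if whisper.cpp exits with an error."""
    subprocess.run(cpu_command(model_path, audio_path, binary), check=True)
```

For example, `transcribe_on_cpu("models/ggml-large-v2-q5_k.bin", "audio.wav")` would run a K-quant on the CPU, which is slower but avoids the GPU code path entirely.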
### Are the K-quants "S", "M", or "L"?
The quantizer I used did not specify this, so I do not know either.
### What program did you use to make these quants?
I used [whisper.cpp v1.7.6](https://github.com/ggml-org/whisper.cpp/releases/tag/v1.7.6) on Windows x64 with CUDA 12.4.0. For the F32 quant, I converted the original Hugging Face (H5) format model to GGML using the `models/convert-h5-to-ggml.py` script.
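If you want to reproduce something similar, the sketch below assembles the two commands involved. It assumes the conversion script takes the arguments `dir_model path-to-whisper-repo dir-output [use-f32]` and that your whisper.cpp build includes a `quantize` tool invoked as `quantize <input> <output> <type>`; all paths and function names are placeholders, and this is not necessarily the exact procedure used for this repository.

```python
def convert_command(hf_model_dir: str, whisper_repo_dir: str, out_dir: str,
                    use_f32: bool = True) -> list[str]:
    """Build the H5-to-GGML conversion command (argument order is an assumption).

    whisper_repo_dir is a local checkout of the openai/whisper Python repository,
    which the script reads for tokenizer and mel-filter assets.
    """
    cmd = ["python", "models/convert-h5-to-ggml.py", hf_model_dir, whisper_repo_dir, out_dir]
    if use_f32:
        cmd.append("use-f32")  # any extra argument switches the output from F16 to F32
    return cmd


def quantize_command(src: str, dst: str, qtype: str, binary: str = "quantize") -> list[str]:
    """Build a quantize command, e.g. qtype="q5_0" (binary name is an assumption)."""
    return [binary, src, dst, qtype]
```

Each list can be passed to `subprocess.run(..., check=True)` from the whisper.cpp source directory once the paths point at real files.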
### One or more of the quants are not working for me.
[Open a new discussion](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/discussions/new) in the community tab about this, and I will look into the issue.