---
license: apache-2.0
quantized_by: Pomni
language:
- en
- zh
- de
- es
- ru
- ko
- fr
- ja
- pt
- tr
- pl
- ca
- nl
- ar
- sv
- it
- id
- hi
- fi
- vi
- he
- uk
- el
- ms
- cs
- ro
- da
- hu
- ta
- 'no'
- th
- ur
- hr
- bg
- lt
- la
- mi
- ml
- cy
- sk
- te
- fa
- lv
- bn
- sr
- az
- sl
- kn
- et
- mk
- br
- eu
- is
- hy
- ne
- mn
- bs
- kk
- sq
- sw
- gl
- mr
- pa
- si
- km
- sn
- yo
- so
- af
- oc
- ka
- be
- tg
- sd
- gu
- am
- yi
- lo
- uz
- fo
- ht
- ps
- tk
- nn
- mt
- sa
- lb
- my
- bo
- tl
- mg
- as
- tt
- haw
- ln
- ha
- ba
- jw
- su
base_model:
- openai/whisper-large-v2
pipeline_tag: automatic-speech-recognition
tags:
- whisper.cpp
- ggml
- whisper
- audio
- speech
- voice
new_version: Pomni/whisper-large-v3-ggml-allquants
---
# Whisper-Large-v2 quants
This is a repository of **GGML quants for [whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)**, for use with [whisper.cpp](https://github.com/ggml-org/whisper.cpp).

If you are looking for a program to run this model with, then I would recommend [EasyWhisper UI](https://github.com/mehtabmahir/easy-whisper-ui), as it is user-friendly, has a GUI, and will automate a lot of the hard stuff for you.
## List of Quants
Clicking on a link will download the corresponding quant instantly.
| Link | Quant | Size | Notes |
|:-----|:------|-----:|:------|
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-f32.bin) | F32 | 6.17 GB | Likely overkill. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-f16.bin) | F16 | 3.09 GB | Performs better than Q8_0 for noisy audio and music. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q8_0.bin) | Q8_0 | 1.66 GB | Sweet spot; superficial quality loss at nearly double the speed. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q6_k.bin) | Q6_K | 1.28 GB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q5_k.bin) | Q5_K | 1.08 GB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q5_1.bin) | Q5_1 | 1.18 GB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q5_0.bin) | Q5_0 | 1.08 GB | Last "good" quant; anything below loses quality rapidly. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q4_k.bin) | Q4_K | 889 MB | *Might* not have lost too much quality, but I'm not sure. |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q4_1.bin) | Q4_1 | 985 MB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q4_0.bin) | Q4_0 | 889 MB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q3_k.bin) | Q3_K | 685 MB | |
| [GGML](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/resolve/main/ggml-large-v2-q2_k.bin) | Q2_K | 529 MB | Completely non-sensical outputs. |

The F16 quant was taken from [ggerganov/whisper.cpp/ggml-large-v2.bin](https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-large-v2.bin).
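Once downloaded, a quant can be used directly with whisper.cpp's command-line tool. A minimal sketch, assuming you have already built whisper.cpp and substituting your own file names (`ggml-large-v2-q8_0.bin` and `audio.wav` here are placeholders):

```shell
# Transcribe an audio file with the Q8_0 quant.
# whisper.cpp expects 16 kHz WAV input unless built with ffmpeg support.
./build/bin/whisper-cli -m ggml-large-v2-q8_0.bin -f audio.wav
```

Older whisper.cpp releases name the binary `main` instead of `whisper-cli`; the flags are the same.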
## Questions you may have
### Why do the "K-quants" not work for me?
My guess is that your GPU is too old to support them; I have hit the same error on my GTX 1080. If you would like to run them regardless, you can try switching to CPU inference.
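With whisper.cpp's CLI, CPU inference can be forced with the `--no-gpu` flag. A sketch, using placeholder file names:

```shell
# Disable GPU offloading to rule out GPU/driver incompatibility with K-quants.
./build/bin/whisper-cli -m ggml-large-v2-q5_k.bin -f audio.wav --no-gpu
```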
### Are the K-quants "S", "M", or "L"?
The quantizer I used did not specify, so I do not know either.
### What program did you use to make these quants?
I used [whisper.cpp v1.7.6](https://github.com/ggml-org/whisper.cpp/releases/tag/v1.7.6) on Windows x64 with CUDA 12.4.0. For the F32 quant, I converted the original Hugging Face (H5) format model to GGML using the `models/convert-h5-to-ggml.py` script.
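For reference, the conversion and quantization steps look roughly like this in a whisper.cpp checkout. The paths are examples, not the exact ones used for this repository, and the output file names depend on the script's arguments:

```shell
# Convert the Hugging Face (H5) checkpoint to a GGML model.
# Args: model dir, path to the openai/whisper repo (for tokenizer files), output dir.
python models/convert-h5-to-ggml.py /path/to/whisper-large-v2 /path/to/openai-whisper .

# Quantize the resulting GGML file to a smaller format, e.g. Q8_0.
./build/bin/quantize ggml-model.bin ggml-large-v2-q8_0.bin q8_0
```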
### One or more of the quants are not working for me.
[Open a new discussion](https://huggingface.co/Pomni/whisper-large-v2-ggml-allquants/discussions/new) in the community tab about this, and I will look into the issue.