Wan2.1 14B fp32 GGUFs are corrupt? Request for bf16 / f16 / special quantizations; fun model, but it needs a proper text_encoder, VAE, and workflows.
oh. Didn't see it. Thanks!
I was searching for 14b. I will try to merge it.
I used `gguf merge` and got a wan2.1-t2v-14b-f32.gguf file with sha256sum 17f325de403a83e8780120b47b5517ec4858ff70b31762eef7ee77e6113f49f5.
With your ComfyUI nodes and this workflow I get a `patch_embedding.weight` error in the output of LoaderGGUF. What am I doing wrong? Can you try loading the fp32 GGUF of this model in ComfyUI? Also check the sha256sum to confirm we have the same file.
Q8_0 works okay, but I am interested in f32 first; bf16/f16 would also be better than Q8_0.
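For reference, a minimal sketch of how I compute the hash (streamed, since the multi-GB f32 file will not fit in RAM):

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 24) -> str:
    # read the file in 16 MiB chunks and feed them to the hasher
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

print(sha256sum("wan2.1-t2v-14b-f32.gguf"))
```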
```
gguf qtypes: F32 (695), Q8_0 (400)
model_type FLOW
Requested to load WAN21
loaded completely 21151.76951171875 15271.699462890625 True
```
it works; btw, what kind of tool(s) did you use to merge it? you could use `ggc m2` from gguf-connector; please see the similar instruction here; for f16/bf16, guess you could use convertor zero to make it
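btw, the `gguf qtypes` line in the log above can be double-checked offline with gguf-py; a rough sketch (the file name is assumed):

```python
from collections import Counter
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("wan2.1-t2v-14b-f32.gguf")  # assumed file name
counts = Counter(t.tensor_type.name for t in reader.tensors)
print(counts)  # expect something like Counter({'F32': 695, 'Q8_0': 400})
```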
> what kind of tool(s) did you use to merge it?

`ggc m2` gives me the same file (sha256) as `gguf merge`.
yes, same code base
> With your ComfyUI nodes and this workflow I get a `patch_embedding.weight` error in the output of LoaderGGUF.
`patch_embedding.weight` is the d5 tensor; oh, it might have been taken out from this f32 file for the quant; let me check; btw, you could use convertor zero to convert the original safetensors to GGUF and it should work
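a bare-bones sketch of what such a conversion looks like with gguf-py plus safetensors (file names, the arch string, and metadata handling are assumptions here; note GGML tensors max out at 4 dims, which is exactly why the 5D `patch_embedding.weight` is the troublesome one):

```python
import torch
from safetensors.torch import load_file
from gguf import GGUFWriter  # pip install gguf

state = load_file("diffusion_pytorch_model.safetensors")  # hypothetical source file
writer = GGUFWriter("wan2.1-t2v-14b-f32.gguf", arch="wan")  # arch name is an assumption

for name, tensor in state.items():
    if tensor.ndim > 4:
        # GGML caps tensors at 4 dims, so the 5D patch_embedding.weight
        # cannot be stored as-is; this is how it goes missing
        print(f"skipping {name} with shape {tuple(tensor.shape)}")
        continue
    writer.add_tensor(name, tensor.to(torch.float32).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```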
this program https://gist.github.com/crasm/41b5b11111d2f2419b31da159fa77447 also outputs a file with the same hash.
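For what it's worth, since all three tools produce the same hash, the shards appear to be plain byte splits; a minimal merge sketch under that assumption:

```python
from pathlib import Path

# concatenate the split parts in order; lexicographic sort keeps
# -00001-of-… before -00002-of-… and so on
parts = sorted(Path(".").glob("wan2.1-t2v-14b-f32-*-of-*.gguf"))
with open("wan2.1-t2v-14b-f32.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as f:
            # stream in 16 MiB chunks so multi-GB shards don't fill RAM
            while chunk := f.read(1 << 24):
                out.write(chunk)
```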
this binary merge is very popular and simple; it should work the same way; ok, uploading the whole f32 file again takes a while and we don't think many people need it; let us tell you how to fix it below:
- download the file that contains only the `patch_embedding.weight` tensor
- put it in the same directory as your merged f32 GGUF
- simply execute `ggc d5`

then you will get the fixed f32 GGUF with the `patch_embedding.weight` tensor inside
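after that, you can verify the tensor is really there; a quick sketch with gguf-py:

```python
from gguf import GGUFReader

reader = GGUFReader("wan2.1-t2v-14b-f32.gguf")
names = {t.name for t in reader.tensors}
print("patch_embedding.weight" in names)  # expect True after the fix
```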
I can try to upload the f32 and make a PR to this repo, so you will only need to click a button to accept it and get the fixed GGUFs in place.
I am now running `huggingface-cli upload calcuis/wan-gguf fixed --create-pr`.
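In case the CLI gives trouble, the same can be done from Python with huggingface_hub; a minimal sketch (the folder layout is assumed):

```python
from huggingface_hub import HfApi

api = HfApi()
# upload the local "fixed" folder to the repo and open a pull request
api.upload_folder(
    repo_id="calcuis/wan-gguf",
    folder_path="fixed",
    create_pr=True,
)
```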
My internet bandwidth is currently saturated. I'll try checking other files in your repository. The presence of the tensor should also be visible through the website's graphical interface. How about we create an organization together and coordinate our efforts, since we're both working on making neural networks more accessible to people?
wan2.1-t2v-14b-f32-00002-of-00002.gguf is uploaded; waiting for part No. 1, then the PR will be created and opened.
done. Now I will start uploading the weights for i2v 720p.
btw, there is a bug:
```
$ ggc d5
GGUF file(s) available. Select which one to fix:
1. wan2.1-i2v-720p-14b-f32-00002-of-00004.gguf
2. wan2.1-i2v-720p-14b-f32-00004-of-00004.gguf
3. wan2.1-i2v-720p-14b-f32.gguf
4. wan2.1-i2v-720p-14b-f32-00001-of-00004.gguf
5. wan2.1-i2v-720p-14b-f32-00003-of-00004.gguf
Enter your choice (1 to 5): 3
Model file: wan2.1-i2v-720p-14b-f32.gguf is selected!
Invalid choice. Please enter a valid number.
```
Or something is wrong with that file: 205522ca083ae822fe5051416f42b0a8657a9fd2fed438b81fa54570681fc736 wan2.1-i2v-720p-14b-f32.gguf
https://huggingface.co/calcuis/wan-gguf/blob/main/wan2.1-i2v-480p-14b-f32-00001-of-00004.gguf is missing the needed `patch_embedding.weight` as well; I am running `ggc d5` on it.
done. Uploading
Wan2.1-T2V-14B fp32 loads and works in ComfyUI (tested). Speed is the same as Q8_0, but quality is better. I will try to compare the amount of artifacts. Maybe as a result I will create a special quantization that produces the fewest possible artifacts, for example a Q8_L, or just Q8_0 with a bf16 output tensor; a sketch of the idea follows.
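A rough sketch of that per-tensor rule with gguf-py's quantizer (the tensor-name patterns to keep in high precision are my assumption; the resulting `(data, qtype)` pair would then be written out with `GGUFWriter.add_tensor(name, data, raw_dtype=qtype)`):

```python
import numpy as np
from gguf import GGMLQuantizationType as T
from gguf.quants import quantize

# hypothetical "Q8_L" rule: keep a few sensitive tensors in full precision,
# quantize everything else to Q8_0
KEEP_HIGH_PRECISION = ("patch_embedding.", "head.")  # assumed name patterns

def choose_quant(name: str, data: np.ndarray) -> tuple[np.ndarray, T]:
    last_dim_ok = data.shape[-1] % 32 == 0  # Q8_0 packs blocks of 32 values
    if name.startswith(KEEP_HIGH_PRECISION) or not last_dim_ok:
        return data.astype(np.float32), T.F32
    return quantize(data.astype(np.float32), T.Q8_0), T.Q8_0
```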
Why is the VAE called wan_2.1_vae_fp32-f16.gguf? It is 253.9 MB, while VAE.pth in Wan-AI/Wan2.1-T2V-14B is 508 MB, and the original dtype is important there; 253.9 MB is roughly half, which suggests the weights were actually stored as f16. It seems nobody has uploaded an F32 VAE GGUF for Wan2.1 14B yet. I am checking the original .pth VAE now, but I am sure it will be much better.
Oh no, the main problem is not in the VAE. Something looks wrong with the sampler in your workflows; maybe a sampling shift is needed.

Only you are providing sample generations for the GGUFs, but the quality is bad when I compare it with other samples. I will do some experiments and maybe figure out which settings are best.
> btw, there is a bug: `ggc d5` selects a valid file and then prints "Invalid choice. Please enter a valid number."
this is not a bug; the d5 tensor might not be reusable for different-size model(s); and according to the file sorting logic, wan2.1-i2v-720p-14b-f32.gguf should not be placed in the middle of the list, so maybe something is wrong there; anyway, with the new scheme we don't need the fixer, just transfer the tensor straight from another GGUF file
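a rough sketch of the reading half of that transfer with gguf-py (the donor file name, and whether it actually carries the tensor, are assumptions):

```python
from gguf import GGUFReader

donor = GGUFReader("wan2.1-i2v-720p-14b-q8_0.gguf")  # hypothetical donor file
patch = next((t for t in donor.tensors if t.name == "patch_embedding.weight"), None)
if patch is None:
    print("donor has no patch_embedding.weight either")
else:
    print(patch.name, tuple(patch.shape), patch.tensor_type.name)
```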
> anyway, with the new scheme we don't need the fixer
Do you mean creating wan2.1-i2v-720p-14b-f32.gguf from safetensors? Otherwise I do not understand how to fix that.

upd: do you mean trying to export this tensor from GGUFs like Q8_0? But that is not F32. It would be better to merge the (safe)tensor from the original Wan2.1 repo into the GGUF. (Currently trying this; see the extraction sketch below.)
upd2: exported:
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-wan2.1-i2v-720-14b.weight 7890f9ca785e0a06e6d2f5b7d5704377a8d96deb13ccfc09e9a1ed0a3dece328
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-wan2.1-flf2v-720p-14b.weight aff6caaf912c261e9c2fdd5a44a1c0f741003c042d98d474fc02b8d103df4083
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-Wan2.1-I2V-14B-480P.weight 34e498ce6dc3af47cff95d1a0e68b8a5c4dc7509cf0fc09f862ad4543b506f0b
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/fix_5d_tensors_pig.safetensors 129052c743034bed9770533df893dc618aa9c66ca7843736eab82a9a8bcbfaae
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-Wan2.1-T2V-14B.weight a55e33839c08f12a61bd26bef1ce9d8ecb35be7c357c9ddb55926c20340c17b6
upd3: still the same with `ggc d5`, or even something new: `_pickle.UnpicklingError: invalid load key, '\xf0'`.
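For reference, the extraction itself is simple; a minimal sketch (the checkpoint path is hypothetical since the real repo is sharded, and plain `torch.save` is an assumption; the pickle error above suggests `ggc d5` expects a different on-disk format):

```python
import torch
from safetensors.torch import load_file

# hypothetical path; the real checkpoint is split across several shards,
# so the tensor may live in a specific one
state = load_file("Wan2.1-T2V-14B/diffusion_pytorch_model.safetensors")
w = state["patch_embedding.weight"]  # the 5D conv weight dropped from the GGUFs
print(tuple(w.shape), w.dtype)
torch.save(w, "patch_embedding-Wan2.1-T2V-14B.weight")
```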
Your repo provides the text encoder as an fp16 quant. Do you know that this difference is also important? Wan2.1 originally uses a bf16 encoder, so I think it would be helpful to have it here, or at least a note in the README.md. Maybe I will create a PR with that notice too.
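A minimal sketch of how the bf16 copy could be produced (file names assumed; the point is to start from the original bf16 weights, since up-casting the fp16 file cannot recover the lost bits):

```python
import torch
from safetensors.torch import load_file, save_file

# load the original (bf16) encoder weights; hypothetical file name
state = load_file("umt5-xxl-enc-bf16.safetensors")
# make sure everything is bf16 and re-save
save_file({k: v.to(torch.bfloat16) for k, v in state.items()},
          "umt5-xxl-enc-bf16-resaved.safetensors")
```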
wan2.1-i2v-480p-14b-f32-00002-of-00002.gguf is uploaded. Waiting for the first part.
Your workflows use the uni_pc sampler, but uni_pc_bh2 is better for Wan2.1. Maybe I will contribute to this too.

It would be great to inform users that the DPM++ 2M sampler with the sgm_uniform scheduler is often best: fewer steps, more quality. Or even a combination where you use the gradient_estimation sampler for the first 15 steps and euler for the final steps; it can give sharper and more consistent images for i2v. A sketch of the split follows.
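In ComfyUI API-format terms, the split could look roughly like this (two chained KSamplerAdvanced nodes; node wiring is omitted and the values are just an example, not a tested workflow):

```python
# two KSamplerAdvanced nodes covering one 30-step schedule:
# stage 1 samples steps 0-15 and hands over leftover noise,
# stage 2 finishes steps 15-30 with euler
stage_1 = {
    "class_type": "KSamplerAdvanced",
    "inputs": {
        "sampler_name": "gradient_estimation",
        "scheduler": "sgm_uniform",
        "steps": 30, "start_at_step": 0, "end_at_step": 15,
        "add_noise": "enable", "return_with_leftover_noise": "enable",
        # model/positive/negative/latent_image links omitted
    },
}
stage_2 = {
    "class_type": "KSamplerAdvanced",
    "inputs": {
        "sampler_name": "euler",
        "scheduler": "sgm_uniform",
        "steps": 30, "start_at_step": 15, "end_at_step": 30,
        "add_noise": "disable", "return_with_leftover_noise": "disable",
        # latent_image should be wired to stage_1's output
    },
}
```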
wan2.1-flf2v-720p-14b-f32.gguf is missing the `patch_embedding.weight` tensor too. I am preparing it for fixing.
Second PR done. Maybe I will check how well it works. Stay tuned.

Looks broken; I will try to figure it out.

I need to convert the original safetensors to GGUFs.
thanks; not really necessary to fix the f32, since it was the source of those q2 to q8 quants, and everyone could get the d5 tensor from inside those files; the d5 tensor is very common, especially in VAE files, which is not a problem though; welcome to help out, you could study the tools inside gguf-connector first, then manage to do a good quant eventually
https://reddit.com/comments/1kx41m0/comment/mumhr39 check this: the number of frames correlates with the amount of artifacts.
Yes. For me, it always happens in T2V for videos shorter than 33 frames (euler/simple, 20 steps, 480p, no CausVid LoRA). With the CausVid LoRA it's still there, but much less noticeable. I noticed that uni_pc seems to be better in that regard, but I don't like it in all other regards compared to euler.