Wan2.1 14B fp32 GGUFs are corrupt? Request for bf16 / f16 / special quantizations; fun model, but it needs a proper text_encoder, VAE, and workflows.
oh. Didn't see it. Thanks!
I was searching for 14b. I will try to merge it.
I used `gguf merge` and got a wan2.1-t2v-14b-f32.gguf file with sha256sum 17f325de403a83e8780120b47b5517ec4858ff70b31762eef7ee77e6113f49f5.
With your ComfyUI nodes and this workflow I get a `patch_embedding.weight` error in the output of LoaderGGUF. What am I doing wrong? Can you try loading the fp32 GGUF of this model in ComfyUI? Also check the sha256sum to confirm we have the same file.
Q8_0 works okay, but I am interested in f32 first; bf16/f16 would also be better than Q8_0.
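For reference, a minimal sketch of how I compute the hash (streamed, since the multi-GB f32 file will not fit in RAM):

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 24) -> str:
    # read the file in 16 MiB chunks and feed them to the hasher
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

print(sha256sum("wan2.1-t2v-14b-f32.gguf"))
```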
```
gguf qtypes: F32 (695), Q8_0 (400)
model_type FLOW
Requested to load WAN21
loaded completely 21151.76951171875 15271.699462890625 True
```
it works; btw, what kind of tool(s) did you use to merge it? you could use `ggc m2` from gguf-connector; please see the similar instruction here; for f16/bf16, guess you could use convertor zero to make it
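btw, the `gguf qtypes` line in the log above can be double-checked offline with gguf-py; a rough sketch (the file name is assumed):

```python
from collections import Counter
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("wan2.1-t2v-14b-f32.gguf")  # assumed file name
counts = Counter(t.tensor_type.name for t in reader.tensors)
print(counts)  # expect something like Counter({'F32': 695, 'Q8_0': 400})
```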
> what kind of tool(s) did you use to merge it?

`ggc m2` gives me the same file (sha256) as `gguf merge`.
yes, same code base
> With your ComfyUI nodes and this workflow I get a `patch_embedding.weight` error in the output of LoaderGGUF.
`patch_embedding.weight` is the d5 tensor; oh, it might have been taken out from this f32 file for the quant; let me check; btw, you could use convertor zero to convert the original safetensors to GGUF and it should work
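a bare-bones sketch of what such a conversion looks like with gguf-py plus safetensors (file names, the arch string, and metadata handling are assumptions here; note GGML tensors max out at 4 dims, which is exactly why the 5D `patch_embedding.weight` is the troublesome one):

```python
import torch
from safetensors.torch import load_file
from gguf import GGUFWriter  # pip install gguf

state = load_file("diffusion_pytorch_model.safetensors")  # hypothetical source file
writer = GGUFWriter("wan2.1-t2v-14b-f32.gguf", arch="wan")  # arch name is an assumption

for name, tensor in state.items():
    if tensor.ndim > 4:
        # GGML caps tensors at 4 dims, so the 5D patch_embedding.weight
        # cannot be stored as-is; this is how it goes missing
        print(f"skipping {name} with shape {tuple(tensor.shape)}")
        continue
    writer.add_tensor(name, tensor.to(torch.float32).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```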
this program https://gist.github.com/crasm/41b5b11111d2f2419b31da159fa77447 also outputs a file with the same hash.
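For what it's worth, since all three tools produce the same hash, the shards appear to be plain byte splits; a minimal merge sketch under that assumption:

```python
from pathlib import Path

# concatenate the split parts in order; lexicographic sort keeps
# -00001-of-… before -00002-of-… and so on
parts = sorted(Path(".").glob("wan2.1-t2v-14b-f32-*-of-*.gguf"))
with open("wan2.1-t2v-14b-f32.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as f:
            # stream in 16 MiB chunks so multi-GB shards don't fill RAM
            while chunk := f.read(1 << 24):
                out.write(chunk)
```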
this binary merge is very popular and simple; it should work the same way; ok, uploading the whole f32 file again takes a while and we don't think many people need it; let us tell you how to fix it below:
- download the file that contains only the `patch_embedding.weight` tensor
- put it in the same directory as your merged f32 GGUF
- simply execute `ggc d5`

then you will get the fixed f32 GGUF with the `patch_embedding.weight` tensor inside
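after that, you can verify the tensor is really there; a quick sketch with gguf-py:

```python
from gguf import GGUFReader

reader = GGUFReader("wan2.1-t2v-14b-f32.gguf")
names = {t.name for t in reader.tensors}
print("patch_embedding.weight" in names)  # expect True after the fix
```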
I can try to upload the f32 and make a PR to this repo, so you will only need to click a button to accept it and get the fixed GGUFs in place.
I am now running `huggingface-cli upload calcuis/wan-gguf fixed --create-pr`.
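In case the CLI gives trouble, the same can be done from Python with huggingface_hub; a minimal sketch (the folder layout is assumed):

```python
from huggingface_hub import HfApi

api = HfApi()
# upload the local "fixed" folder to the repo and open a pull request
api.upload_folder(
    repo_id="calcuis/wan-gguf",
    folder_path="fixed",
    create_pr=True,
)
```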
My internet bandwidth is currently saturated. I'll try checking other files in your repository. The presence of the tensor should also be visible through the website's graphical interface. How about we create an organization together and coordinate our efforts, since we're both working on making neural networks more accessible to people?
wan2.1-t2v-14b-f32-00002-of-00002.gguf is uploaded; waiting for part No. 1, then the PR will be created and opened.
done. Now I will start uploading the weights for i2v 720p.
btw, there is a bug:
```
$ ggc d5
GGUF file(s) available. Select which one to fix:
1. wan2.1-i2v-720p-14b-f32-00002-of-00004.gguf
2. wan2.1-i2v-720p-14b-f32-00004-of-00004.gguf
3. wan2.1-i2v-720p-14b-f32.gguf
4. wan2.1-i2v-720p-14b-f32-00001-of-00004.gguf
5. wan2.1-i2v-720p-14b-f32-00003-of-00004.gguf
Enter your choice (1 to 5): 3
Model file: wan2.1-i2v-720p-14b-f32.gguf is selected!
Invalid choice. Please enter a valid number.
```
Or something is wrong with that file: 205522ca083ae822fe5051416f42b0a8657a9fd2fed438b81fa54570681fc736 wan2.1-i2v-720p-14b-f32.gguf
https://huggingface.co/calcuis/wan-gguf/blob/main/wan2.1-i2v-480p-14b-f32-00001-of-00004.gguf is missing the needed `patch_embedding.weight` as well; I am running `ggc d5` on it.
done. Uploading
Wan2.1-T2V-14B fp32 loads and works in ComfyUI (tested). Speed is the same as Q8_0, but quality is better. I will try to compare the amount of artifacts. Maybe as a result I will create a special quantization that produces the fewest possible artifacts, for example a Q8_L, or just Q8_0 with a bf16 output tensor; a sketch of the idea follows.
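A rough sketch of that per-tensor rule with gguf-py's quantizer (the tensor-name patterns to keep in high precision are my assumption; the resulting `(data, qtype)` pair would then be written out with `GGUFWriter.add_tensor(name, data, raw_dtype=qtype)`):

```python
import numpy as np
from gguf import GGMLQuantizationType as T
from gguf.quants import quantize

# hypothetical "Q8_L" rule: keep a few sensitive tensors in full precision,
# quantize everything else to Q8_0
KEEP_HIGH_PRECISION = ("patch_embedding.", "head.")  # assumed name patterns

def choose_quant(name: str, data: np.ndarray) -> tuple[np.ndarray, T]:
    last_dim_ok = data.shape[-1] % 32 == 0  # Q8_0 packs blocks of 32 values
    if name.startswith(KEEP_HIGH_PRECISION) or not last_dim_ok:
        return data.astype(np.float32), T.F32
    return quantize(data.astype(np.float32), T.Q8_0), T.Q8_0
```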
Why is the VAE called wan_2.1_vae_fp32-f16.gguf? It is 253.9 MB, while VAE.pth in Wan-AI/Wan2.1-T2V-14B is 508 MB, and the original dtype is important there; 253.9 MB is roughly half, which suggests the weights were actually stored as f16. It seems nobody has uploaded an F32 VAE GGUF for Wan2.1 14B yet. I am checking the original .pth VAE now, but I am sure it will be much better.
Oh no, the main problem is not in the VAE. Something looks wrong with the sampler in your workflows; maybe a sampling shift is needed.

Only you are providing sample generations for the GGUFs, but the quality is bad when I compare it with other samples. I will do some experiments and maybe figure out which settings are best.
> btw, there is a bug: `ggc d5` selects a valid file and then prints "Invalid choice. Please enter a valid number."
this is not a bug; the d5 tensor might not be reusable for different-size model(s); and according to the file sorting logic, wan2.1-i2v-720p-14b-f32.gguf should not be placed in the middle of the list, so maybe something is wrong there; anyway, with the new scheme we don't need the fixer, just transfer the tensor straight from another GGUF file
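a rough sketch of the reading half of that transfer with gguf-py (the donor file name, and whether it actually carries the tensor, are assumptions):

```python
from gguf import GGUFReader

donor = GGUFReader("wan2.1-i2v-720p-14b-q8_0.gguf")  # hypothetical donor file
patch = next((t for t in donor.tensors if t.name == "patch_embedding.weight"), None)
if patch is None:
    print("donor has no patch_embedding.weight either")
else:
    print(patch.name, tuple(patch.shape), patch.tensor_type.name)
```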
> anyway, with the new scheme we don't need the fixer
Do you mean creating wan2.1-i2v-720p-14b-f32.gguf from safetensors? Otherwise I do not understand how to fix that.

upd: do you mean trying to export this tensor from GGUFs like Q8_0? But that is not F32. It would be better to merge the (safe)tensor from the original Wan2.1 repo into the GGUF. (Currently trying this; see the extraction sketch below.)
upd2: exported:
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-wan2.1-i2v-720-14b.weight 7890f9ca785e0a06e6d2f5b7d5704377a8d96deb13ccfc09e9a1ed0a3dece328
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-wan2.1-flf2v-720p-14b.weight aff6caaf912c261e9c2fdd5a44a1c0f741003c042d98d474fc02b8d103df4083
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-Wan2.1-I2V-14B-480P.weight 34e498ce6dc3af47cff95d1a0e68b8a5c4dc7509cf0fc09f862ad4543b506f0b
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/fix_5d_tensors_pig.safetensors 129052c743034bed9770533df893dc618aa9c66ca7843736eab82a9a8bcbfaae
https://huggingface.co/GeorgyGUF/wan2.1-special-tensors-ggufs/blob/main/patch_embedding-Wan2.1-T2V-14B.weight a55e33839c08f12a61bd26bef1ce9d8ecb35be7c357c9ddb55926c20340c17b6
upd3: still the same with `ggc d5`, or even something new: `_pickle.UnpicklingError: invalid load key, '\xf0'`.
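For reference, the extraction itself is simple; a minimal sketch (the checkpoint path is hypothetical since the real repo is sharded, and plain `torch.save` is an assumption; the pickle error above suggests `ggc d5` expects a different on-disk format):

```python
import torch
from safetensors.torch import load_file

# hypothetical path; the real checkpoint is split across several shards,
# so the tensor may live in a specific one
state = load_file("Wan2.1-T2V-14B/diffusion_pytorch_model.safetensors")
w = state["patch_embedding.weight"]  # the 5D conv weight dropped from the GGUFs
print(tuple(w.shape), w.dtype)
torch.save(w, "patch_embedding-Wan2.1-T2V-14B.weight")
```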
Your repo provides the text encoder as an fp16 quant. Do you know that this difference is also important? Wan2.1 originally uses a bf16 encoder, so I think it would be helpful to have it here, or at least a note in the README.md. Maybe I will create a PR with that notice too.
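A minimal sketch of how the bf16 copy could be produced (file names assumed; the point is to start from the original bf16 weights, since up-casting the fp16 file cannot recover the lost bits):

```python
import torch
from safetensors.torch import load_file, save_file

# load the original (bf16) encoder weights; hypothetical file name
state = load_file("umt5-xxl-enc-bf16.safetensors")
# make sure everything is bf16 and re-save
save_file({k: v.to(torch.bfloat16) for k, v in state.items()},
          "umt5-xxl-enc-bf16-resaved.safetensors")
```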
wan2.1-i2v-480p-14b-f32-00002-of-00002.gguf is uploaded. Waiting for the first part.
Your workflows use the uni_pc sampler, but uni_pc_bh2 is better for Wan2.1. Maybe I will contribute to this too.

It would be great to inform users that the DPM++ 2M sampler with the sgm_uniform scheduler is often best: fewer steps, more quality. Or even a combination where you use the gradient_estimation sampler for the first 15 steps and euler for the final steps; it can give sharper and more consistent images for i2v. A sketch of the split follows.
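In ComfyUI API-format terms, the split could look roughly like this (two chained KSamplerAdvanced nodes; node wiring is omitted and the values are just an example, not a tested workflow):

```python
# two KSamplerAdvanced nodes covering one 30-step schedule:
# stage 1 samples steps 0-15 and hands over leftover noise,
# stage 2 finishes steps 15-30 with euler
stage_1 = {
    "class_type": "KSamplerAdvanced",
    "inputs": {
        "sampler_name": "gradient_estimation",
        "scheduler": "sgm_uniform",
        "steps": 30, "start_at_step": 0, "end_at_step": 15,
        "add_noise": "enable", "return_with_leftover_noise": "enable",
        # model/positive/negative/latent_image links omitted
    },
}
stage_2 = {
    "class_type": "KSamplerAdvanced",
    "inputs": {
        "sampler_name": "euler",
        "scheduler": "sgm_uniform",
        "steps": 30, "start_at_step": 15, "end_at_step": 30,
        "add_noise": "disable", "return_with_leftover_noise": "disable",
        # latent_image should be wired to stage_1's output
    },
}
```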
wan2.1-flf2v-720p-14b-f32.gguf is missing the `patch_embedding.weight` tensor too. I am preparing it for fixing.
Second PR done. Maybe I will check how well it works. Stay tuned.

Looks broken; I will try to figure it out.

I need to convert the original safetensors to GGUFs.
thanks; not really necessary to fix the f32, since it was the source of those q2 to q8 quants, and everyone could get the d5 tensor from inside those files; the d5 tensor is very common, especially in VAE files, which is not a problem though; welcome to help out, you could study the tools inside gguf-connector first, then manage to do a good quant eventually
https://reddit.com/comments/1kx41m0/comment/mumhr39 check this: the number of frames correlates with the amount of artifacts.
Yes. For me, it always happens in T2V for videos shorter than 33 frames (euler/simple, 20 steps, 480p, no CausVid LoRA). With the CausVid LoRA it's still there, but much less noticeable. I noticed that uni_pc seems to be better in that regard, but I don't like it in all other regards compared to euler.