Shape Error in Wan2.2-Fun-5B-Control-Camera

#1
by ChuanhaoLi - opened

Thanks for sharing Wan2.2-Fun-5B-Control-Camera!

When I run inference with the model, I get the following error:
RuntimeError: Given groups=1, weight of size [3072, 100, 1, 2, 2], expected input[1, 48, 21, 30, 52] to have 100 channels, but got 48 channels instead

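I found that patch_embedding.weight has shape [3072, 100, 1, 2, 2], so the layer expects 100 input channels, but the tensor it receives from the previous layers has only 48 channels. I can't tell where the remaining 52 channels are supposed to come from.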
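To make the mismatch concrete, here is a minimal sketch in plain Python that only does the shape arithmetic from the error message. The helper name `missing_input_channels` is made up for illustration and is not part of the model code. It relies on the usual convolution weight layout (out_channels, in_channels / groups, kernel dims).

```python
def missing_input_channels(weight_shape, input_shape, groups=1):
    """Return how many channels the input lacks for a convolution.

    weight_shape: (out_channels, in_channels // groups, *kernel_size)
    input_shape:  (batch, channels, *spatial_dims)
    """
    expected = weight_shape[1] * groups  # channels the layer was built for
    actual = input_shape[1]              # channels actually passed in
    return expected - actual


# Both shapes are copied from the RuntimeError above.
weight_shape = (3072, 100, 1, 2, 2)  # patch_embedding.weight
input_shape = (1, 48, 21, 30, 52)    # tensor reaching patch_embedding

gap = missing_input_channels(weight_shape, input_shape)
print(f"expected {weight_shape[1]} channels, got {input_shape[1]}, missing {gap}")
```

So the input is 52 channels short. My guess (an assumption, not something I have confirmed in the code) is that extra conditioning tensors should be concatenated along the channel dimension before patch_embedding, and that my pipeline is not passing them.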
