Trim the text encoder weights
#2
by ttj - opened
The encoder config says 24 layers, but if I understand correctly you can set it to 17 and trim the later layers.
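For what it's worth, a rough (untested) sketch of what I mean, assuming the pipeline loads a standard Hugging Face `T5EncoderModel`; the checkpoint name and output path are just placeholders:

```python
from transformers import T5EncoderModel

# Load the full 24-layer encoder (placeholder checkpoint name).
encoder = T5EncoderModel.from_pretrained("google/t5-v1_1-xxl")

# Keep only the first 17 transformer blocks and update the config
# so the saved checkpoint reports the trimmed depth.
encoder.encoder.block = encoder.encoder.block[:17]
encoder.config.num_layers = 17

# Save the smaller checkpoint (placeholder path).
encoder.save_pretrained("t5-xxl-trimmed-17")
```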
You are totally right. This could be done to save some memory and compute... but the highest cost here is the generation itself: the T5 inference runs only once, at the beginning, whereas the DiT model runs once per sampling step.