Trim the text encoder weights

#2 opened by ttj

The encoder config says 24 layers, but if I understand correctly you can set it to 17 and trim the later layers. A minimal sketch of what that could look like is below.
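
A hedged sketch of the trimming, assuming the text encoder is loaded as a Hugging Face `transformers` `T5EncoderModel`; the checkpoint name is a placeholder and 17 is the layer count from the question, not a verified setting for this model:

```python
import torch
from transformers import T5EncoderModel

# Placeholder checkpoint; swap in whichever T5 encoder the pipeline actually uses.
text_encoder = T5EncoderModel.from_pretrained(
    "google/t5-v1_1-xxl", torch_dtype=torch.bfloat16
)

keep_layers = 17
# Keep only the first 17 transformer blocks; the final layer norm stays,
# so the output hidden size and shape are unchanged.
text_encoder.encoder.block = text_encoder.encoder.block[:keep_layers]
text_encoder.config.num_layers = keep_layers
```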

Freepik org

You are totally right. This could be done to save some memory and compute, but the highest cost here is in the generation itself: the T5 inference runs just once at the beginning, whereas the DiT model runs once per denoising step.
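
A rough back-of-envelope illustration of that point; the step count and the assumption that one full T5 pass costs about as much as one DiT step are invented for the example, not measured:

```python
# Illustrative relative costs only (arbitrary units, assumed values).
t5_forward_cost = 1.0   # one full 24-layer T5 encoder pass, assumed
dit_step_cost = 1.0     # one DiT denoising step, assumed comparable
num_steps = 28          # assumed sampling schedule

total = t5_forward_cost + num_steps * dit_step_cost
saved = (7 / 24) * t5_forward_cost  # dropping layers 17..23 of the T5 encoder

print(f"Compute saved: {saved / total:.1%} of the whole pipeline")  # ~1% under these assumptions
```

So the trimming is mostly a memory win; the compute saving is small because the DiT steps dominate.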
