[DEBUG] transformers 4.38.0 /models/gemma/modeling_gemma.py
#96
opened by LiuWhite
origin: attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
fix: attn_output = attn_output.reshape(bsz, q_len, 4096)
At this point `self.hidden_size` is not equal to `head_dim * num_heads`, so the reshape raises a shape-mismatch error; changing the target size lets the forward pass run. (Note that hard-coding 4096 only works for this particular model; `self.num_heads * self.head_dim` would be the general fix.)
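To illustrate the mismatch, here is a minimal sketch in plain Python (not the actual `modeling_gemma.py` code). The values below are the publicly documented Gemma-7B config values; the variable names are illustrative:

```python
# Sketch of the shape mismatch reported above (illustrative, not HF source).
bsz, q_len = 2, 8
hidden_size = 3072              # config.hidden_size for Gemma-7B
num_heads, head_dim = 16, 256   # config.num_attention_heads, config.head_dim

# After the heads are merged, attn_output carries num_heads * head_dim
# features per token -- 4096 for Gemma-7B, which is NOT hidden_size (3072).
attn_features = num_heads * head_dim
total_elems = bsz * q_len * attn_features

# reshape(bsz, q_len, hidden_size) would need this many elements instead:
needed_elems = bsz * q_len * hidden_size

print(attn_features)              # 4096
print(total_elems == needed_elems)  # False -> the original reshape errors out

# A robust fix avoids both hidden_size and a hard-coded 4096:
#   attn_output = attn_output.reshape(bsz, q_len, self.num_heads * self.head_dim)
```

The design point: most transformer implementations assume `hidden_size == num_heads * head_dim`, but Gemma-7B deliberately breaks that assumption, so any reshape in the attention path has to use the head dimensions rather than the model width.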
Sorry for the delayed response. Thank you for providing the configuration details. I am escalating this to the engineering team to ensure this is corrected.