[DEBUG] transformers 4.38.0 /models/gemma/modeling_gemma.py
#96
opened by LiuWhite
origin: attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
fix: attn_output = attn_output.reshape(bsz, q_len, 4096)
At this point `self.hidden_size` is not equal to `head_dim * num_heads`, so the reshape raises a shape-mismatch error; changing the target size lets the forward pass run. (Note that hard-coding 4096 only works for this particular model; `self.num_heads * self.head_dim` would be the general fix.)
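To illustrate the mismatch, here is a minimal sketch in plain Python (not the actual `modeling_gemma.py` code). The values below are the publicly documented Gemma-7B config values; the variable names are illustrative:

```python
# Sketch of the shape mismatch reported above (illustrative, not HF source).
bsz, q_len = 2, 8
hidden_size = 3072              # config.hidden_size for Gemma-7B
num_heads, head_dim = 16, 256   # config.num_attention_heads, config.head_dim

# After the heads are merged, attn_output carries num_heads * head_dim
# features per token -- 4096 for Gemma-7B, which is NOT hidden_size (3072).
attn_features = num_heads * head_dim
total_elems = bsz * q_len * attn_features

# reshape(bsz, q_len, hidden_size) would need this many elements instead:
needed_elems = bsz * q_len * hidden_size

print(attn_features)              # 4096
print(total_elems == needed_elems)  # False -> the original reshape errors out

# A robust fix avoids both hidden_size and a hard-coded 4096:
#   attn_output = attn_output.reshape(bsz, q_len, self.num_heads * self.head_dim)
```

The design point: most transformer implementations assume `hidden_size == num_heads * head_dim`, but Gemma-7B deliberately breaks that assumption, so any reshape in the attention path has to use the head dimensions rather than the model width.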
Sorry for the delayed response. Thank you for providing the configuration details. I am escalating this to the engineering team to ensure this is corrected.