Jaron

JaronTHU

AI & ML interests

None yet

Organizations

JaronTHU's activity

New activity in internlm/internlm3-8b-instruct 4 months ago

Fast Tokenizer

#17 opened 5 months ago by

New activity in google/gemma-2-9b-it 11 months ago

Question about lm_head weights in Gemma-2-9b-it model

#34 opened 11 months ago by

Fails to generate with `inputs_embeds`

#18 opened 11 months ago by

"It is strongly recommended to train Gemma2 models with the `eager` attention implementation"

#10 opened 12 months ago by