Why is the vocab size different from the embedding size?
#1 opened by zhichao-geng
Thanks for your great work!
I found that tokenizer.vocab_size is 30522, and the vocabulary file also contains 30522 tokens.
However, the embeddings and the cls decoder have a shape of (768, 30528). Should we just ignore the last 6 coordinates of the embedding?
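For reference, here is a minimal sketch that reproduces the observation; "your-checkpoint" is a placeholder for this model's checkpoint name:

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# "your-checkpoint" is a placeholder, not the actual model id
tokenizer = AutoTokenizer.from_pretrained("your-checkpoint")
model = AutoModelForMaskedLM.from_pretrained("your-checkpoint")

print(tokenizer.vocab_size)                       # 30522
print(model.get_input_embeddings().weight.shape)  # 30528 rows, not 30522
```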
Yes.
We pad the embedding to 30528 so that its size is divisible by 64, which is said to improve computational efficiency.
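So the extra 6 dimensions carry no meaning and can simply be sliced off. A minimal sketch, again with "your-checkpoint" as a placeholder:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-checkpoint")
model = AutoModelForMaskedLM.from_pretrained("your-checkpoint")

inputs = tokenizer("hello world", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (batch, seq_len, 30528)

# Keep only the real vocabulary entries; drop the 6 padded columns.
logits = logits[..., : tokenizer.vocab_size]  # (batch, seq_len, 30522)
```

Padding the vocabulary dimension to a multiple of 64 is a common trick: GPU matrix-multiply kernels tend to run faster when tensor dimensions align with the hardware's tile sizes.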