The vocab size is different from the embedding size?

#1 opened by zhichao-geng

Thanks for your great work!

I found that tokenizer.vocab_size is 30522, and the number of tokens in the vocab is also 30522.
However, the embeddings and the cls decoder have a shape of (768, 30528). Should we just ignore the last 6 coordinates of the embedding?
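For reference, a minimal sketch of the check being described (the model id is a placeholder for this repo's checkpoint, and loading with `AutoModelForMaskedLM` is an assumption):

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "<this-checkpoint>"  # placeholder for the model discussed in this thread
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

print(tokenizer.vocab_size)                         # expected: 30522
print(len(tokenizer))                               # expected: 30522 real tokens
print(model.get_input_embeddings().weight.shape)    # 30528 rows, i.e. 6 more than the vocab
```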

Alibaba-NLP org

Yes.
We pad the embedding to 30528 so that it is divisible by 64, which is said to increase computational efficiency.
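A minimal sketch of both points: the padded size is just the vocab size rounded up to the next multiple of 64, and the 6 extra rows correspond to no token, so they can be sliced off when working with the vocabulary dimension. The model id is a placeholder, and loading with `AutoModelForMaskedLM` is an assumption (`resize_token_embeddings(..., pad_to_multiple_of=64)` in `transformers` is one standard way to produce this kind of padding, though this thread does not say that is how it was done):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "<this-checkpoint>"  # placeholder for the model discussed in this thread
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# 30522 is not a multiple of 64; rounding up gives 30528.
padded = -(-tokenizer.vocab_size // 64) * 64   # ceil(30522 / 64) * 64 == 30528

# The padded rows carry no token, so they can simply be ignored, e.g. by
# slicing MLM logits back down to the real vocabulary size:
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # (..., 30528)
logits = logits[..., : tokenizer.vocab_size]   # (..., 30522)
```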
