Are there bias weights in Llama 3?

#202 opened by Iionbarista

I was looking through the safetensors index file: https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/model.safetensors.index.json

and noticed that there don't seem to be any designated bias weights.

Does Llama 3 have no biases, or are they implicitly loaded from the weights?

Or are they replaced by the layernorm?
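
A quick way to double-check this programmatically (a minimal sketch, assuming `huggingface_hub` is installed and access to the gated repo has been granted):

```python
import json

from huggingface_hub import hf_hub_download

# Download only the safetensors index file, not the weights themselves.
index_path = hf_hub_download(
    repo_id="meta-llama/Meta-Llama-3-8B",
    filename="model.safetensors.index.json",
)

with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Every tensor name in the checkpoint; filter for anything that looks like a bias.
bias_keys = [name for name in weight_map if "bias" in name]
print(f"{len(weight_map)} tensors total, {len(bias_keys)} with 'bias' in the name")
```

If my reading of the index file is right, the second number comes out as 0.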

The Google PaLM paper mentions:

> No biases were used in any of the dense kernels or layer norms. We found this to result in increased training stability for large models.
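
The same thing can also be checked from the transformers side (again just a sketch; it assumes a recent transformers version where `LlamaConfig` exposes `attention_bias`/`mlp_bias`, and it uses the meta device so no weights are downloaded):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Config flags controlling bias terms in the attention and MLP projections.
print(config.attention_bias)               # expected: False
print(getattr(config, "mlp_bias", False))  # expected: False (older configs omit it)

# Build the architecture on the meta device (no real weights allocated)
# and list any parameters whose name ends in ".bias".
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

bias_params = [n for n, _ in model.named_parameters() if n.endswith(".bias")]
print(bias_params)                         # expected: []
```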
