tokenizer_config.json and config.json specify the wrong EOS token, causing the model to malfunction in backends that do not read the EOS token from generation_config.json
When I tried this model in text-generation-webui, it kept appending the word "assistant" at the end of each message, or failing to stop and generating until it ran out of the token limit. It turned out the issue was caused by the wrong EOS token in tokenizer_config.json. I found this bug report: https://github.com/oobabooga/text-generation-webui/issues/5885 and it contained a fix.
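For reference, both candidate stop tokens are real entries in the Llama 3 vocabulary; here is a minimal sketch with transformers to see which ID maps to which token (assuming you have access to the tokenizer files; a local copy of the model directory works too):

```python
# Minimal check of the two stop-token candidates in the Llama 3 vocab.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(tok.convert_ids_to_tokens([128001, 128009]))
# ['<|end_of_text|>', '<|eot_id|>']
```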
The solution turned out to be to edit tokenizer_config.json and replace this line:
"eos_token": "<|end_of_text|>",
With this line:
"eos_token": "<|eot_id|>",
Since this config file is very large, without knowing the string to search for, or even that this was the file I needed to search in, the fix wasn't obvious at all. From the bug report, it seems that Meta itself messed up the release and later corrected only generation_config.json, while tokenizer_config.json remained broken. As it turns out, text-generation-webui takes the EOS token from tokenizer_config.json, which is why the model wasn't working despite the generation_config.json update, until I figured out that tokenizer_config.json needed to be fixed as well.
I also edited config.json to specify the correct EOS token ID, replacing 128001 with 128009:
"eos_token_id": 128009,
I see that generation_config.json was already updated in this repository with a fixed version, but I also suggest updating both tokenizer_config.json and config.json to make sure the model works correctly out of the box for everyone.
What's worked for me quite well: https://huggingface.co/LoneStriker/Meta-Llama-3-8B-Instruct-8.0bpw-h8-exl2/discussions/2
Yes, you are correct, I forgot to mention special_tokens_map.json; it needs to be updated as well to use "<|eot_id|>" instead of "<|end_of_text|>":
"eos_token": "<|eot_id|>"
I don't know if this is the correct change to make; it's still a "hack".
The real solution is supporting both, and for now I think the "solution" in text-generation-webui is to disable "Skip special tokens".
But I'll let turbo answer
With the corrected config, the model works as intended out of the box everywhere, not just in text-generation-webui, as far as I can tell. But with the JSON config as it is in the repository right now, I observe severe issues that make the model unusable; what's worse, most people will have no idea how to fix this, and I myself spent a lot of time finding the solution.
If there is no support for both EOS tokens, I think it makes sense to default to the EOS token that works instead of the one that, as far as I can tell, never works (the model appears to emit the latter only occasionally, after outputting "assistant"). Of course, I ran only limited tests, but so far everything works very well. If you actually experienced issues after correcting the configs, please share the details.
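For what it's worth, supporting both tokens is already possible in plain transformers, since generate() accepts a list of stop token IDs; a sketch, assuming the upstream repo ID and enough memory to load the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Say hi and stop."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt").to(model.device)

# Generation halts on whichever of the two stop tokens appears first.
out = model.generate(inputs, max_new_tokens=64,
                     eos_token_id=[128001, 128009])
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```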
For the record, the only change I had to make to get it working in TGWUI was in tokenizer_config.json, setting eos_token to <|eot_id|>
I am on the dev branch with 0.0.19 though, so that could also be a factor; I'm not sure.
Any update on this?
Thx