Fix eos_token in tokenizer_config.json

#1
by AlexCheema - opened
MLX Community org

eos_token was set to "<|eot_id|>" when it should be "<|end_of_text|>".
This caused bugs such as generation never terminating, because the base model never emits the configured eos_token.
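Until the corrected tokenizer_config.json lands, here is a minimal workaround sketch, assuming `load` forwards the `tokenizer_config` dict to the underlying Hugging Face tokenizer; the repo id below is illustrative, substitute the base model you are loading:

```python
from mlx_lm import load, generate

# Override the misconfigured eos_token at load time so generation stops at
# "<|end_of_text|>" instead of running until max_tokens is exhausted.
# "mlx-community/Meta-Llama-3-8B-4bit" is an illustrative base-model repo id.
model, tokenizer = load(
    "mlx-community/Meta-Llama-3-8B-4bit",
    tokenizer_config={"eos_token": "<|end_of_text|>"},
)
response = generate(model, tokenizer, prompt="hello", verbose=True, max_tokens=1024)
```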

MLX Community org

Instruct models don't have this issue; it only affects the base models.
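For reference, a quick sketch to check which stop token a converted repo actually ships, assuming the tokenizer returned by `load` exposes the usual Hugging Face attributes:

```python
from mlx_lm import load

# Print the configured stop token: per the above, base models should report
# "<|end_of_text|>" while Instruct models report "<|eot_id|>".
_, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")
print(tokenizer.eos_token, tokenizer.eos_token_id)
```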

prince-canuma changed pull request status to closed

I have the same issue.
When max_tokens is set to 1024, the model doesn't stop generating.

MLX Community org

Please give me a reproducible example :)

Hi,
As suggested in the model card, but with max_tokens=1024:

```python
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")
response = generate(model, tokenizer, prompt="hello", verbose=True, max_tokens=1024)
```

The model does not stop generating text, as shown in this related discussion: huggingface.co/mlx-community/Meta-Llama-3-8B-Instruct-4bit/discussions/3

Hey @prince-canuma

Here's the sample code:

```python
from mlx_lm import load, generate
from markdown import markdown
from IPython.display import Markdown, display

model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

response = generate(
    model,
    tokenizer,
    prompt="What is 5 plus 5?",
    verbose=True,
    max_tokens=400,
)
```

And attached is the response it's giving me.


MLX Community org

Can you run the same command using the terminal and share the results?

python -m mlx_lm.generate --model ... --prompt "..."
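For this thread's model that would be, for example (values taken from the snippet above; adjust if your installed mlx_lm version uses different flags):

```shell
python -m mlx_lm.generate --model mlx-community/Meta-Llama-3-8B-Instruct-4bit --prompt "What is 5 plus 5?" --max-tokens 400
```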
prince-canuma changed pull request status to open
MLX Community org

Should setting eos_token to "<|end_of_text|>" in the configuration files do the trick? Unfortunately, it doesn't work for me.
