How to generate multiple tokens at once?

#14
by Wiselnn - opened

Nice work! I reviewed the repository and noticed that the model outputs a single token by default, and that generate.py does not include the multi-token output logic described in the paper. What should I do to validate the effectiveness of multi-token output? Thanks for your help!

This code is missing the self-speculative sampling part. Could you add it?
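
For reference, here is a minimal sketch of what greedy self-speculative decoding could look like. It is not the repository's code and makes several assumptions: the model exposes several output heads, head 0 is the ordinary next-token head, and the extra heads guess the tokens after that. `toy_heads` is a hypothetical stand-in for such a model, and the sequential verification loop stands in for what would normally be a single batched forward pass over the drafted tokens.

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token prefix to one guess per head.
Heads = Callable[[List[int]], List[int]]


def toy_heads(tokens: List[int], vocab: int = 50) -> List[int]:
    """Hypothetical 3-head model. Head 0 is the real next-token head;
    heads 1 and 2 guess the two tokens after that and are sometimes wrong."""
    last = tokens[-1]
    guesses = [(last + 1) % vocab, (last + 2) % vocab, (last + 3) % vocab]
    if last % 4 == 0:
        guesses[2] = 0  # deliberately wrong draft, so rejection is exercised
    return guesses


def greedy_decode(heads: Heads, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one token per step, using head 0 only."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(heads(out)[0])
    return out


def self_speculative_decode(
    heads: Heads, prompt: List[int], n_new: int
) -> Tuple[List[int], int]:
    """Draft with the extra heads, verify with head 0, keep the agreeing prefix.
    Returns the generated sequence and the number of decoding steps taken."""
    out = list(prompt)
    target = len(prompt) + n_new
    steps = 0
    while len(out) < target:
        draft = heads(out)
        steps += 1
        accepted = [draft[0]]  # head 0 is the reference, so always accepted
        for guess in draft[1:]:
            # Accept a drafted token only if head 0 would have produced it.
            if heads(out + accepted)[0] != guess:
                break
            accepted.append(guess)
        out.extend(accepted)
    return out[:target], steps


if __name__ == "__main__":
    tokens, steps = self_speculative_decode(toy_heads, [1], 12)
    print(tokens, steps)
```

Because every drafted token is checked against head 0, the output should match plain greedy decoding exactly; the only thing that changes is how many steps it takes. Comparing the two outputs, and counting accepted tokens per step, is one way to check whether the extra heads actually help.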
