More advanced parameters exist for the [generate] method, giving you even finer control over its behavior.
For the complete list of available parameters, refer to the API documentation.
Speculative Decoding
Speculative decoding (also known as assisted decoding) is a modification of the decoding strategies above that uses an
assistant model (ideally a much smaller one) with the same tokenizer to generate a few candidate tokens.
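
To make the idea concrete, here is a minimal, library-free sketch of the greedy variant of assisted decoding. It is an illustration under stated assumptions, not the library's implementation: `assisted_greedy_decode`, `main_next`, `assistant_next`, and `num_candidates` are hypothetical names, and each "model" is reduced to a plain function that returns the greedy next token for a sequence. In the library itself, the feature is typically enabled by passing an `assistant_model` argument to [generate].

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token sequence to its greedy next token.
NextToken = Callable[[List[int]], int]


def assisted_greedy_decode(
    main_next: NextToken,
    assistant_next: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    num_candidates: int = 4,
) -> List[int]:
    """Toy greedy assisted decoding (illustrative names, not a library API)."""
    tokens = list(prompt)
    target_len = len(prompt) + max_new_tokens
    while len(tokens) < target_len:
        # 1. The assistant drafts a few candidate tokens autoregressively.
        draft: List[int] = []
        for _ in range(min(num_candidates, target_len - len(tokens))):
            draft.append(assistant_next(tokens + draft))
        # 2. The main model checks the candidates in order. A real implementation
        #    would score all drafted positions in one forward pass; this sketch
        #    calls the main model once per position for readability.
        for candidate in draft:
            expected = main_next(tokens)
            tokens.append(expected)  # the main model's token is always the one kept
            if expected != candidate:
                break  # first mismatch: discard the rest of the draft
    return tokens


# Toy "models" over digit tokens: the main model counts upward modulo 10,
# and the assistant agrees with it except right after token 4.
main = lambda seq: (seq[-1] + 1) % 10
assistant = lambda seq: 0 if seq[-1] == 4 else (seq[-1] + 1) % 10

print(assisted_greedy_decode(main, assistant, [1], max_new_tokens=8, num_candidates=3))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because the main model's token is always the one appended, the output of this sketch matches plain greedy decoding with the main model alone. The assistant only affects how many tokens can be confirmed per verification round, which is where the speedup would come from when the check is a single batched forward pass.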