arxiv:2505.14827

Text Generation Beyond Discrete Token Sampling

Published on May 20 · Submitted by yzhuang on May 22
AI-generated summary

Mixture of Inputs (MoI), a training-free method, enhances autoregressive generation by maintaining a richer internal representation, improving text quality and performance on mathematical reasoning, code generation, and PhD-level QA tasks.

Abstract

In standard autoregressive generation, an LLM predicts the next-token distribution, samples a discrete token, and then discards the distribution, passing only the sampled token as new input. To preserve this distribution's rich information, we propose Mixture of Inputs (MoI), a training-free method for autoregressive generation. After generating a token following the standard paradigm, we construct a new input that blends the generated discrete token with the previously discarded token distribution. Specifically, we employ a Bayesian estimation method that treats the token distribution as the prior, the sampled token as the observation, and replaces the conventional one-hot vector with the continuous posterior expectation as the new model input. MoI allows the model to maintain a richer internal representation throughout the generation process, resulting in improved text quality and reasoning capabilities. On mathematical reasoning, code generation, and PhD-level QA tasks, MoI consistently improves performance across multiple models including QwQ-32B, Nemotron-Super-49B, Gemma-3-27B, and DAPO-Qwen-32B, with no additional training and negligible computational overhead.
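To make the construction concrete, here is a minimal PyTorch sketch of the blended input the abstract describes. The function name and the hyperparameter `beta` are assumptions, and the weighting shown is the textbook Dirichlet posterior mean (prior concentration `beta * p`, one observed draw); the paper's exact prior strength may differ, e.g. by scaling with the distribution's entropy.

```python
import torch

def mixture_of_inputs_embedding(logits, sampled_id, embedding, beta=1.0):
    """Blend the sampled token with its distribution (MoI-style sketch).

    Treats the next-token distribution p as a Dirichlet prior with
    concentration beta * p, the sampled token as one observed draw, and
    returns the embedding of the posterior expectation instead of the
    usual one-hot embedding lookup.
    """
    p = torch.softmax(logits, dim=-1)                     # next-token distribution
    one_hot = torch.nn.functional.one_hot(
        sampled_id, num_classes=p.shape[-1]).to(p.dtype)  # standard discrete input
    # Posterior mean of Dirichlet(beta * p) after one observed count:
    w = (beta * p + one_hot) / (beta + 1.0)
    # Continuous input: a convex mixture of token embeddings.
    return w @ embedding.weight                           # shape: (hidden_dim,)
```

With `beta = 0` this reduces to standard one-hot decoding; larger `beta` keeps more of the original distribution's mass in the next input.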

Community

Paper submitter

🤯 Your LLM just threw away 99.9% of what it knows.

Standard decoding samples one token at a time and discards the rest of the probability mass.

Mixture of Inputs (MoI) rescues that lost information, feeding it back as the next input for more nuanced expression (sketched in code after the list below).

It is a brand-new inference-time strategy!

  • No extra training
  • No architecture changes
  • Up to 10% error reduction on AIME
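Here is a hedged end-to-end sketch of such a decoding loop, feeding the blended vector back in through the `inputs_embeds` argument that Hugging Face transformers causal LMs accept. It reuses the `mixture_of_inputs_embedding` sketch above, is written from scratch for illustration (it is not the `mixinputs` package's API), and uses `gpt2` as a small stand-in for the much larger models evaluated in the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # small stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The answer is", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids)             # (1, seq_len, hidden)

with torch.no_grad():
    for _ in range(16):
        logits = model(inputs_embeds=embeds).logits[0, -1]
        next_id = torch.multinomial(torch.softmax(logits, -1), 1)
        # MoI step: feed back a mixture instead of the one-hot embedding.
        mixed = mixture_of_inputs_embedding(
            logits, next_id.squeeze(), model.get_input_embeddings())
        embeds = torch.cat([embeds, mixed.view(1, 1, -1)], dim=1)
```

For clarity the loop recomputes the full prefix each step; a practical implementation would use the KV cache, which is why the overhead is negligible.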

📕Paper: https://arxiv.org/abs/2505.14827
💻Code: https://github.com/EvanZhuang/mixinputs

Try it with: pip install mixinputs

