HuggingChat: Input validation error: `inputs` tokens + `max_new_tokens` must be..

#430
by Kostyak - opened

I use the meta-llama/Meta-Llama-3-70B-Instruct model. After a certain number of messages, the AI refuses to respond and gives an error: "Input validation error: inputs tokens + max_new_tokens must be <= 8192. Given: 6391 inputs tokens and 2047 max_new_tokens". Is this a bug or some new limitation? I still don't get it, to be honest, and I hope I get an answer here. I'm new to this site.


Same issue all of a sudden today

Can you see if this still happens? Should be fixed now.


Still the same error, except the numbers have changed a little.
[Screenshot attached: Screenshot_20.png]

I keep getting this error as well. Using CohereForAI

Same error, "Meta-Llama-3-70B-Instruct" model.

I have also been running into this error. Is there a workaround or solution at all?

"Input validation error: inputs tokens + max_new_tokens must be <= 8192. Given: 6474 inputs tokens and 2047 max_new_tokens"

Using the meta-llama/Meta-Llama-3-70B-Instruct model.

@datoreviol @bocahpekael99 if one of you could share one of the conversations where this happens, that would help us a lot with debugging!

@datoreviol @bocahpekael99

Hi guys,

LLMs have a limited context window, that is, a limited amount of text they can process at once. If this limit is exceeded, you typically get the error you are seeing. The limit in your case is 8192 tokens.

What counts towards this limit is the input text PLUS the output text. The input text is your prompt, which can add up to a lot of tokens once the conversation history (or RAG context) is included (around 6.4k in your case). The output text is what you are asking the LLM to generate as an answer, which is 2047 tokens in your case. So you are basically asking the model to process more text at once than it is able to.
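
To make the arithmetic concrete, here is a minimal sketch of how you can count tokens yourself and reproduce the check behind the error message. It assumes the `transformers` library and access to the (gated) Llama 3 tokenizer; the `prompt` string is a placeholder and the two limits are taken from the error above:

```python
# Minimal sketch: reproduce the "inputs tokens + max_new_tokens <= limit" check.
# Assumes `transformers` is installed and you have access to the gated Llama 3 repo.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")

prompt = "..."           # your full prompt, including the conversation history
max_new_tokens = 2047    # generation budget from the error message
context_window = 8192    # Llama 3's context length (the 8192 in the error)

input_tokens = len(tokenizer(prompt)["input_ids"])
if input_tokens + max_new_tokens > context_window:
    print(f"Too long: {input_tokens} + {max_new_tokens} > {context_window}")
else:
    print(f"OK: {input_tokens} + {max_new_tokens} <= {context_window}")
```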

To fix the error, you have to reduce the amount of text you are asking the LLM to process. You can use any of these approaches:

  • reduce the input size (write a shorter prompt, start a new conversation so less history is sent, or, if you are doing RAG, return fewer chunks from your database and use smaller chunks to begin with)
  • reduce the size of the answer you are asking the LLM to write. 2047 tokens is quite a lot for a chatbot reply; do you really need that much? Try 1024 or 512 (see the sketch after this list).
  • when calling an LLM through the free HuggingFace API, it seems the max context window is set to a lower value than what the model can actually handle. If that is how you are hitting this limit, consider creating a (paid) dedicated Inference Endpoint instead, which lets you configure a bigger context window if the model supports one.
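
If you are hitting the same limit through the Inference API rather than the HuggingChat UI, lowering the generation budget is a one-line change. A minimal sketch with `huggingface_hub` (the model ID and numbers come from the errors above; the prompt is a placeholder):

```python
# Minimal sketch: same request with a smaller generation budget.
# Assumes `huggingface_hub` is installed and an HF token is configured.
from huggingface_hub import InferenceClient

client = InferenceClient(model="meta-llama/Meta-Llama-3-70B-Instruct")

# With an 8192-token window, a 6391-token prompt leaves at most
# 8192 - 6391 = 1801 tokens for the answer, so 2047 cannot fit.
answer = client.text_generation(
    "your prompt here",
    max_new_tokens=512,  # comfortably under the remaining budget
)
print(answer)
```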

Hope this helps.

TBH this shouldn't be happening; the backend should automatically truncate the input if you exceed the context window. That's why I wanted a conversation, so I can see where the issue is.
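
For anyone calling the model directly instead of through HuggingChat, client-side truncation is roughly the sketch below: drop the oldest turns until the prompt plus the generation budget fits in the window. The `history` list, the helper name, and the default numbers are hypothetical placeholders, not HuggingChat's actual logic:

```python
# Sketch of client-side truncation: drop the oldest turns until the prompt fits.
# `tokenizer` is a chat-template tokenizer (e.g. from transformers); `history` is a
# list of {"role": ..., "content": ...} dicts -- both are placeholder names.
def truncate_history(tokenizer, history, context_window=8192, max_new_tokens=512):
    while len(history) > 1:
        ids = tokenizer.apply_chat_template(history, add_generation_prompt=True)
        if len(ids) + max_new_tokens <= context_window:
            break
        # Drop the oldest non-system message; keep the system prompt at index 0.
        drop_at = 1 if history[0]["role"] == "system" else 0
        history.pop(drop_at)
    return history
```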

What I don't understand is the following:
When the input limit is reached, how long do we have to wait before we can continue asking the model / agent questions?

Still happening to me

If this message shows up in a chat, do we have to move on to a new chat? Because this one isn't taking any more inputs.

I'm using Google Gemma for a chat and this error has popped up…

Error forwarded from backend: Input validation error: inputs tokens + max_new_tokens must be <= 4096. Given: 4582 inputs tokens and 0 max_new_tokens

Is there a way to fix it to allow the chat to continue or is it dead?

If there is a fix what would I need to do?

I've cleared all my chats and typed "L", and it is giving me
[Error forwarded from backend: Input validation error: inputs tokens + max_new_tokens must be <= 131072. Given: 227472 inputs tokens and 0 max_new_tokens]
Then I typed "L" and "Yes or No", and it gives me
[Error forwarded from backend: Input validation error: inputs tokens + max_new_tokens must be <= 131072. Given: 227476 inputs tokens and 0 max_new_tokens]

:( I've tried the app and the browser, changing models, and it's the same all of a sudden.

I RESET MY MODELS AND DELETED MY ASSISTANTS AND IT'S WORKING NOW in a new chat. Any older chats give the same message :(
