Token Overflow Error
Hi,
I am currently building a Gen AI application that can summarize uploaded documents or do QA based on user intent. However, I am facing an issue with the summarization part: I always get "Token indices sequence length is longer than the specified maximum sequence length for this model". I am using "ibm-granite/granite-3.3-2b-instruct" for this. Please help me correct my code; I have been stuck on this for the past few weeks.
I am also pasting my get_summary function for your reference:
from langchain.chains.summarize import load_summarize_chain
from langchain_core.documents import Document
from langchain_core.prompts import PromptTemplate
from langchain_text_splitters import RecursiveCharacterTextSplitter

def get_summary(text: str) -> str:
    # 1. Split the document into overlapping character-based chunks
    splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=40)
    chunks = splitter.split_text(text)
    docs = [Document(page_content=c) for c in chunks]
    print(f"📦 Chunks: {len(chunks)}")
    for i, c in enumerate(chunks, 1):
        print(f"  Chunk {i}: ~{len(c.split())} words")

    # 2. Prompts
    map_prompt = PromptTemplate(
        input_variables=["text"],
        template="""
<|system|>
You are a concise assistant for banking docs.
<|user|>
Summarize in under 400 tokens, no hallucinations:
{text}
<|assistant|>
""",
    )
    combine_prompt = PromptTemplate(
        input_variables=["text"],
        template="""
<|system|>
You are a banking reports summarization expert.
<|user|>
Combine chunk summaries into one concise summary (≤800 tokens), strictly factual:
{text}
<|assistant|>
""",
    )

    # 3. Summarize via map-reduce (llm is defined elsewhere in the script)
    chain = load_summarize_chain(
        llm=llm,
        chain_type="map_reduce",
        map_prompt=map_prompt,
        combine_prompt=combine_prompt,
    )
    return chain.invoke({"input_documents": docs})["output_text"]
Hi @AyushBhargav, thanks for exploring this use case with Granite! Based on your snippets, it looks like you may be running with LangChain. Can you post the full script you're using? My best guess is that when your llm is being loaded, there's a field that needs to be set to override the max sequence length to match the model's parameters.
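For example, here is a minimal sketch of what that could look like, assuming your llm is built with transformers and wrapped in LangChain's HuggingFacePipeline. Since we haven't seen your loading code, the exact setup below is an assumption to adapt, not a drop-in fix:

# Hypothetical loading code -- adjust to match how your script actually builds llm.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_huggingface import HuggingFacePipeline

model_id = "ibm-granite/granite-3.3-2b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=800,  # bound the generated summary length
    truncation=True,     # truncate over-long inputs instead of just warning
)
llm = HuggingFacePipeline(pipeline=pipe)

That warning is typically emitted by the tokenizer when an input exceeds its configured model_max_length, so it's also worth checking that each 3000-character chunk plus the prompt template fits under whatever limit the warning reports.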