Getting an error when hosting this model on GCP and using a large context

#1 opened by pulkitmehtametacube

Hi all,

We deployed this model on Vertex AI and it works fine for smaller prompts and contexts, but when we pass a 16k-token Paul Graham essay as input, we get the error below. Please suggest a fix.

google.api_core.exceptions.InternalServerError: 500 {"error":"Incomplete generation","error_type":"Incomplete generation"}
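For context, the deployment looks roughly like this. This is only a sketch with placeholder project, image, and model names, assuming the endpoint runs the Hugging Face TGI serving container and that its input/total token limits need to be raised to cover the 16k-token prompt:

```python
# Sketch: uploading and deploying with larger context limits, assuming the
# Hugging Face TGI serving container on Vertex AI. All names, the image URI,
# and the machine/accelerator choices below are illustrative placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder project/region

model = aiplatform.Model.upload(
    display_name="my-llm-tgi",
    # Placeholder TGI image URI; use the actual serving image for your setup.
    serving_container_image_uri="REGION-docker.pkg.dev/PROJECT/REPO/text-generation-inference:latest",
    serving_container_environment_variables={
        "MODEL_ID": "org/model-name",         # placeholder model id
        "MAX_INPUT_LENGTH": "16384",          # allow the 16k-token prompt
        "MAX_TOTAL_TOKENS": "18432",          # prompt tokens + generated tokens
        "MAX_BATCH_PREFILL_TOKENS": "16384",
    },
)

endpoint = model.deploy(
    machine_type="g2-standard-24",            # placeholder; pick a shape whose GPU memory fits the KV cache
    accelerator_type="NVIDIA_L4",
    accelerator_count=2,
)
```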

Not sure about Vertex AI, I've never worked with it; maybe try vLLM instead.
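If you do go the vLLM route, a minimal sketch of loading the model with a context window large enough for the 16k-token essay (the model id, length, and file path are placeholders, not values from this deployment):

```python
# Sketch: running the same model locally with vLLM instead of the TGI container,
# assuming it fits on the available GPU. Model id and lengths are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="org/model-name",   # placeholder: the model currently deployed on Vertex AI
    max_model_len=32768,      # must cover the 16k-token prompt plus the generation
)

long_prompt = open("paul_graham_essay.txt").read()  # placeholder path to the 16k-token essay
params = SamplingParams(max_tokens=512, temperature=0.7)

outputs = llm.generate([long_prompt], params)
print(outputs[0].outputs[0].text)
```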
