OOM when running inference on a single A100 (40GB)
#1 opened by QGM0413
When running the example inference for VLM2Vec, an OOM error occurred. However, the same operation with LLaVA-NeXT (7B) runs smoothly. What are the key differences between VLM2Vec and LLaVA-NeXT (7B) that could be causing the OOM error?
@QGM0413
Hi, based on my observation, VLM2Vec and LLaVA-NeXT have similar memory usage. Could you try the following to debug: 1) reduce the inference batch size, or 2) resize the images to a lower resolution (see the sketch below)?
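For reference, here is a minimal sketch of the second suggestion using Pillow; the function name and the `max_side` cap are illustrative, not part of the VLM2Vec codebase. Combined with an inference batch size of 1 (suggestion 1), this should help tell whether the OOM is driven by activation memory from large inputs.

```python
from PIL import Image

def downscale(image_path: str, max_side: int = 672) -> Image.Image:
    """Resize an image so its longer side is at most max_side pixels,
    preserving aspect ratio; smaller images are returned unchanged."""
    img = Image.open(image_path).convert("RGB")
    scale = max_side / max(img.size)
    if scale < 1.0:
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, Image.BICUBIC)
    return img

# Example: pass the downscaled image into the usual inference loop
# with batch size 1 to isolate the cause of the OOM.
small_img = downscale("example.jpg", max_side=672)
```

If the OOM disappears at a lower resolution or smaller batch, the issue is input-dependent memory rather than a difference between the two models.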