RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1695392020201/work/c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.

#1
by stzhao - opened

This is a ZeroGPU Space for a research project of ours that is about to be released. In this Space, I first run a 14B prompt enhancer and then a 2B text-to-image model. But when the denoised latent tensor is sent to the VAE decoder, I get this error:

RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1695392020201/work/c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.

Now I have run out of my quota and cannot debug, so I'd like to ask the community for help. @hysts

@hysts I really need your help ;)

Hi @stzhao, this error is usually raised when a CUDA OOM occurs.
But your Space seems to work fine for me.

It might be unrelated, but it seems you are calling .to("cuda") inside functions decorated with @spaces.GPU (https://huggingface.co/spaces/stzhao/LeX-Lumina/blob/f25d2fbc1f356718c4e9ed12c23a61395d28b9d3/app.py#L46), whereas it's recommended to call it in the global context. For example, https://huggingface.co/spaces/black-forest-labs/FLUX.1-dev/blob/2f733451dcd2c6690953bf03ced2b9d89e6546f3/app.py#L10-L15.
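Roughly like this (a minimal sketch assuming a diffusers-style pipeline; the model id and function name are just placeholders):

```python
import spaces
import torch
from diffusers import DiffusionPipeline

# Load the model and move it to CUDA in the global context,
# not inside the @spaces.GPU-decorated function.
pipe = DiffusionPipeline.from_pretrained(
    "your-org/your-t2i-model",  # placeholder model id
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")


@spaces.GPU
def generate(prompt: str):
    # Only run inference here; no .to("cuda") calls inside.
    return pipe(prompt).images[0]
```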

Thanks for the advice, I'll check it and modify the code. Yes, the Space works fine when the prompt enhancer is not enabled, but if you enable the prompt enhancer in the advanced settings, you'll get the error I mentioned above. Could you give it a try and see how to fix it? Thank you so much!

Ah, I see. Yeah, now I'm getting the error.
BTW, unrelated to the CUDA OOM issue, gr.Textbox.update in this line needs to be replaced with gr.Textbox.
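That is, something like this (a sketch; the function and variable names are just for illustration):

```python
import gradio as gr

def show_enhanced_prompt(enhanced_prompt: str):
    # Gradio 3.x style, no longer supported in Gradio 4+:
    # return gr.Textbox.update(value=enhanced_prompt)

    # Gradio 4+: return a new component instance instead.
    return gr.Textbox(value=enhanced_prompt)
```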

OK, I fixed the gr.Textbox.update bug. But I met another error: sometimes, when running the denoising process, I get this error.

[screenshot: GPU task aborted error]

The GPU task aborted error is raised when your function decorated with @spaces.GPU takes longer than the specified duration, so you might want to adjust it.
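For example (120 is just an illustrative value; pick whatever matches your actual runtime):

```python
import spaces

# Allow up to 120 seconds of GPU time for this call instead of the default.
@spaces.GPU(duration=120)
def generate(prompt: str):
    ...
```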

BTW, regarding the CUDA OOM issue, maybe you can create a separate Space for the enhancer and call it from the main Space via the Gradio API (gradio-client).
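Something along these lines from the main Space (a sketch; the Space id and api_name are assumptions about how the enhancer Space would be set up):

```python
from gradio_client import Client

# Placeholder Space id and endpoint name; adjust to the actual enhancer Space.
enhancer = Client("stzhao/LeX-Enhancer")

def enhance_prompt(prompt: str) -> str:
    # This runs on the enhancer Space's GPU, so the 14B model
    # no longer competes with the t2i model for memory here.
    return enhancer.predict(prompt, api_name="/enhance")
```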

Thank you for the advice, I'll give it a try.

[screenshot: gradio-client error]
I tried your advice and ran into this error. Could you help me with it? :)

Ah, I've forgotten that it's a bit tricky to call ZeroGPU Spaces using gradio-client. Can you try this?

Thank you, let me check this document.

@stzhao I think I've finally figured out the weird gradio-client error. Looks like it's caused by SSR, which is enabled by default on HF Spaces. Could you try setting the GRADIO_SSR_MODE environment variable to False and see if it fixes the issue?

You can set environment variables from the Space Settings.
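Alternatively, if you'd rather do it in code than in the Settings UI, I believe you can also disable SSR at launch time (assuming a recent Gradio 5.x, where `demo` is the usual Blocks instance in app.py):

```python
# Equivalent to setting GRADIO_SSR_MODE=False in the Space settings.
demo.launch(ssr_mode=False)
```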
