Getting error in image generation
Congratulations @MoonQiu on the app release, the UI looks slick!
I am unable to generate an image atm and am getting this error, any help?
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 541, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1928, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1514, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2505, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1005, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 833, in wrapper
    response = f(*args, **kwargs)
  File "/home/user/app/app.py", line 43, in infer
    result = infer_gpu_part(pipe, seed, prompt, negative_prompt, ddim_steps, guidance_scale, resolutions_list, fast_mode, cosine_scale, disable_freeu)
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 208, in gradio_handler
    res = worker.res_queue.get()
  File "/usr/local/lib/python3.10/multiprocessing/queues.py", line 367, in get
    return _ForkingPickler.loads(res)
TypeError: StableDiffusionXLPipelineOutput.__init__() missing 1 required positional argument: 'images'
Hi, thanks for your interest. This demo is still under testing. I think this problem may be caused by GPU switching on ZeroGPU, but I need more time to check because of my limited GPU quota.
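For what it's worth, the last frames of the traceback show the error being raised while the result is unpickled from the worker's result queue (`_ForkingPickler.loads`). Below is a minimal, standard-library-only sketch of how that kind of error can arise. It assumes the output object behaves like an `OrderedDict` subclass with a required constructor argument; `FakeOutput` and `roundtrip` are hypothetical names for illustration, not the diffusers class or anything from the app, and this may not be the actual cause here.

```python
import pickle
from collections import OrderedDict


class FakeOutput(OrderedDict):
    """Hypothetical stand-in for a pipeline output object (not the diffusers class)."""

    def __init__(self, images):
        super().__init__()
        self["images"] = images


def roundtrip(obj):
    """Pickle and unpickle, roughly what a multiprocessing queue does between processes."""
    return pickle.loads(pickle.dumps(obj))


out = FakeOutput(images=["img0", "img1"])

# OrderedDict pickles a subclass as "call the class with no arguments, then
# re-insert the items", so unpickling fails when __init__ requires an argument.
try:
    roundtrip(out)
    unpickle_error = None
except TypeError as exc:
    unpickle_error = exc

# Plain data (here a list standing in for the images) survives the round trip.
plain = roundtrip(list(out["images"]))
```

If this is what is happening, returning plain data such as the list of images from the GPU function, rather than the whole output object, might sidestep the error.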
@ysharma @hysts Hi, do you have any suggestions for speeding up the .cuda() operation? Since ZeroGPU needs to move the model to the GPU on each inference, it is really time-consuming. After switching to the SDXL-turbo checkpoints, inference for a 2048x2048 image takes around 10s once the model is loaded. However, loading the model and moving it to the GPU takes about 40s.
@MoonQiu
You can call pipe.to("cuda") outside of functions decorated with @spaces.GPU. On ZeroGPU, CUDA is only available inside functions decorated with it, but ZeroGPU sort of remembers that .to("cuda") was called and automatically moves models to the GPU when the decorated function is called. For example, you might want to take a look at this: https://huggingface.co/spaces/black-forest-labs/FLUX.1-dev/blob/2f733451dcd2c6690953bf03ced2b9d89e6546f3/app.py#L11-L15
When the function is executed for the first time, there is some overhead for loading the model, but the model remains on the GPU for a while after the function finishes, so the execution becomes faster afterwards. (The model will be offloaded from the GPU again after a certain amount of time since the last execution.)
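A rough sketch of that layout, in case it helps. `DummyPipeline` is a hypothetical stand-in so the snippet runs anywhere; in a real app it would be your diffusers pipeline loaded at startup. The `gpu` fallback is also just an assumption for running outside a Space, where the `spaces` package may not be installed.

```python
try:
    import spaces  # provided on ZeroGPU Spaces

    gpu = spaces.GPU
except (ImportError, AttributeError):

    def gpu(fn):
        """Local fallback (assumption) so the sketch also runs outside a Space."""
        return fn


class DummyPipeline:
    """Hypothetical stand-in for a real pipeline object."""

    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        # A real pipeline would move its weights; here we only record the target.
        self.device = device
        return self

    def __call__(self, prompt):
        return f"image for {prompt!r} on {self.device}"


# Module level: load once at startup and call .to("cuda") here,
# outside the decorated function.
pipe = DummyPipeline().to("cuda")


@gpu
def infer(prompt):
    # Only the inference call lives inside the GPU-decorated function.
    return pipe(prompt)
```

The point is just where the calls sit: model loading and .to("cuda") at import time, inference inside the decorated function.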
Thanks. This saves a lot of time.