[ BUG ] An error is reported directly without any information.

#40
by Paper99 - opened
ARC Lab, Tencent PCG org

Hi @hysts
We are encountering a strange problem: after clicking Generate, an error is reported immediately without any details. How can we solve it?
I have restarted the Space several times, but that hasn't fixed the problem.

Hi @Paper99 Hmm, it's weird. When I duplicated the Space and ran it on ZeroGPU, I got the following error. Does this error ring a bell?

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 216, in thread_wrapper
    res = future.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/user/app/app.py", line 76, in generate_image
    images = pipe(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/app/pipeline.py", line 420, in __call__
    noise_pred = self.unet(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 1112, in forward
    sample, res_samples = downsample_block(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 1160, in forward
    hidden_states = attn(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/transformer_2d.py", line 392, in forward
    hidden_states = block(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/attention.py", line 329, in forward
    attn_output = self.attn1(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 527, in forward
    return self.processor(
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1239, in __call__
    query = attn.to_q(hidden_states, *args)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 495, in forward
    result = self.base_layer(x, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 495, in call_prediction
    output = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 177, in gradio_handler
    raise res.value
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

The code doesn't seem to have been updated recently and the Space had been working fine before, so it's weird, but maybe it's due to some update in the dependency libraries?

ARC Lab, Tencent PCG org

Hi @hysts ,
Thank you for your quick response. I will proceed with debugging in the updated environment.

Paper99 changed discussion title from [ BUG ] An error is reported directly without any information. to [ BUG ] An error is reported directly without any information. (WILL BE RESOLVED WITHIN A DAY)

@Paper99 I just tried adding pipe.to(device) after this line, and it seemed to fix the issue. Not exactly sure what the underlying cause was, though.
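For reference, a minimal sketch of where that workaround goes, assuming the Space builds its pipeline roughly like this (the model IDs and the exact load/fuse calls are placeholders, not the Space's actual app.py):

```python
import torch
from diffusers import DiffusionPipeline

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative setup: load a base pipeline, load a LoRA, and fuse it.
# The model IDs below are placeholders, not the ones used by this Space.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.fuse_lora()

# The workaround: move the pipeline to the GPU *after* fusing, so any
# LoRA-related tensors that were left on the CPU end up on the same
# device as the rest of the model.
pipe.to(device)
```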

ARC Lab, Tencent PCG org

Thanks for your advice and help! It's working now.

But what's strange to me is that I already move the pipeline to the device on this line of code.
This seems to be caused by the LoRA weights not being automatically moved to the device when fuse_lora is called. This may need the attention of the diffusers maintainers.
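One way to confirm this kind of mismatch is to print which devices each pipeline component's parameters live on after calling fuse_lora; report_devices below is a hypothetical debugging helper, not part of diffusers:

```python
import torch

def report_devices(pipe) -> None:
    """Print the set of parameter devices for each nn.Module component of a pipeline."""
    for name, component in pipe.components.items():
        if isinstance(component, torch.nn.Module):
            devices = sorted({str(p.device) for p in component.parameters()})
            print(f"{name}: {devices}")

# Example usage (sketch): call after pipe.fuse_lora() and again after
# pipe.to(device); any component still listing 'cpu' would explain the
# "found at least two devices" error above.
# report_devices(pipe)
```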

Yeah, I guessed there was some change in the implementation of .fuse_lora() and tried that. Maybe @sayakpaul or @YiYiXu know something.

Can we maybe try using the most recent versions of peft and diffusers and see if this persists?
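For reference, a quick way to check which versions a duplicated Space is actually running before and after the upgrade (just a sketch with print statements):

```python
import diffusers
import peft
import torch

# Print the installed versions so they can be compared against the
# latest releases before deciding whether the upgrade changed anything.
print("diffusers:", diffusers.__version__)
print("peft:", peft.__version__)
print("torch:", torch.__version__)
```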

Paper99 changed discussion title from [ BUG ] An error is reported directly without any information. (WILL BE RESOLVED WITHIN A DAY) to [ BUG ] An error is reported directly without any information.
ARC Lab, Tencent PCG org

OK, I'll give it a try and provide feedback.

ARC Lab, Tencent PCG org

After upgrading to the latest versions of peft and diffusers, the bug is fixed.
Thanks a lot!
