runtime error

GPU CUDA not found.
/home/user/.local/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
/home/user/.local/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
Downloading shards: 0%| | 0/2 [00:00<?, ?it/s]
Downloading shards: 50%|█████ | 1/2 [01:02<01:02, 62.19s/it]
Downloading shards: 100%|██████████| 2/2 [01:25<00:00, 39.11s/it]
Downloading shards: 100%|██████████| 2/2 [01:25<00:00, 42.57s/it]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 418, in <module>
    main()
  File "/home/user/app/app.py", line 60, in main
    llama2_wrapper = LLAMA2_WRAPPER(
  File "/home/user/app/llama2_wrapper/model.py", line 99, in __init__
    self.init_model()
  File "/home/user/app/llama2_wrapper/model.py", line 103, in init_model
    self.model = LLAMA2_WRAPPER.create_llama2_model(
  File "/home/user/app/llama2_wrapper/model.py", line 146, in create_llama2_model
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3002, in _load_pretrained_model
    raise ValueError(
ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.
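The final ValueError names two possible remedies: passing an `offload_folder` to `from_pretrained`, or installing `safetensors`. The sketch below shows one way that could look. It is not taken from the app: the helper names (`build_load_kwargs`, `safetensors_available`, `load_model`), the `device_map="auto"` value, and the offload directory are assumptions for illustration, since the log does not show what `create_llama2_model` actually passes.

```python
import importlib.util
import os
import tempfile


def build_load_kwargs(offload_dir):
    """Build from_pretrained keyword arguments that include an offload folder.

    Hypothetical helper: device_map="auto" is an assumption, the log does
    not show which device_map the app uses.
    """
    # The folder must exist so offloaded weights can be written to it.
    os.makedirs(offload_dir, exist_ok=True)
    return {
        "device_map": "auto",
        "offload_folder": offload_dir,
    }


def safetensors_available():
    """Report whether the safetensors package can be imported."""
    return importlib.util.find_spec("safetensors") is not None


def load_model(model_id, offload_dir):
    """Load a causal LM with disk offload enabled (requires transformers)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id, **build_load_kwargs(offload_dir)
    )


if __name__ == "__main__":
    demo_dir = os.path.join(tempfile.gettempdir(), "offload_demo")
    print(build_load_kwargs(demo_dir))
    print("safetensors installed:", safetensors_available())
```

Note that the earlier bitsandbytes warning indicates the container has no usable GPU, so even with an offload folder the model would run on CPU and disk, which is likely to be slow.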
