runtime error

rsions of the code file, you can pin a revision.
`flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.
Current `flash-attenton` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.
Downloading shards:   0%|          | 0/6 [00:00<?, ?it/s]
Downloading shards:  17%|█▋        | 1/6 [00:09<00:48, 9.70s/it]
Downloading shards:  33%|███▎      | 2/6 [00:21<00:43, 10.96s/it]
Downloading shards:  50%|█████     | 3/6 [00:31<00:30, 10.33s/it]
Downloading shards:  67%|██████▋   | 4/6 [00:40<00:20, 10.09s/it]
Downloading shards:  83%|████████▎ | 5/6 [00:48<00:09, 9.19s/it]
Downloading shards: 100%|██████████| 6/6 [00:55<00:00, 8.51s/it]
Downloading shards: 100%|██████████| 6/6 [00:55<00:00, 9.27s/it]
Loading checkpoint shards:   0%|          | 0/6 [00:00<?, ?it/s]
Loading checkpoint shards:  17%|█▋        | 1/6 [00:02<00:10, 2.04s/it]
Loading checkpoint shards:  33%|███▎      | 2/6 [00:03<00:06, 1.56s/it]
Loading checkpoint shards:  50%|█████     | 3/6 [00:04<00:04, 1.61s/it]
Loading checkpoint shards:  83%|████████▎ | 5/6 [00:06<00:01, 1.17s/it]
Loading checkpoint shards: 100%|██████████| 6/6 [00:07<00:00, 1.22s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/home/user/app/app.py", line 24, in <module>
    def predict(message, history, temperature, max_tokens, top_p, top_k):
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/decorator.py", line 113, in _GPU
    client.startup_report()
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/client.py", line 45, in startup_report
    raise RuntimeError("Error while initializing ZeroGPU: Unknown")
RuntimeError: Error while initializing ZeroGPU: Unknown
