How much VRAM does this need to run locally?

Opened by XeIaso

It should need at least 30 GB of VRAM, right?

Natural Language Processing Group, Institute of Computing Technology, Chinese Academy of Sciences (org member):

It needs roughly 20 GB of VRAM. We tried inference on an RTX 3090 (24 GB) and it worked well.
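That figure is consistent with a back-of-envelope estimate: the 8B-parameter LLaMA backbone alone takes about 15 GiB in fp16, before the speech encoder, vocoder, KV cache, and activations. A rough sketch of the arithmetic (the 25% overhead factor is an assumption, not a measured value):

```python
# Rough VRAM estimate for Llama-3.1-8B-Omni fp16 inference.
params = 8e9            # ~8B parameters in the LLaMA-3.1 backbone
bytes_per_param = 2     # fp16 / bf16
weights_gb = params * bytes_per_param / 1024**3   # weights alone
overhead = 1.25         # assumed headroom for speech encoder, vocoder, KV cache, activations
print(f"weights: ~{weights_gb:.1f} GiB, with overhead: ~{weights_gb * overhead:.1f} GiB")
# -> weights: ~14.9 GiB, with overhead: ~18.6 GiB, in line with the ~20 GB figure
```

Which also explains why a 24 GB RTX 3090 runs it comfortably while smaller cards do not.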

An RTX 3070 Ti laptop GPU with 8 GB of VRAM does not work when running: python -m omni_speech.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path Llama-3.1-8B-Omni --model-name Llama-3.1-8B-Omni --s2s
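That matches the estimate above: 8 GB cannot hold the fp16 weights. A minimal pre-flight check, assuming PyTorch is installed, can fail fast before the worker crashes; the 20 GB threshold comes from the maintainers' figure above, and the check itself is not part of omni_speech:

```python
import subprocess
import sys

import torch

REQUIRED_GB = 20  # maintainers' figure for fp16 inference

def has_enough_vram(required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the current CUDA device reports enough free memory."""
    if not torch.cuda.is_available():
        return False
    free_bytes, _ = torch.cuda.mem_get_info()
    return free_bytes / 1024**3 >= required_gb

if __name__ == "__main__":
    if not has_enough_vram():
        sys.exit(f"Less than {REQUIRED_GB} GB of free VRAM; the model worker will likely run out of memory.")
    # Same launch command as reported above, run only once the check passes.
    subprocess.run([
        sys.executable, "-m", "omni_speech.serve.model_worker",
        "--host", "0.0.0.0",
        "--controller", "http://localhost:10000",
        "--port", "40000",
        "--worker", "http://localhost:40000",
        "--model-path", "Llama-3.1-8B-Omni",
        "--model-name", "Llama-3.1-8B-Omni",
        "--s2s",
    ], check=True)
```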
