How much VRAM does this need to run locally?

Opened by XeIaso

It should need at least 30 GB of VRAM, right?

Natural Language Processing Group, Institute of Computing Technology, Chinese Academy of Sciences (org member):

It needs roughly 20 GB of VRAM. We tried inference on an RTX 3090 (24 GB) and it worked well.
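That figure is consistent with a back-of-envelope estimate: the 8B-parameter LLaMA backbone alone takes about 15 GiB in fp16, before the speech encoder, vocoder, KV cache, and activations. A rough sketch of the arithmetic (the 25% overhead factor is an assumption, not a measured value):

```python
# Rough VRAM estimate for Llama-3.1-8B-Omni fp16 inference.
params = 8e9            # ~8B parameters in the LLaMA-3.1 backbone
bytes_per_param = 2     # fp16 / bf16
weights_gb = params * bytes_per_param / 1024**3   # weights alone
overhead = 1.25         # assumed headroom for speech encoder, vocoder, KV cache, activations
print(f"weights: ~{weights_gb:.1f} GiB, with overhead: ~{weights_gb * overhead:.1f} GiB")
# -> weights: ~14.9 GiB, with overhead: ~18.6 GiB, in line with the ~20 GB figure
```

Which also explains why a 24 GB RTX 3090 runs it comfortably while smaller cards do not.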

An RTX 3070 Ti laptop GPU with 8 GB of VRAM does not work when running: python -m omni_speech.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path Llama-3.1-8B-Omni --model-name Llama-3.1-8B-Omni --s2s
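That matches the estimate above: 8 GB cannot hold the fp16 weights. A minimal pre-flight check, assuming PyTorch is installed, can fail fast before the worker crashes; the 20 GB threshold comes from the maintainers' figure above, and the check itself is not part of omni_speech:

```python
import subprocess
import sys

import torch

REQUIRED_GB = 20  # maintainers' figure for fp16 inference

def has_enough_vram(required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the current CUDA device reports enough free memory."""
    if not torch.cuda.is_available():
        return False
    free_bytes, _ = torch.cuda.mem_get_info()
    return free_bytes / 1024**3 >= required_gb

if __name__ == "__main__":
    if not has_enough_vram():
        sys.exit(f"Less than {REQUIRED_GB} GB of free VRAM; the model worker will likely run out of memory.")
    # Same launch command as reported above, run only once the check passes.
    subprocess.run([
        sys.executable, "-m", "omni_speech.serve.model_worker",
        "--host", "0.0.0.0",
        "--controller", "http://localhost:10000",
        "--port", "40000",
        "--worker", "http://localhost:40000",
        "--model-path", "Llama-3.1-8B-Omni",
        "--model-name", "Llama-3.1-8B-Omni",
        "--s2s",
    ], check=True)
```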
