Which model should I use on my single 3090 with 32 GB RAM?

by cemo702

Arli AI org

I would run GPTQ Q4 or GGUF Q5.

With the Q5_K_L GGUF I can fit all 59 layers in 24 GB of VRAM with a 16k context size on a 4090.
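
For anyone wanting to reproduce that setup, here is a minimal sketch using llama-cpp-python; the GGUF filename is a placeholder for whichever Q5_K_L quant you downloaded, and the layer/context values simply mirror the report above:

```python
# Minimal sketch, assuming llama-cpp-python is installed with CUDA support.
from llama_cpp import Llama

llm = Llama(
    model_path="model-Q5_K_L.gguf",  # placeholder path, not a specific release file
    n_gpu_layers=59,   # offload all 59 layers to the GPU (fits in 24 GB per the report above)
    n_ctx=16384,       # 16k context window, as reported on a 4090
)

# Quick smoke test to confirm the model loaded and generates.
out = llm("Q: What is 2 + 2?\nA:", max_tokens=16)
print(out["choices"][0]["text"])
```

If you hit out-of-memory errors, lowering `n_gpu_layers` keeps the remaining layers on CPU RAM at the cost of speed.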

Arli AI org

Nice! Thanks for the report.
