4.5bpw?

#1
by ThenMagician - opened

Could you release 4.5bpw versions? I have an RTX 3060 with 12 GB, and running at 4.5bpw takes about 11.6 GB of VRAM while reaching 15 t/s at 4k context.
I don't know whether 8 head bits (h8) increases VRAM usage, but I tried some 4bpw h8 models and the improvement is quite noticeable, so maybe a 4.5bpw h8 version too?
