Quantize only LLM / Leave Vision Tower Untouched

by Jotschi - opened

Have you tried to quantize only the LLM part and leave the 400M vision tower untouched? I'm curious whether this would improve the quality of the output.
I think it is possible to iterate over the tensors using model.named_parameters() and selectively quantize the layers, skipping all "vision_tower" params, as sketched below.
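A minimal sketch of that idea, assuming the vision weights are named with a "vision_tower" prefix (as in many transformers VLM checkpoints) and using a simple symmetric int8 scheme for illustration; "model-id" is a placeholder for the actual checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder checkpoint id; substitute the real model.
model = AutoModelForCausalLM.from_pretrained("model-id")

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization: w ~ q * scale."""
    scale = w.abs().max().clamp(min=1e-8) / 127.0
    q = torch.round(w / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

quantized = {}
for name, param in model.named_parameters():
    if "vision_tower" in name:
        continue  # leave the ~400M vision tower in full precision
    quantized[name] = quantize_int8(param.data)
```

If you go through the bitsandbytes integration instead, BitsAndBytesConfig's llm_int8_skip_modules argument should achieve the same selective skipping without iterating manually.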

Jotschi changed discussion status to closed
