Anchor Large Language Models: Up to 99% KV cache reduction!

paper: https://arxiv.org/pdf/2402.07616.pdf