[Cache Request] Qwen/Qwen2-72B-Instruct-GPTQ-Int8

#143
by lukasc-ch - opened

Please add the following model (or any other Qwen2-72B-Instruct variant) to the neuron cache, to enable long-context (128k-token) experiments on the state-of-the-art model for this task.
