[Cache Request] Qwen/Qwen2-72B-Instruct-GPTQ-Int8

#143
by lukasc-ch - opened

Please add the following model (or any other Qwen2-72B-Instruct variant) to the neuron cache, to enable long-context (128k-token) experiments on the state-of-the-art model for this task.
