
How to run inference with the model?

#3 opened by frankgu3528

Can we use vLLM directly to run inference with this model?

Hi!

Our model uses exactly the same architecture as Llama-3, so technically you should be able to use vLLM just as you would with Llama-3 (though we haven't tested this, and we're not sure whether vLLM affects precision in long-context applications).
