Text model

#1
by xentnex - opened

Is it possible to replace Llama with Qwen, Mistral, etc. for the text part of Ultravox?
Is it possible to quantize the text model for faster inference (4-bit, 5-bit, 8-bit)?

Fixie.ai org

While we primarily use Llama for development, Ultravox is designed to work with most LLMs on Hugging Face. Some parts of the code may break with other models, and we welcome comments or, even better, direct contributions to the GitHub repository to address these issues.

As for quantization, it is not supported yet.

Fixie.ai org

Exactly. Just to clarify, though: using a different text model requires retraining the adapter for that downstream model. The code and configs are available at https://ultravox.ai/
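
To illustrate why the adapter is tied to the text model, here is a rough sketch. It is not Ultravox code. It assumes the adapter acts as a projection from audio-encoder features into the text model's embedding space, and every name and size in it is hypothetical.

```python
# Illustrative sketch only; not Ultravox code.
# All function names and dimensions below are hypothetical.
import random


def make_adapter(audio_dim, text_dim, seed=0):
    """Build a random (text_dim x audio_dim) matrix standing in for a trained adapter."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(audio_dim)]
            for _ in range(text_dim)]


def project(adapter, audio_frame):
    """Map one audio-encoder frame into the text model's embedding space."""
    if len(audio_frame) != len(adapter[0]):
        raise ValueError("audio frame size does not match adapter input size")
    return [sum(w * x for w, x in zip(row, audio_frame)) for row in adapter]


def check_compatible(adapter, text_hidden_size):
    """The adapter's output width must equal the text model's embedding width."""
    return len(adapter) == text_hidden_size


AUDIO_DIM = 8         # hypothetical audio-encoder feature size
MODEL_A_HIDDEN = 16   # hypothetical hidden size of the original text model
MODEL_B_HIDDEN = 12   # hypothetical hidden size of a replacement text model

adapter_a = make_adapter(AUDIO_DIM, MODEL_A_HIDDEN)
embedding = project(adapter_a, [1.0] * AUDIO_DIM)

print(len(embedding))                               # width of the projected frame
print(check_compatible(adapter_a, MODEL_A_HIDDEN))  # adapter vs. original model
print(check_compatible(adapter_a, MODEL_B_HIDDEN))  # adapter vs. replacement model
```

A width mismatch is only the most visible problem. Even when two text models share the same hidden size, their learned embedding spaces differ, so an adapter trained against one would still need retraining for the other.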
