## Typesense Public Embedding Models

We store our currently supported embedding models in this repo. You can also convert your own model to ONNX format and open a PR to add it to our list of supported models.

### Convert a model to ONNX format

#### Converting a Hugging Face Transformers Model

Follow the [Hugging Face ONNX export guide](https://huggingface.co/docs/transformers/serialization#export-to-onnx) to convert any Hugging Face model to ONNX format using ```optimum-cli```.
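
As a rough illustration, the export can be scripted from Python by shelling out to `optimum-cli export onnx`. This sketch assumes `optimum` is installed with its ONNX export extras; the helper names `build_export_command` and `export_model` are hypothetical and exist only for this example.

```python
import subprocess
from typing import List


def build_export_command(model_id: str, output_dir: str) -> List[str]:
    """Build the `optimum-cli export onnx` command for a Hugging Face model."""
    return ["optimum-cli", "export", "onnx", "--model", model_id, output_dir]


def export_model(model_id: str, output_dir: str) -> None:
    """Run the export; raises CalledProcessError if optimum-cli exits with an error."""
    subprocess.run(build_export_command(model_id, output_dir), check=True)
```

Calling `export_model` with a Hugging Face model ID and a target folder should leave the exported ONNX file in that folder; check the linked guide for the options your model may need.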

#### Converting a PyTorch Model

You can use the ```torch.onnx``` [APIs](https://pytorch.org/docs/stable/onnx.html) to convert PyTorch models to ONNX.
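
A minimal sketch of such an export with `torch.onnx.export` might look like the following. The helper names, the default opset version, and the choice to mark the first two dimensions of every tensor as dynamic are assumptions for illustration (they fit token-level inputs and outputs shaped `batch x sequence x ...`); the tensor names a given model needs are not specified here.

```python
from typing import Dict, List


def build_dynamic_axes(input_names: List[str], output_names: List[str]) -> Dict[str, Dict[int, str]]:
    """Mark dimensions 0 and 1 of every named tensor as variable-length."""
    return {name: {0: "batch", 1: "sequence"} for name in input_names + output_names}


def export_to_onnx(model, example_inputs: tuple, output_path: str,
                   input_names: List[str], output_names: List[str],
                   opset_version: int = 14) -> None:
    """Trace `model` with `example_inputs` and write an ONNX file to `output_path`."""
    import torch  # imported lazily so the helper above works without PyTorch

    model.eval()  # switch off dropout and similar training-only behavior
    torch.onnx.export(
        model,
        example_inputs,
        output_path,
        input_names=input_names,
        output_names=output_names,
        dynamic_axes=build_dynamic_axes(input_names, output_names),
        opset_version=opset_version,  # assumed value; pick one your runtime supports
    )
```

Without `dynamic_axes`, the exported graph is fixed to the shape of the example inputs, which is rarely what you want for text of varying length.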

#### Converting a TensorFlow Model

You can use the ```tf2onnx``` [tool](https://onnxruntime.ai/docs/tutorials/tf-get-started.html#getting-started-converting-tensorflow-to-onnx) to convert TensorFlow models to ONNX.
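
For a model saved in the SavedModel format, the conversion can be sketched as a call to the `tf2onnx.convert` module. This assumes `tf2onnx` and TensorFlow are installed in the current Python environment; the helper names and the default opset are hypothetical choices for this example.

```python
import subprocess
import sys
from typing import List


def build_tf2onnx_command(saved_model_dir: str, output_path: str, opset: int = 13) -> List[str]:
    """Build the `python -m tf2onnx.convert` command for a SavedModel directory."""
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--saved-model", saved_model_dir,
        "--output", output_path,
        "--opset", str(opset),  # assumed default; adjust to what your runtime supports
    ]


def convert_saved_model(saved_model_dir: str, output_path: str, opset: int = 13) -> None:
    """Run the conversion; raises CalledProcessError if tf2onnx fails."""
    subprocess.run(build_tf2onnx_command(saved_model_dir, output_path, opset), check=True)
```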

### Creating model config

Before creating a PR with your ONNX model, store the model file, the vocab file, and the model config file in a folder named after the model. The model config file must be named ```config.json``` and contain the following keys:

| Key | Description | Optional |
|-----|-------------|----------|
| model_md5 | MD5 checksum of the model file, as a string | No |
| vocab_md5 | MD5 checksum of the vocab file, as a string | No |
| model_type | Model type (currently only ```bert``` and ```xlm_roberta``` are supported) | No |
| vocab_file_name | File name of the vocab file | No |
| indexing_prefix | Prefix to be added before embedding documents | Yes |
| query_prefix | Prefix to be added before embedding queries | Yes |
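
One way to produce such a file is to compute the checksums with Python's `hashlib` and write the result with `json`. This is only a sketch: the function names `md5_of_file` and `write_model_config` are hypothetical, and the model and vocab file names are whatever you pass in.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_model_config(model_dir, model_file_name: str, vocab_file_name: str,
                       model_type: str,
                       indexing_prefix: Optional[str] = None,
                       query_prefix: Optional[str] = None) -> dict:
    """Write config.json next to the model and vocab files and return its contents."""
    model_dir = Path(model_dir)
    config = {
        "model_md5": md5_of_file(model_dir / model_file_name),
        "vocab_md5": md5_of_file(model_dir / vocab_file_name),
        "model_type": model_type,
        "vocab_file_name": vocab_file_name,
    }
    # The two prefixes are optional, so they are only written when provided.
    if indexing_prefix is not None:
        config["indexing_prefix"] = indexing_prefix
    if query_prefix is not None:
        config["query_prefix"] = query_prefix
    (model_dir / "config.json").write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Regenerate the file whenever the model or vocab file changes, since the stored checksums would otherwise no longer match.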