prithivida/miniDense_chinese_v1

prithivida committed on Jun 5

Commit 123b3b5 (1 parent: 2f80623)

Update README.md

Files changed (1)

README.md +1 -1

@@ -158,7 +158,7 @@ for query, query_embedding in zip(queries, query_embeddings):
 # FAQs:
 #### How can I reduce overall inference cost ?
-- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.
+- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.
 #### How do I reduce vector storage cost ?
 [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)
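
For readers unfamiliar with the storage FAQ above, here is a minimal sketch of what binary and scalar quantisation do to an embedding matrix. It uses plain NumPy rather than the API from the linked blog post. The random vectors stand in for real model output, and the 384-dimension size and the helper names `binary_quantize` and `scalar_quantize` are assumptions made for illustration, not part of this repository.

```python
import numpy as np


def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension and pack 8 dimensions per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def scalar_quantize(embeddings: np.ndarray):
    """Map each dimension linearly onto int8 using per-dimension min and max."""
    lo = embeddings.min(axis=0)
    hi = embeddings.max(axis=0)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # avoid division by zero for constant dimensions
    codes = np.round((embeddings - lo) / scale) - 128
    return codes.astype(np.int8), lo, scale


# Hypothetical data: 1000 float32 vectors with 384 dimensions (an assumption).
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 384)).astype(np.float32)

binary = binary_quantize(embeddings)
int8_codes, lo, scale = scalar_quantize(embeddings)

# Compare storage footprints in bytes: float32 vs int8 vs packed bits.
print(embeddings.nbytes, int8_codes.nbytes, binary.nbytes)
```

In this sketch, int8 codes take a quarter of the float32 storage and packed binary codes take one thirty-second. Retrieval quality after quantisation depends on the model and dataset, so it is worth measuring on your own data.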