juanfkurucz commited on
Commit
596bfe9
1 Parent(s): 2d22c94

Add blog link

Browse files
Files changed (1) hide show
  1. README.md +2 -0
README.md CHANGED
@@ -15,6 +15,8 @@ metrics:
15
 
16
  This is an optimized model using [madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1](https://huggingface.co/madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1) as the base model which was created using the [nn_pruning](https://github.com/huggingface/nn_pruning) python library. This is a pruned model of [madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2](https://huggingface.co/madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2)
17
 
 
 
18
  Our final optimized model weighs **579 MB**, has an inference speed of **18.184 ms** on a Tesla T4 and has a performance of **82.68%** best F1. Below there is a comparison for each base model:
19
 
20
  | Model | Weight | Throughput on Tesla T4 | Best F1 |
 
15
 
16
  This is an optimized model using [madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1](https://huggingface.co/madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1) as the base model which was created using the [nn_pruning](https://github.com/huggingface/nn_pruning) python library. This is a pruned model of [madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2](https://huggingface.co/madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2)
17
 
18
+ Feel free to read our blog about how we optimized this model [(link)](https://tryolabs.com/blog/2022/11/24/transformer-based-model-for-faster-inference)
19
+
20
  Our final optimized model weighs **579 MB**, has an inference speed of **18.184 ms** on a Tesla T4 and has a performance of **82.68%** best F1. Below there is a comparison for each base model:
21
 
22
  | Model | Weight | Throughput on Tesla T4 | Best F1 |