Felix Marty committed on
Commit b93cbc9
1 Parent(s): 1fbf25f

better readme

Files changed (1): README.md (+78 -3)
---
license: apache-2.0
tags:
- vision
- image-classification
datasets:
- imagenet-1k
---

This model is a fork of [facebook/levit-256](https://huggingface.co/facebook/levit-256), where:

* `nn.BatchNorm2d` and `nn.Conv2d` are fused
* `nn.BatchNorm1d` and `nn.Linear` are fused

and the optimized model is converted to the ONNX format.

```python
import torch

from optimum.onnxruntime.modeling_ort import ORTModelForImageClassification
from transformers import AutoModelForImageClassification

pt_model = AutoModelForImageClassification.from_pretrained("facebook/levit-256")
pt_model.eval()

ort_model = ORTModelForImageClassification.from_pretrained("fxmarty/levit-256-onnx")

inp = {"pixel_values": torch.rand(1, 3, 224, 224)}

with torch.no_grad():
    res = pt_model(**inp)
res_ort = ort_model(**inp)

assert torch.allclose(res.logits, res_ort.logits, atol=1e-4)
```
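The batch normalization folding described above can be reproduced in plain PyTorch with `torch.nn.utils.fusion.fuse_conv_bn_eval` (recent PyTorch also provides `fuse_linear_bn_eval` for the `nn.BatchNorm1d`/`nn.Linear` case). The sketch below uses toy layers, not the LeViT-specific pipeline:

```python
import torch
from torch.nn.utils.fusion import fuse_conv_bn_eval

# Toy layers standing in for a conv + batch-norm pair; fusion assumes eval mode.
conv = torch.nn.Conv2d(3, 8, kernel_size=3).eval()
bn = torch.nn.BatchNorm2d(8).eval()

# Give the BatchNorm non-trivial running statistics so the check is meaningful.
with torch.no_grad():
    bn.running_mean.uniform_(-1.0, 1.0)
    bn.running_var.uniform_(0.5, 1.5)

# Fold the BatchNorm affine transform into the convolution's weights and bias,
# producing a single Conv2d that computes bn(conv(x)) in one operation.
fused = fuse_conv_bn_eval(conv, bn)

x = torch.rand(1, 3, 32, 32)
with torch.no_grad():
    expected = bn(conv(x))
    got = fused(x)
assert torch.allclose(expected, got, atol=1e-5)
```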

## Benchmarking

More than 2x the throughput with batch normalization folding and ONNX Runtime 🔥

```
PyTorch runtime:

{'latency_50': 22.3024695,
 'latency_90': 23.1230725,
 'latency_95': 23.2653985,
 'latency_99': 23.60095705,
 'latency_999': 23.865580469999998,
 'latency_mean': 22.442956878923766,
 'latency_std': 0.46544295612971265,
 'nb_forwards': 446,
 'throughput': 44.6}

Optimum-onnxruntime runtime:

{'latency_50': 9.302445,
 'latency_90': 9.782875,
 'latency_95': 9.9071944,
 'latency_99': 11.084606999999997,
 'latency_999': 12.035858692000001,
 'latency_mean': 9.357703552853133,
 'latency_std': 0.4018553286992142,
 'nb_forwards': 1069,
 'throughput': 106.9}
```

These results were obtained with the following script:

```python
from optimum.runs_base import TimeBenchmark

from pprint import pprint

time_benchmark_ort = TimeBenchmark(
    model=ort_model,
    batch_size=1,
    input_length=224,
    model_input_names={"pixel_values"},
    warmup_runs=10,
    duration=10
)

results_ort = time_benchmark_ort.execute()

with torch.no_grad():
    time_benchmark_pt = TimeBenchmark(
        model=pt_model,
        batch_size=1,
        input_length=224,
        model_input_names={"pixel_values"},
        warmup_runs=10,
        duration=10
    )

    results_pt = time_benchmark_pt.execute()

print("PyTorch runtime:\n")
pprint(results_pt)

print("\nOptimum-onnxruntime runtime:\n")
pprint(results_ort)
```
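As a sanity check on the numbers above: the reported `throughput` is simply `nb_forwards` divided by the 10-second benchmark duration, which confirms the better-than-2x claim:

```python
# Reported figures from the benchmark results above.
nb_forwards_pt, nb_forwards_ort = 446, 1069
duration_s = 10  # `duration=10` in TimeBenchmark

throughput_pt = nb_forwards_pt / duration_s    # 44.6 images/s
throughput_ort = nb_forwards_ort / duration_s  # 106.9 images/s

speedup = throughput_ort / throughput_pt
print(f"Speedup: {speedup:.2f}x")  # ~2.40x
```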