Fix for onnxruntime-gpu slow first run

#5
by ehrrh - opened

With this model, if you set providers=["CUDAExecutionProvider", "CPUExecutionProvider"] to run ONNX on the GPU, the first inference spends a long time doing something on the GPU before it finally starts tagging everything quickly. This happens with both onnxruntime-gpu 1.18.1 and 1.19.0. If the number of images to tag is small, CPU ends up being faster overall because of the long wait before the first image on GPU. Why is that?


Ok, found the actual fix: https://github.com/microsoft/onnxruntime/issues/19838

```python
import onnxruntime as rt

# HEURISTIC skips the default exhaustive cuDNN convolution algorithm search,
# which is what causes the long delay before the first GPU inference.
cuda_options = {"cudnn_conv_algo_search": "HEURISTIC"}

model = rt.InferenceSession(
    model_path,
    providers=[("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"],
)
```
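For anyone wiring this into a script, here is a minimal sketch of how the fix can slot into session creation. The helper name `build_providers` and the `"model.onnx"` path are assumptions for illustration, not part of the linked issue; the option itself (`cudnn_conv_algo_search`) is a documented CUDAExecutionProvider setting.

```python
def build_providers():
    # HEURISTIC picks a cuDNN conv algorithm from a lookup table instead of
    # benchmarking every candidate (EXHAUSTIVE, the default), so the first
    # inference no longer stalls while the search runs.
    cuda_options = {"cudnn_conv_algo_search": "HEURISTIC"}
    # Falling back to CPUExecutionProvider keeps the session usable on
    # machines without a CUDA-capable GPU.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]

# Usage (requires onnxruntime-gpu and a model file):
# import onnxruntime as rt
# session = rt.InferenceSession("model.onnx", providers=build_providers())
```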

Keeping this open for awareness. I'm assuming you reported this upstream too?


Yeah, it seems to be a known problem: https://github.com/microsoft/onnxruntime/issues/10746
