Failed to set execution provider: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
I have CUDA 13:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Aug_20_13:58:20_Pacific_Daylight_Time_2025
Cuda compilation tools, release 13.0, V13.0.88
Build cuda_13.0.r13.0/compiler.36424714_0
...and Python 3.12, an RTX 4050 with 6 GB VRAM, and an i5-13450HX CPU.
I downloaded nexa-cli_windows_x86_64_cuda.exe and ran this as admin in CMD (I selected Q4 quantization):
nexa infer NexaAI/DeepSeek-OCR-GGUF
Migration completed.
model not found, start download
downloading 100% |██████████████████████████████████████████████████████████████████████████| (2.5/2.5 GB, 24 MB/s) [1m39s:0s]
Download success!
⚠️ Oops. Model failed to load.
Try these:
- Verify your system meets the model's requirements.
- Seek help in our discord or slack.
And after deleting the model files and redownloading:
nexa infer NexaAI/DeepSeek-OCR-GGUF-CUDA
model not found, start download
downloading 100% |██████████████████████████████████████████████████████████████████████████| (2.5/2.5 GB, 22 MB/s) [1m46s:0s]
Download success!
2025-12-05 03:48:48.9980443 [E:onnxruntime:nexaml-ort-model, provider_bridge_ort.cc:1938 onnxruntime::CudaProviderFactoryCreator::Create] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1778 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : Error loading "C:\Users\User\AppData\Local\Nexa CLI\nexa_cuda\onnxruntime_providers_cuda.dll" which depends on "cublasLt64_12.dll" which is missing. (Error 126: "The specified module could not be found.")
Failed to set execution provider: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
Falling back to CPU execution provider
Send a message, press /? for help
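For anyone reproducing this: the error names cublasLt64_12.dll, and the _12 suffix suggests it ships with a CUDA 12.x toolkit rather than the 13.0 one I have installed. Below is a minimal Python sketch for checking whether any directory on PATH contains that DLL. The helper name find_on_path is hypothetical (it is not part of Nexa CLI or ONNX Runtime), and it only looks at PATH, which is just one of the places Windows searches for a dependent DLL, so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def find_on_path(filename, path_env=None):
    """Return the PATH directories that contain `filename`.

    Hypothetical helper for illustration; it only inspects PATH entries.
    """
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    hits = []
    for entry in path_env.split(os.pathsep):
        entry = entry.strip('"')  # PATH entries are sometimes quoted on Windows
        if entry and (Path(entry) / filename).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # DLL name taken from the ONNX Runtime error message in the log.
    dll = "cublasLt64_12.dll"
    found = find_on_path(dll)
    if found:
        print(f"{dll} found in:")
        for directory in found:
            print("  " + directory)
    else:
        print(f"{dll} is not in any PATH directory")
```

Running the same check with "cublasLt64_13.dll" would show whether only the CUDA 13 build of the library is present.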