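Configuring onnxruntime-gpu on a Linux Server

This tutorial shows how to install onnxruntime-gpu so that ONNX inference is GPU-accelerated using only the CUDA packages inside the Python environment, without depending on the CUDA and cuDNN installed on the server host. To match the inference node, everything is configured in the base environment; no new virtual environment is created.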

Upgrade pip

pip install --upgrade pip

Install `PyTorch`

First, check which CUDA version the system driver supports

# nvidia-smi
Thu Jan 16 01:04:13 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.12             Driver Version: 535.104.12   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A800 80GB PCIe          On  | 00000000:38:00.0 Off |                    0 |
| N/A   46C    P0              71W / 300W |    435MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

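The header above reports CUDA Version: 12.2, so we can install a GPU build of Torch for CUDA 12.2 or lower. The install commands are listed on the official Previous PyTorch Versions | PyTorch page: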

conda install pytorch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 pytorch-cuda=12.1 -c pytorch -c nvidia

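Enter y when prompted. Here we choose Torch 2.5.0 built for CUDA 12.1. After the installation finishes, it is worth running the same command once more, because network problems sometimes leave some dependency packages only partially installed.

Test that the GPU build was installed successfully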
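The version check against the nvidia-smi header can also be scripted. The following is a minimal sketch, not part of the original setup: the helper names `parse_driver_cuda_version` and `toolkit_is_supported` are hypothetical, and the comparison only encodes the simple rule used in this article (pick a CUDA build no newer than the version shown in the header).

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(smi_output):
    """Return (major, minor) from the 'CUDA Version: X.Y' field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_is_supported(toolkit, driver_max):
    """Rule of thumb from this article: the build must not be newer than the driver maximum."""
    return toolkit <= driver_max


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        output = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True
        ).stdout
        driver_max = parse_driver_cuda_version(output)
        print("driver reports CUDA version:", driver_max)
        if driver_max is not None:
            # (12, 1) corresponds to the pytorch-cuda=12.1 build chosen above.
            print("CUDA 12.1 build usable:", toolkit_is_supported((12, 1), driver_max))
```

With the header shown earlier (CUDA Version: 12.2), the parser yields `(12, 2)`, so the CUDA 12.1 build passes the check.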

python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0), torch.version.cuda)"

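Install `onnxruntime-gpu`

Check available versions

Requesting an arbitrarily high version number that does not exist makes pip list the onnxruntime-gpu versions that can be installed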

pip install onnxruntime-gpu==1.88

The error output lists every installable version

from versions: 1.11.0, 1.11.1, 1.12.0, 1.12.1, 1.13.1, 1.14.0, 1.14.1, 1.15.0, 1.15.1, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.17.0, 1.17.1, 1.18.0, 1.18.1, 1.19.0, 1.19.2)
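If the version list needs to be filtered, the pip output can be parsed. This is an illustrative sketch only: `parse_available_versions` and `versions_in_series` are hypothetical helper names, and the parser assumes the output contains a `from versions: ...)` fragment like the one above.

```python
import re


def parse_available_versions(pip_output):
    """Extract the comma-separated list that follows 'from versions:' in pip's output."""
    match = re.search(r"from versions:\s*([^)]*)\)", pip_output)
    if match is None:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


def versions_in_series(versions, series):
    """Keep only the versions in a given series, e.g. '1.19' keeps 1.19.0 and 1.19.2."""
    prefix = series + "."
    return [v for v in versions if v.startswith(prefix)]


if __name__ == "__main__":
    sample = "(from versions: 1.17.0, 1.17.1, 1.18.0, 1.18.1, 1.19.0, 1.19.2)"
    versions = parse_available_versions(sample)
    print(versions_in_series(versions, "1.19"))
```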

Uninstall existing versions

Uninstall any installed onnxruntime-gpu and onnxruntime

pip uninstall onnxruntime-gpu onnxruntime

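Check the `libcublasLt.so.` version

We also need to check which version of libcublasLt.so. is installed, because each libcublasLt.so. version supports different onnxruntime-gpu releases. The table below shows the correspondence: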

`libcublasLt.so.` major version	`onnxruntime-gpu`
11	1.18.x, 1.17.x
12	1.19.x
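The table can be expressed as a small lookup. This is only a sketch of the mapping stated in this article; the dictionary and the helper names `cublaslt_major` and `ort_series_for` are hypothetical, and the mapping should be checked against the official requirements page for other versions.

```python
import re

# Mapping copied from the table in this article (an assumption for any other release).
ORT_SERIES_BY_CUBLASLT_MAJOR = {
    11: ["1.18", "1.17"],
    12: ["1.19"],
}


def cublaslt_major(path):
    """Return the major version encoded in a libcublasLt.so.N file name, or None."""
    match = re.search(r"libcublasLt\.so\.(\d+)", path)
    return int(match.group(1)) if match else None


def ort_series_for(path):
    """Look up the onnxruntime-gpu series listed in the table for this library file."""
    return ORT_SERIES_BY_CUBLASLT_MAJOR.get(cublaslt_major(path), [])


if __name__ == "__main__":
    print(ort_series_for("/opt/conda/lib/libcublasLt.so.12"))
```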

To find the libcublasLt.so. version, we first need to install the mlocate package

sudo apt-get update
sudo apt-get install mlocate

Print the location of libcublasLt.so. to see which version is installed

updatedb
locate libcublasLt.so.11          
locate libcublasLt.so.12

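Whichever command prints a path shows the version that is installed and where it lives; choose the onnxruntime-gpu version according to the table above.

Then, based on your CUDA version, look up the matching onnxruntime-gpu version on the official NVIDIA - CUDA | onnxruntime page.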
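Where installing mlocate is not convenient, a similar search can be done from Python. The sketch below is an alternative to `locate`, not the method used above; `find_shared_libs` is a hypothetical helper that walks a directory tree with the standard library.

```python
import fnmatch
import os


def find_shared_libs(root, pattern="libcublasLt.so.*"):
    """Walk `root` and return sorted paths of files whose name matches `pattern`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

For example, `find_shared_libs("/opt/conda")` would search the conda installation used in this article; walking `/` works too but can take a long time.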


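Uninstall the old onnxruntime-gpu and install the new version. For my CUDA 12.1 setup, the matching version is onnxruntime-gpu 1.19.0 (for example, `pip install onnxruntime-gpu==1.19.0`).

Troubleshooting

NumPy version problem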
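After installing, a quick check can confirm that the CUDA provider is present in the installed build. This is a minimal sketch: `cuda_provider_available` is a hypothetical helper, while `onnxruntime.get_available_providers()` is the library call it relies on. Note the assumption that this list only reflects what the build includes; missing CUDA libraries may still surface later, when a session is created.

```python
def cuda_provider_available(providers):
    """True if the CUDA execution provider appears in the given provider list."""
    return "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        providers = ort.get_available_providers()
        print("onnxruntime", ort.__version__, providers)
        print("CUDA provider listed:", cuda_provider_available(providers))
```

Once a model is loaded, calling `get_providers()` on the `InferenceSession` shows which providers that session actually uses.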

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Solution: uninstall the current NumPy
pip uninstall numpy
Install a 1.x release
pip install numpy==1.24.1 -i https://pypi.tuna.tsinghua.edu.cn/simple

libcublasLt.so environment variable problem
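To confirm which NumPy release ended up installed without importing it (the import itself is what can trigger the warning), the package metadata can be queried. A small sketch, with `is_numpy_1x` as a hypothetical helper name:

```python
from importlib import metadata


def is_numpy_1x(version):
    """True if the version string belongs to the NumPy 1.x series."""
    return version.split(".")[0] == "1"


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed")
    else:
        status = "1.x, OK" if is_numpy_1x(installed) else "2.x or newer, consider 'numpy<2'"
        print("numpy", installed, "->", status)
```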

2025-01-16 05:52:20.219748146 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1637 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.12: cannot open shared object file: No such file or directory

2025-01-16 05:52:20.220658808 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they’re in the PATH, and that your GPU is supported.
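When this error appears, first check that the correct onnxruntime-gpu version is installed; in most cases the cause is a version that does not match CUDA. If the version is correct, the problem is most likely the environment variables, which can be fixed as follows.
Solution: find the location of libcublasLt.so.12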
apt-get install sudo
sudo find / -name libcublasLt.so.12
Edit the shell configuration
vim ~/.bashrc
Add the environment variable
export LD_LIBRARY_PATH=/opt/conda/lib:$LD_LIBRARY_PATH
Apply the change
source ~/.bashrc
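Whether the dynamic loader can now find the library can be checked from Python with `ctypes`. This is a sketch under the assumption that the check runs in a new process started after `source ~/.bashrc`, since `LD_LIBRARY_PATH` is read when the process starts; `can_load` is a hypothetical helper name.

```python
import ctypes


def can_load(library_name):
    """Try to load a shared library by name; return False if the loader cannot find it."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = "libcublasLt.so.12"
    print(name, "loadable" if can_load(name) else "NOT loadable")
```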

libcudnn.so environment variable problem

2025-01-16 04:23:22.326215464 [E:onnxruntime:Default, provider_bridge_ort.cc:1548 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1209 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn.so.8: cannot open shared object file: No such file or directory

2025-01-16 04:23:22.326932000 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:861 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirementsto ensure all dependencies are met.
Solution: find the location of libcudnn.so.8
sudo find / -name libcudnn.so.8
Edit the shell configuration
vim ~/.bashrc
Add the environment variable
export LD_LIBRARY_PATH=/opt/conda/lib/python3.9/site-packages/torch/lib:$LD_LIBRARY_PATH
Apply the change
source ~/.bashrc
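The `torch/lib` directory depends on the Python version and install location, so the path above may differ on other machines. The sketch below derives it from the installed package instead of hard-coding it; `package_lib_dir` is a hypothetical helper, and it assumes the package keeps its shared libraries in a `lib` subdirectory, as the path in this article suggests for torch.

```python
import importlib.util
import os


def package_lib_dir(package="torch"):
    """Return <package dir>/lib if the package is installed and the directory exists."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    lib_dir = os.path.join(os.path.dirname(spec.origin), "lib")
    return lib_dir if os.path.isdir(lib_dir) else None


if __name__ == "__main__":
    lib_dir = package_lib_dir("torch")
    if lib_dir is None:
        print("torch/lib not found in this environment")
    else:
        print("export LD_LIBRARY_PATH=" + lib_dir + ":$LD_LIBRARY_PATH")
```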

Summary of the environment variables to add

export LD_LIBRARY_PATH=/opt/conda/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/conda/lib/python3.9/site-packages/torch/lib:$LD_LIBRARY_PATH
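Re-sourcing `~/.bashrc` several times prepends the same directories again. A small helper can build the value without duplicates; this is an illustrative sketch, and `prepend_library_paths` is a hypothetical name rather than anything provided by the tools above.

```python
import os


def prepend_library_paths(current, new_dirs):
    """Prepend new_dirs to a colon-separated path string, dropping duplicates and empties."""
    existing = [p for p in current.split(":") if p]
    merged = []
    for path in list(new_dirs) + existing:
        if path not in merged:
            merged.append(path)
    return ":".join(merged)


if __name__ == "__main__":
    value = prepend_library_paths(
        os.environ.get("LD_LIBRARY_PATH", ""),
        [
            "/opt/conda/lib/python3.9/site-packages/torch/lib",
            "/opt/conda/lib",
        ],
    )
    print("export LD_LIBRARY_PATH=" + value)
```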

Install the `ultralytics` package

pip install ultralytics -i https://pypi.tuna.tsinghua.edu.cn/simple
