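Installing CUDA 11.6+ on Ubuntu
The following two blog posts are useful references:
Installing multiple CUDA versions on Ubuntu 22.04 and switching between them quickly (CUDA 11.1 and 11.8)
[Linux] Installing multiple CUDA versions side by side on one machine (switching CUDA versions)
Installing CUDA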
https://developer.nvidia.com/cuda-toolkit-archive
Find CUDA 11.8 in the archive.
Select, in order: Linux, x86_64, Ubuntu, 22.04, runfile (local).
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run
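Run the two commands above.
In the installer menu, press Enter to uncheck Driver, then use the arrow keys to move to Install at the bottom and press Enter.
When prompted, choose no.
Once the installer prints its summary, CUDA has been installed successfully.
We leave the environment variables unchanged for now so that the existing CUDA environment is not affected. A script will handle the switching later.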
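Because PATH has not been changed, a plain nvcc still resolves to the previously configured toolkit. The following is a minimal sketch for checking the new installation by its absolute path instead. It assumes the installer used its default location /usr/local/cuda-11.8, and toolkit_nvcc is a hypothetical helper name, not part of CUDA.

```shell
# Print the path of nvcc inside a given toolkit directory, or fail if it is missing.
# toolkit_nvcc is a hypothetical helper; /usr/local/cuda-11.8 is the assumed default install path.
toolkit_nvcc() {
    nvcc_bin="$1/bin/nvcc"
    if [ -x "$nvcc_bin" ]; then
        printf '%s\n' "$nvcc_bin"
    else
        echo "nvcc not found under $1" >&2
        return 1
    fi
}

# Run the new compiler directly; PATH and LD_LIBRARY_PATH stay untouched.
if nvcc_bin=$(toolkit_nvcc /usr/local/cuda-11.8); then
    "$nvcc_bin" --version
fi
```

If the toolkit was installed, the version banner should mention release 11.8 while the shell's default nvcc remains whatever it was before.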
Installing cuDNN
https://developer.nvidia.com/rdp/cudnn-archive
Download cuDNN from here.
Then run:
cd /usr/local/
ls
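The listing now shows three CUDA version directories.
Extract the downloaded cuDNN archive, copy its headers and libraries into the matching CUDA directory, and grant read permission on them: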
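# cuDNN 8 and later ship several cudnn*.h headers, so copy all of them
sudo cp include/cudnn*.h /usr/local/cuda-11.8/include
# -P preserves the symbolic links between the libcudnn files
sudo cp -P lib/libcudnn* /usr/local/cuda-11.8/lib64
sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h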
sudo chmod a+r /usr/local/cuda-11.8/lib64/libcudnn*
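To confirm that the copy worked, the cuDNN version can be read back from the installed headers. This is a minimal sketch under two assumptions: the release is cuDNN 8 or later, where the version macros live in cudnn_version.h, and that header was copied along with cudnn.h. cudnn_header_version is a hypothetical helper name.

```shell
# Print the cuDNN version recorded in cudnn_version.h as MAJOR.MINOR.PATCHLEVEL.
# cudnn_header_version is a hypothetical helper; it assumes cuDNN 8 or later,
# where the version macros live in cudnn_version.h.
cudnn_header_version() {
    header="$1/cudnn_version.h"
    [ -r "$header" ] || { echo "cannot read $header" >&2; return 1; }
    major=$(sed -n 's/^#define CUDNN_MAJOR \([0-9][0-9]*\).*/\1/p' "$header")
    minor=$(sed -n 's/^#define CUDNN_MINOR \([0-9][0-9]*\).*/\1/p' "$header")
    patch=$(sed -n 's/^#define CUDNN_PATCHLEVEL \([0-9][0-9]*\).*/\1/p' "$header")
    printf '%s.%s.%s\n' "$major" "$minor" "$patch"
}

# Only query the real install location if the header is actually there.
if [ -r /usr/local/cuda-11.8/include/cudnn_version.h ]; then
    cudnn_header_version /usr/local/cuda-11.8/include
fi
```

The printed version should match the cuDNN archive that was downloaded.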
Switching CUDA versions
Use the following command to create a script named switch-cuda.sh in /usr/local/:
sudo vim /usr/local/switch-cuda.sh
Paste the following code into it:
#!/usr/bin/env bash

# Copyright (c) 2018 Patrick Hohenecker
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# author: Patrick Hohenecker <mail@paho.at>
# version: 2018.1
# date: May 15, 2018

set -e

# ensure that the script has been sourced rather than just executed
if [[ "${BASH_SOURCE[0]}" = "${0}" ]]; then
    echo "Please use 'source' to execute switch-cuda.sh!"
    exit 1
fi

INSTALL_FOLDER="/usr/local" # the location to look for CUDA installations at
TARGET_VERSION=${1} # the target CUDA version to switch to (if provided)

# if no version to switch to has been provided, then just print all available CUDA installations
if [[ -z ${TARGET_VERSION} ]]; then
    echo "The following CUDA installations have been found (in '${INSTALL_FOLDER}'):"
    ls -l "${INSTALL_FOLDER}" | egrep -o "cuda-[0-9]+\\.[0-9]+$" | while read -r line; do
        echo "* ${line}"
    done
    set +e
    return
# otherwise, check whether there is an installation of the requested CUDA version
elif [[ ! -d "${INSTALL_FOLDER}/cuda-${TARGET_VERSION}" ]]; then
    echo "No installation of CUDA ${TARGET_VERSION} has been found!"
    set +e
    return
fi

# the path of the installation to use
cuda_path="${INSTALL_FOLDER}/cuda-${TARGET_VERSION}"

# filter out those CUDA entries from the PATH that are not needed anymore
path_elements=(${PATH//:/ })
new_path="${cuda_path}/bin"
for p in "${path_elements[@]}"; do
    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
        new_path="${new_path}:${p}"
    fi
done

# filter out those CUDA entries from the LD_LIBRARY_PATH that are not needed anymore
ld_path_elements=(${LD_LIBRARY_PATH//:/ })
new_ld_path="${cuda_path}/lib64:${cuda_path}/extras/CUPTI/lib64"
for p in "${ld_path_elements[@]}"; do
    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
        new_ld_path="${new_ld_path}:${p}"
    fi
done

# update environment variables
export CUDA_HOME="${cuda_path}"
export CUDA_ROOT="${cuda_path}"
export LD_LIBRARY_PATH="${new_ld_path}"
export PATH="${new_path}"

echo "Switched to CUDA ${TARGET_VERSION}."

set +e
return
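The core of the script is the filtering step: every PATH and LD_LIBRARY_PATH entry that points into an old CUDA directory is dropped before the new one is prepended. The sketch below illustrates that idea in isolation. It is written in plain POSIX shell rather than reproducing the script's bash arrays, and strip_cuda_entries is a hypothetical helper name.

```shell
# Drop every colon-separated entry that starts with "<install_folder>/cuda".
# strip_cuda_entries is a hypothetical helper that mirrors the filtering idea in switch-cuda.sh.
strip_cuda_entries() {
    list="$1"
    install_folder="$2"
    result=""
    old_ifs=$IFS
    IFS=:
    for entry in $list; do
        case "$entry" in
            "$install_folder"/cuda*) ;;                # stale CUDA entry: skip it
            *) result="${result:+$result:}$entry" ;;   # keep everything else, in order
        esac
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

strip_cuda_entries "/usr/local/cuda-11.1/bin:/usr/bin:/bin" "/usr/local"
# prints /usr/bin:/bin
```

Prepending the chosen toolkit's bin directory to the filtered list then gives a PATH that contains exactly one CUDA installation.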
Press ESC, type :x, and press Enter to save and exit.
Then change to /usr/local/ again:
cd /usr/local/
source switch-cuda.sh
With no argument, the script lists the CUDA versions currently installed.
Then run the following command to switch to the newly installed 11.8:
source switch-cuda.sh 11.8
The script reports that it has switched to CUDA 11.8. To confirm, check once more
by running:
nvcc -V
The output shows that the version has changed.
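The same check can be scripted, which is convenient before a long build. The sketch below extracts the release number from the version banner and compares it with the expected 11.8. parse_nvcc_release is a hypothetical helper, and it relies on the "release X.Y" wording in the nvcc banner.

```shell
# Extract "MAJOR.MINOR" from the text that `nvcc -V` prints (read from stdin).
# parse_nvcc_release is a hypothetical helper; it relies on the "release X.Y"
# wording in the nvcc version banner.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only query nvcc if it is actually on PATH in this shell.
if command -v nvcc >/dev/null 2>&1; then
    active=$(nvcc -V | parse_nvcc_release)
    echo "nvcc reports CUDA $active (CUDA_HOME=${CUDA_HOME:-unset})"
    [ "$active" = "11.8" ] || echo "warning: expected 11.8" >&2
fi
```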
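Installing Mamba
Reference tutorial:
[Mamba installation] 99% of people get this wrong! A step-by-step fix for the selective_scan_cuda conflict
Environment requirements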
GitHub - state-spaces/s4: Structured state space sequence models
These are the environment requirements of the original Mamba.
As shown there, it requires:
- Python 3.9+
- PyTorch 1.10+
- CUDA 11.6+
GitHub - hustvl/Vim: [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
These are the deployment requirements of Vim (Vision Mamba).
We follow Vim's requirements here, because Vim depends on Mamba:
- Python 3.10
- PyTorch 2.1.1
- CUDA 11.8
Creating a virtual environment
Create a virtual environment named Vim and activate it:
conda create -n Vim python=3.10
conda activate Vim
Look up the matching PyTorch install command here:
https://pytorch.org/
Find PyTorch 2.1.1:
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=11.8 -c pytorch -c nvidia
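Before going further, it is worth confirming which CUDA version this PyTorch build targets, since the extensions compiled later have to match it. A minimal sketch, assuming the Vim environment is activated so that python resolves to its interpreter; torch_summary is a hypothetical helper name.

```shell
# Print the PyTorch version, the CUDA version it was built with, and whether a GPU is visible.
# torch_summary is a hypothetical helper; pass another interpreter as the first argument if needed.
torch_summary() {
    "${1:-python}" -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
}

torch_summary || echo "torch could not be imported with this interpreter" >&2
```

With the install command above, the second field should read 11.8.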
Install packaging:
conda install packaging
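Installing causal-conv1d and mamba-ssm (the key step!)
First, it is worth reading some earlier experience reports and pitfall notes:
[Best practice] Installing CUDA inside a conda environment and installing Mamba
causal_conv1d and mamba_ssm cannot be installed directly with pip install when running a Mamba project
In short, the usual approach is to install directly with pip install.
The commands are: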
pip install causal_conv1d
python setup.py install
However, this very often fails with errors such as:
Building wheel for causal-conv1d (setup.py) ... error
error: command '/usr/bin/gcc' failed with exit code 1
RuntimeError: Error compiling objects for extension
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects
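These failures occur while the CUDA extension is being compiled, so it can help to confirm first that the compilers are visible in the current shell. The sketch below is only a preflight check. missing_tools is a hypothetical helper, and the tool list is an assumption rather than an official requirement list.

```shell
# Print each required build tool that is not on PATH; return non-zero if any is missing.
# missing_tools is a hypothetical helper, and the tool list used below is an assumption.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

missing_tools gcc g++ nvcc || echo "install or expose the tools listed above before building" >&2
```

An empty result does not guarantee a successful build, but a missing nvcc or gcc explains many of the errors listed above.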
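One fix is to git clone the source and compile it.
Another is to download a prebuilt whl and install it locally, or to download the source and build it directly.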
Releases · Dao-AILab/causal-conv1d
From there, download the v1.2.0 whl of causal_conv1d,
and from
GitHub - state-spaces/mamba: Mamba SSM architecture
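download the mamba-ssm 1.2.0 source code.
Place both in the virtual environment's folder and open the mamba-1.2.0 folder.
Open setup.py
and edit it as follows.
First, comment out these three lines.
Then add these three lines.
Next, modify line 264,
changing it to the following,
so that ninja is not used for the build.
Save and exit.
Then install causal_conv1d: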
pip install causal_conv1d-1.2.0.post2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
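The wheel's file name encodes the environment it was built for, and every tag has to match the setup above: cu118 for CUDA 11.8, torch2.1 for PyTorch 2.1.x, and cp310 for Python 3.10. The sketch below decodes those tags. describe_wheel is a hypothetical helper, and it assumes the naming pattern of the file used here.

```shell
# Decode the build tags embedded in a causal-conv1d wheel file name.
# describe_wheel is a hypothetical helper; it assumes the naming pattern of the file used above.
describe_wheel() {
    name=$(basename "$1")
    cuda=$(printf '%s\n' "$name" | sed -n 's/.*+cu\([0-9]*\)torch.*/\1/p')
    torch=$(printf '%s\n' "$name" | sed -n 's/.*torch\([0-9.]*\)cxx11abi.*/\1/p')
    abi=$(printf '%s\n' "$name" | sed -n 's/.*cxx11abi\([A-Z]*\)-.*/\1/p')
    py=$(printf '%s\n' "$name" | sed -n 's/.*-cp\([0-9]*\)-cp[0-9]*-.*/\1/p')
    echo "cuda=$cuda torch=$torch cxx11abi=$abi python=cp$py"
}

describe_wheel causal_conv1d-1.2.0.post2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
# prints cuda=118 torch=2.1 cxx11abi=FALSE python=cp310
```

If any tag differs from the environment, pick the matching file from the releases page instead.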
The installation reports success.
To verify it, run the following in Python:
import causal_conv1d
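It appears to work, but many blog posts report an error at this step. If it fails, download the causal_conv1d source and build it instead.
For that build, go to
Anaconda3/envs/Vim/lib/python3.10/site-packages/torch/utils,
find cpp_extension.py, and edit it
to add 11.8 manually.
Next, build mamba-ssm.
The install command is: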
pip install . --no-cache-dir --verbose
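The build fails.
The error points to a CUDA version problem.
Running nvcc -V inside the Vim environment shows CUDA 11.1.
This is surprising at first. The likely cause is that switch-cuda.sh only changes the environment variables of the shell that sourced it, so a new shell session falls back to the default CUDA.
So, still inside the Vim environment, cd /usr/local/
and run source switch-cuda.sh 11.8, that is, the CUDA switching script written earlier.
Running nvcc -V again now shows CUDA 11.8.
Rebuild, and this time it works.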
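To avoid repeating the switch by hand in every new shell, conda can run it on activation: scripts placed in an environment's etc/conda/activate.d directory are sourced whenever that environment is activated. The sketch below writes such a hook. install_cuda_hook and the hook file name are hypothetical, and the script location /usr/local/switch-cuda.sh is the one used earlier.

```shell
# Write a conda activation hook that sources switch-cuda.sh for one environment.
# install_cuda_hook is a hypothetical helper; the hook file name is arbitrary.
install_cuda_hook() {
    env_prefix="$1"      # e.g. "$CONDA_PREFIX" of the activated Vim environment
    cuda_version="$2"    # e.g. 11.8
    hook_dir="$env_prefix/etc/conda/activate.d"
    mkdir -p "$hook_dir"
    printf 'source /usr/local/switch-cuda.sh %s\n' "$cuda_version" > "$hook_dir/switch-cuda.sh"
}

# Usage, from inside the activated environment:
#   install_cuda_hook "$CONDA_PREFIX" 11.8
```

The hook does not undo the switch on deactivation, so other environments in the same shell keep CUDA 11.8 until the script is sourced again with a different version.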
The build completes.