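Preface
These are notes on working with Docker images and containers.
Prerequisites: Docker, the GPU driver, and the other basic setup are already installed.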
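1. Installing an image
There are plenty of tutorials online, but few of them explain how to download an official image, so this section records how to pull one with Docker.
Official Docker Hub link: https://www.docker.com/products/docker-hub/
Click Explore Docker Hub, type nvidia/cuda into the search box, switch to the Tags tab, find a suitable image, and copy its name and tag.
Docker registry mirror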
docker.chenby.cn/
A Docker registry mirror is usually added to speed up downloads, and without one the pull may fail altogether. The complete pull command is therefore:
docker pull docker.chenby.cn/nvidia/cuda:11.1.1-cudnn8-devel-ubuntu20.04
Wait for the download to finish. The CUDA version in this command is fairly old, so the image should run on most machines as-is.
- Renaming an image
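docker tag old_image_name new_image_name
docker rmi old_image_name
docker tag does not copy the image: it adds a second name that points to the same image ID. Running docker rmi on the old name then removes only that name, and the image itself stays available under the new one.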
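The pull and rename steps above can be combined so that an image fetched through the mirror ends up under its upstream name. The following is a minimal sketch, not a definitive script: the mirror host and image tag are the ones used in this post, while `run`, `pull_and_retag`, and the `DRY_RUN` switch are hypothetical helpers introduced here only so the commands can be previewed before they are executed.

```shell
# Sketch: pull through a mirror, then give the image its upstream name.
# MIRROR and IMAGE come from the text above; DRY_RUN, run and
# pull_and_retag are illustrative names, not Docker features.
MIRROR="docker.chenby.cn"
IMAGE="nvidia/cuda:11.1.1-cudnn8-devel-ubuntu20.04"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

pull_and_retag() {
  run docker pull "$MIRROR/$IMAGE"          # download via the mirror
  run docker tag "$MIRROR/$IMAGE" "$IMAGE"  # add the upstream name to the same image ID
  run docker rmi "$MIRROR/$IMAGE"           # drop only the mirror-prefixed name
}

pull_and_retag
```

With the default `DRY_RUN=1` the script only prints the three commands. Setting `DRY_RUN=0` runs them for real, which assumes the mirror is reachable and the current user is allowed to talk to the Docker daemon.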
2. NVIDIA Container Toolkit (using the GPU from Docker)
- Add the GPG key and the stable package repository for the NVIDIA Container Toolkit:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
- Install the toolkit:
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
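After Docker restarts, it helps to confirm that a container can actually see the GPU before building anything larger. The sketch below assumes the CUDA image pulled in section 1 is still present under its mirror-prefixed name. `run`, `check_gpu`, and `DRY_RUN` are hypothetical helpers added for illustration.

```shell
# Sketch: confirm that containers can see the GPU once the toolkit is installed.
# IMAGE is the CUDA image pulled earlier in this post; DRY_RUN, run and
# check_gpu are illustrative names, not Docker features.
IMAGE="docker.chenby.cn/nvidia/cuda:11.1.1-cudnn8-devel-ubuntu20.04"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

check_gpu() {
  # --rm removes the throwaway container on exit; --gpus all exposes every GPU.
  run sudo docker run --rm --gpus all "$IMAGE" nvidia-smi
}

check_gpu
```

When run with `DRY_RUN=0`, a working setup should print the usual nvidia-smi table listing the host GPUs. An error about GPU support at this point usually means the toolkit or the Docker restart did not take effect.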
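3. Creating a container
Now go to an empty project and change into the directory that holds the Dockerfile. In this example that means switching to the docker_test directory on the command line and editing the Dockerfile to suit your needs.
The main thing to check is that the base image named in FROM is the one you expect. Look up the remaining Dockerfile details yourself (GitHub projects normally ship a working Dockerfile, so only the FROM image needs attention). A sample Dockerfile (YOLOv10):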
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Builds ultralytics/ultralytics:latest image on DockerHub https://hub.docker.com/r/ultralytics/ultralytics
# Image is CUDA-optimized for YOLOv8 single/multi-GPU training and inference
# Start FROM PyTorch image https://hub.docker.com/r/pytorch/pytorch or nvcr.io/nvidia/pytorch:23.03-py3
FROM pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
RUN pip install --no-cache nvidia-tensorrt --index-url https://pypi.ngc.nvidia.com
# Downloads to user config dir
ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \
    https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \
    /root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
    && apt install --no-install-recommends -y gcc git zip curl htop libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# Security updates
# https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
RUN apt upgrade --no-install-recommends -y openssl tar
# Create working directory
WORKDIR /usr/src/ultralytics
# Copy contents
# COPY . /usr/src/ultralytics # git permission issues inside container
RUN git clone https://github.com/ultralytics/ultralytics -b main /usr/src/ultralytics
ADD https://github.com/ultralytics/assets/releases/download/v8.1.0/yolov8n.pt /usr/src/ultralytics/
# Install pip packages
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache -e ".[export]" albumentations comet pycocotools
# Run exports to AutoInstall packages
# Edge TPU export fails the first time so is run twice here
RUN yolo export model=tmp/yolov8n.pt format=edgetpu imgsz=32 || yolo export model=tmp/yolov8n.pt format=edgetpu imgsz=32
RUN yolo export model=tmp/yolov8n.pt format=ncnn imgsz=32
# Requires <= Python 3.10, bug with paddlepaddle==2.5.0 https://github.com/PaddlePaddle/X2Paddle/issues/991
RUN pip install --no-cache "paddlepaddle>=2.6.0" x2paddle
# Fix error: `np.bool` was a deprecated alias for the builtin `bool` segmentation error in Tests
RUN pip install --no-cache numpy==1.23.5
# Remove exported models
RUN rm -rf tmp
# Set environment variables
ENV OMP_NUM_THREADS=1
# Avoid DDP error "MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library" https://github.com/pytorch/pytorch/issues/37377
ENV MKL_THREADING_LAYER=GNU
# Usage Examples -------------------------------------------------------------------------------------------------------
# Build and Push
# t=ultralytics/ultralytics:latest && sudo docker build -f docker/Dockerfile -t $t . && sudo docker push $t
# Pull and Run with access to all GPUs
# t=ultralytics/ultralytics:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all $t
# Pull and Run with access to GPUs 2 and 3 (inside container CUDA devices will appear as 0 and 1)
# t=ultralytics/ultralytics:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus '"device=2,3"' $t
# Pull and Run with local directory access
# t=ultralytics/ultralytics:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/datasets:/usr/src/datasets $t
# Kill all
# sudo docker kill $(sudo docker ps -q)
# Kill all image-based
# sudo docker kill $(sudo docker ps -qa --filter ancestor=ultralytics/ultralytics:latest)
# DockerHub tag update
# t=ultralytics/ultralytics:latest tnew=ultralytics/ultralytics:v6.2 && sudo docker pull $t && sudo docker tag $t $tnew && sudo docker push $tnew
# Clean up
# sudo docker system prune -a --volumes
# Update Ubuntu drivers
# https://www.maketecheasier.com/install-nvidia-drivers-ubuntu/
# DDP test
# python -m torch.distributed.run --nproc_per_node 2 --master_port 1 train.py --epochs 3
# GCP VM from Image
# docker.io/ultralytics/ultralytics:latest
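A container can only be created once the Dockerfile has been built into an image. The sketch below shows one way that build step might look. It assumes the file is named exactly `Dockerfile` (the default name Docker looks for) and lives in the docker_test directory mentioned above. The image name `docker_images_id` is the same placeholder used in the run command that follows, and `run`, `build_image`, and `DRY_RUN` are hypothetical helpers.

```shell
# Sketch: build the image from the Dockerfile in docker_test.
# IMAGE_NAME is a placeholder; DRY_RUN, run and build_image are
# illustrative names, not Docker features.
IMAGE_NAME="docker_images_id"
TAG="latest"
BUILD_DIR="docker_test"   # build context: the directory containing the Dockerfile

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

build_image() {
  # -t names the resulting image; the last argument is the build context.
  run sudo docker build -t "$IMAGE_NAME:$TAG" "$BUILD_DIR"
  # List the image afterwards to confirm that it exists locally.
  run sudo docker images "$IMAGE_NAME"
}

build_image
```

If the file has a different name, such as a lowercase dockerfile, `docker build -f path/to/file` selects it explicitly.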
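- Creating the container
Map the container's SSH port 22 to port 2222 on the host
Mount the host directory /local/path at /usr/src/ultralytics inside the container
--name: the container name, which you can choose freely
docker_images_id: the image name; change it to match the image you built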
sudo docker run --gpus all -it -p 2222:22 --name container_name -v /local/path:/usr/src/ultralytics docker_images_id:latest
At this point the Docker image has been built and a container has been created from it, with the local directory mounted into the container, so we can enter the container and develop inside it.
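Getting back into the container later might look like the sketch below. `container_name` is the placeholder used in the run command above, and bash is assumed to be available in the image. `run`, `enter_container`, and `DRY_RUN` are hypothetical helpers.

```shell
# Sketch: restart the container if needed and open a shell inside it.
# CONTAINER is a placeholder; DRY_RUN, run and enter_container are
# illustrative names, not Docker features.
CONTAINER="container_name"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

enter_container() {
  run sudo docker start "$CONTAINER"          # bring the container back up if it was stopped
  run sudo docker exec -it "$CONTAINER" bash  # interactive shell inside the container
}

enter_container
```

The `-p 2222:22` mapping only helps once an SSH server is installed and running in the container. The sample Dockerfile above does not install one, so `docker exec` is the simpler way in until that is set up.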