nvidia-docker

The project describes itself as: Build and run Docker containers leveraging NVIDIA GPUs. It is an open-source collection of commands created to make it easier to provide GPU services on NVIDIA chips.

Project page: https://github.com/NVIDIA/nvidia-docker

docker vs nvidia-docker

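Docker is hardware-agnostic and platform-agnostic, but using an NVIDIA GPU requires the NVIDIA driver. The obvious approach is to install the driver inside the container, but the driver version inside the container must then match the version on the host, which breaks Docker's hardware and platform independence.
nvidia-docker's current solution is to leave the NVIDIA driver out of the image and instead, when the container starts, mount the driver files or specify the particular hardware devices.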
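
To make the mounting approach concrete, here is a rough sketch of the kind of docker run arguments such a launch involves. The helper name gpu_run_args, the mount point /usr/local/nvidia, and the device node names are assumptions for illustration and are not stated in this post; the volume name format nvidia_driver_<version> follows the error message quoted later in this post. The function only prints arguments and never starts a container.

```shell
#!/bin/sh
# Print docker-run arguments that expose the host driver and GPU devices.
# Assumptions (hypothetical helper): driver files live in a named volume
# "nvidia_driver_<version>" and GPUs appear as /dev/nvidia0, /dev/nvidia1, ...
gpu_run_args() {
    driver_version=$1   # e.g. 390.30; must match the host driver
    gpu_count=$2        # number of GPUs to expose

    # Driver libraries come from the host, mounted read-only.
    args="--volume-driver=nvidia-docker"
    args="$args --volume=nvidia_driver_${driver_version}:/usr/local/nvidia:ro"

    # Control devices shared by all GPUs.
    args="$args --device=/dev/nvidiactl --device=/dev/nvidia-uvm"

    # One device node per GPU.
    i=0
    while [ "$i" -lt "$gpu_count" ]; do
        args="$args --device=/dev/nvidia${i}"
        i=$((i + 1))
    done

    echo "$args"
}

# Dry run: show the command instead of executing it.
echo "docker run $(gpu_run_args 390.30 1) <image> nvidia-smi"
```

Because the image itself carries no driver, the same image can in principle be started on hosts with different driver versions; only the arguments generated at launch time change.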

Prerequisites

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

This sets the distribution variable used in the next command; on CentOS 7 its value is centos7.
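
The value comes from sourcing /etc/os-release and concatenating its ID and VERSION_ID fields. A minimal sketch of that step follows; the function name distribution_from and the sample file contents are made up for illustration, and the file path is a parameter only so the logic can be tried without touching the real system file.

```shell
#!/bin/sh
# Derive "<ID><VERSION_ID>" from an os-release style file.
# The path defaults to /etc/os-release but can point at a sample file.
distribution_from() {
    os_release_file=${1:-/etc/os-release}
    # Source in a subshell so ID and VERSION_ID do not leak into the caller.
    ( . "$os_release_file" && echo "${ID}${VERSION_ID}" )
}

# Sample file mimicking CentOS 7 (contents are illustrative).
sample=$(mktemp)
printf 'ID="centos"\nVERSION_ID="7"\n' > "$sample"
distribution_from "$sample"   # prints: centos7
rm -f "$sample"
```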

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
sudo tee /etc/yum.repos.d/nvidia-docker.repo

The command prints the following:

[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

[nvidia-container-runtime]
name=nvidia-container-runtime
baseurl=https://nvidia.github.io/nvidia-container-runtime/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/nvidia-container-runtime/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

[nvidia-docker]
name=nvidia-docker
baseurl=https://nvidia.github.io/nvidia-docker/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/nvidia-docker/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
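
As a quick sanity check after writing the repo file, one could list the section names it defines and compare them with the three sections shown above. This is only a sketch: the helper name repo_sections is hypothetical, and it assumes each section header sits alone on its line.

```shell
#!/bin/sh
# List the section names ([name] headers) found in a yum .repo file.
repo_sections() {
    sed -n 's/^\[\(.*\)\]$/\1/p' "$1"
}

repo_file=/etc/yum.repos.d/nvidia-docker.repo
if [ -r "$repo_file" ]; then
    repo_sections "$repo_file"
else
    echo "repo file not found: $repo_file" >&2
fi
```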

Installation commands

yum search nvidia-docker

The command prints the following:

Loaded plugins: fastestmirror, langpacks
libnvidia-container/x86_64/signature                                                                                                                                                                                |  455 B  00:00:00     
Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <cudatools@nvidia.com>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/libnvidia-container/gpgkey
Is this ok [y/N]: y
libnvidia-container/x86_64/signature                                                                                                                                                                                | 2.0 kB  00:00:07 !!! 
nvidia-container-runtime/x86_64/signature                                                                                                                                                                           |  455 B  00:00:00     
Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <cudatools@nvidia.com>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/nvidia-container-runtime/gpgkey
Is this ok [y/N]: y
nvidia-container-runtime/x86_64/signature                                                                                                                                                                           | 2.0 kB  00:00:02 !!! 
nvidia-docker/x86_64/signature                                                                                                                                                                                      |  455 B  00:00:00     
Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <cudatools@nvidia.com>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/nvidia-docker/gpgkey
Is this ok [y/N]: y
nvidia-docker/x86_64/signature                                                                                                                                                                                      | 2.0 kB  00:00:04 !!! 
(1/3): libnvidia-container/x86_64/primary                                                                                                                                                                           | 3.5 kB  00:00:01     
(2/3): nvidia-docker/x86_64/primary                                                                                                                                                                                 | 4.0 kB  00:00:02     
(3/3): nvidia-container-runtime/x86_64/primary                                                                                                                                                                      | 4.3 kB  00:00:08     
Loading mirror speeds from cached hostfile
 * base: mirrors.tuna.tsinghua.edu.cn
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: mirrors.tuna.tsinghua.edu.cn
 * updates: mirrors.aliyun.com
libnvidia-container                                                                                                                                                                                                                  20/20
nvidia-container-runtime                                                                                                                                                                                                             29/29
nvidia-docker                                                                                                                                                                                                                        29/29
======================================================================================================= N/S matched: nvidia-docker ========================================================================================================
nvidia-docker2.noarch : nvidia-docker CLI wrapper
nvidia-docker.x86_64 : NVIDIA Docker container tools

  Name and summary matches only, use "search all" for everything.
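
The search output mixes key-import prompts, mirror information, and the actual matches. A small sketch for pulling just the package names out of the result lines is shown below; the helper name search_package_names is hypothetical, and it assumes result lines have the form "<name>.<arch> : <summary>" as in the output above.

```shell
#!/bin/sh
# Extract package names from `yum search` result lines of the form
# "<name>.<arch> : <summary>", read from standard input.
search_package_names() {
    awk -F' : ' 'NF >= 2 && $1 ~ /^[^ ]+\.[^ .]+$/ {
        sub(/\.[^.]+$/, "", $1)   # drop the trailing ".<arch>"
        print $1
    }'
}

# Sample input copied from the search output above.
printf '%s\n' \
    'nvidia-docker2.noarch : nvidia-docker CLI wrapper' \
    'nvidia-docker.x86_64 : NVIDIA Docker container tools' |
    search_package_names
```

On a live system the same function could be fed from yum search nvidia-docker through a pipe.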

yum install nvidia-docker

Start the service

systemctl start nvidia-docker
systemctl status nvidia-docker

If the nvidia-docker service is not running, you are likely to hit an error like this:

docker: Error response from daemon: create nvidia_driver_390.30: create nvidia_driver_390.30: Error looking up volume plugin nvidia-docker: legacy plugin: plugin not found.
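
A wrapper script could recognize this specific failure and point at the likely fix. The sketch below only matches the error text quoted above; the function name is_plugin_missing_error and the hint message are hypothetical.

```shell
#!/bin/sh
# Return success if the given docker error text indicates that the
# nvidia-docker volume plugin could not be found.
is_plugin_missing_error() {
    case $1 in
        *"Error looking up volume plugin nvidia-docker"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Error text copied from the message above.
err='docker: Error response from daemon: create nvidia_driver_390.30: create nvidia_driver_390.30: Error looking up volume plugin nvidia-docker: legacy plugin: plugin not found.'

if is_plugin_missing_error "$err"; then
    echo "nvidia-docker plugin not reachable; try: systemctl start nvidia-docker"
fi
```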

Summary

References

[1] Installation guide: https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)
[2] Installation prerequisites: https://nvidia.github.io/nvidia-docker/
