Docker installation steps

To install Docker, follow the official documentation: https://docs.docker.com/engine/install/ubuntu/
In step 3, be sure to select the ARM (arm64) repository.

Installing nvidia-container-toolkit

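nvidia-container-toolkit can be installed directly with apt-get. If apt reports that it cannot locate the package, the corresponding source has not been added. The Jetson already ships with that source, but it is commented out. The file is /etc/apt/sources.list.d/nvidia-l4t-apt-source.list; uncomment the entries in it.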
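If the entries in that file are commented out, they can be re-enabled with a single `sed` edit. This is a minimal sketch that assumes each disabled entry is an ordinary `deb ...` line prefixed with `#`. The function name `enable_l4t_sources` is hypothetical, and a `.bak` backup is kept next to the file.

```shell
# Hypothetical helper: uncomment every "# deb ..." line in an apt source list.
# Assumes disabled entries look like "# deb <url> <suite> <component>".
enable_l4t_sources() {
    list_file="$1"
    # -i.bak edits the file in place and keeps a backup copy as <file>.bak
    sed -i.bak -E 's/^[[:space:]]*#+[[:space:]]*(deb[[:space:]])/\1/' "$list_file"
}
```

From a root shell (for example after `sudo -s`), call `enable_l4t_sources /etc/apt/sources.list.d/nvidia-l4t-apt-source.list` before running `apt-get update`. Ordinary comment lines are left untouched.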
Run the commands:

sudo apt-get update
sudo apt-get install nvidia-container-toolkit

Then start Docker.

Running the program then produced the following errors:

Error: Can't initialize nvrm channel
Error: Can't initialize nvrm channel
Couldn't create ddkvic Session: Cannot allocate memory
nvbuf_utils: Could not create Default NvBufferSession

Errors during Docker installation

Jan 08 10:42:07 znv-desktop systemd[1]: Dependency failed for Docker Application Container Engine.
Jan 08 10:42:07 znv-desktop systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Jan 08 10:45:05 znv-desktop systemd[1]: Dependency failed for Docker Application Container Engine.
Jan 08 10:45:05 znv-desktop systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Jan 08 10:47:07 znv-desktop systemd[1]: Dependency failed for Docker Application Container Engine.
Jan 08 10:47:07 znv-desktop systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Jan 08 10:50:00 znv-desktop systemd[1]: Dependency failed for Docker Application Container Engine.
Jan 08 10:50:00 znv-desktop systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Jan 08 10:52:30 znv-desktop systemd[1]: Dependency failed for Docker Application Container Engine.
Jan 08 10:52:30 znv-desktop systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.

This looked like a failure to start the Docker service. Running systemctl start docker.service produced the following log:

Jan 08 10:56:23 znv-desktop kernel: Extcon AUX1(HDMI) disable
Jan 08 10:56:23 znv-desktop kernel: tegra_nvdisp_handle_pd_disable: Powergated Head1 pd
Jan 08 10:56:23 znv-desktop kernel: tegra_nvdisp_handle_pd_disable: Powergated Head0 pd
Jan 08 10:56:59 znv-desktop systemd[1]: docker.socket: Socket service docker.service already active, refusing.
Jan 08 10:56:59 znv-desktop systemd[1]: Failed to listen on Docker Socket for the API.

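The Docker service was already running, because I had not stopped it before reinstalling Docker. Stopping the Docker service first and then reinstalling fixed the problem.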
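That fix can be sketched as a small function. Stopping `docker.socket` in addition to `docker.service` is my assumption, based on the socket unit named in the log, and the function name `stop_docker_units` is hypothetical.

```shell
# Hypothetical helper: stop Docker before reinstalling it.
# docker.socket is stopped first so that socket activation cannot restart the service.
stop_docker_units() {
    sudo systemctl stop docker.socket
    sudo systemctl stop docker.service
}
```

After calling `stop_docker_units`, reinstall Docker and start it again with `sudo systemctl start docker.service`.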

Starting Docker

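At first I used the stock Ubuntu image provided by Docker, but my code failed to access the GPU. I realized the container still needs the Jetson hardware drivers, so I looked for NVIDIA's official images instead, which saves a lot of trouble.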

https://ngc.nvidia.com/catalog/containers/
Search for jetson and choose the image you need; I downloaded the Base version (l4t-base).
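Pulling the image from the command line can look like this. In this minimal sketch the repository `nvcr.io/nvidia/l4t-base` and the tag `r32.4.4` are taken from the `docker images` listing later in this post. The function name `pull_l4t_base` is hypothetical, and the tag generally needs to match the L4T release installed on the device.

```shell
# Hypothetical helper: pull NVIDIA's L4T base image for a given tag.
pull_l4t_base() {
    tag="${1:-r32.4.4}"   # default tag is the one shown in this post
    docker pull "nvcr.io/nvidia/l4t-base:${tag}"
}
```

Call `pull_l4t_base` for the default tag, or pass another tag as the first argument.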
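After creating the container, the code still could not access the hardware. It turned out that the nvidia runtime has to be registered with Docker.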

sudo apt-get install nvidia-container-runtime
sudo systemctl edit docker.service

# This opens an override file; enter the following content
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --host=fd:// --add-runtime=nvidia=/usr/bin/nvidia-container-runtime

# Restart the Docker service
sudo systemctl daemon-reload
sudo systemctl restart docker
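To confirm that the override took effect, the runtime list reported by `docker info` can be checked. This is a minimal sketch that assumes `docker info` prints a `Runtimes:` line listing the registered runtimes; the function name `has_nvidia_runtime` is hypothetical.

```shell
# Hypothetical helper: read `docker info` output on stdin and succeed
# only if the "Runtimes:" line lists a runtime named nvidia.
has_nvidia_runtime() {
    grep -E '^[[:space:]]*Runtimes:' | grep -qw 'nvidia'
}
```

Usage: `docker info | has_nvidia_runtime && echo "nvidia runtime registered"`.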

After changing the Docker service options, starting the Docker service failed:

$ systemctl start docker.service
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.


$ systemctl status docker.service
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2021-01-21 09:57:07 CST; 5s ago
     Docs: https://docs.docker.com
  Process: 7628 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE)
 Main PID: 7628 (code=exited, status=1/FAILURE)

Jan 21 09:57:07 xxx-desktop systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jan 21 09:57:07 xxx-desktop systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Jan 21 09:57:07 xxx-desktop systemd[1]: Stopped Docker Application Container Engine.
Jan 21 09:57:07 xxx-desktop systemd[1]: docker.service: Start request repeated too quickly.
Jan 21 09:57:07 xxx-desktop systemd[1]: docker.service: Failed with result 'exit-code'.
Jan 21 09:57:07 xxx-desktop systemd[1]: Failed to start Docker Application Container Engine.

I spent a whole day trying all sorts of fixes, and in the end a simple reboot solved it. Absolutely infuriating.
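Before resorting to a reboot, a few standard systemd commands can show why the service failed. The sketch below is only a diagnostic aid under my own assumptions, not the fix that worked here (that was the reboot); the function name `inspect_docker_failure` is hypothetical.

```shell
# Hypothetical helper: collect the usual systemd diagnostics for a failed docker.service.
inspect_docker_failure() {
    # Show the unit file together with any drop-in override created by `systemctl edit`
    systemctl cat docker.service
    # Print the most recent log lines from the Docker daemon
    sudo journalctl -u docker.service --no-pager -n 50
    # Clear the failed state and the start rate limit, then try again
    sudo systemctl reset-failed docker.service
    sudo systemctl start docker.service
}
```

The journal output usually contains the actual dockerd error, which the `systemctl status` summary does not show.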

# List the pulled images
$ docker images
REPOSITORY                     TAG                 IMAGE ID            CREATED             SIZE
nvcr.io/nvidia/l4t-base        r32.4.4             10faffedd5fa        3 months ago        634MB
# Create the Docker container
docker run -it -d --runtime nvidia --privileged --name=jetson_alg -p 127.0.0.1:9010:9010 -v /path/to/host/:/data/ <jetson image ID>

After entering the container again, the algorithm code finally ran correctly.
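Re-entering the container can look like this. This minimal sketch assumes the container name `jetson_alg` from the `docker run` command above and that the image provides `/bin/bash`; the function name `enter_container` is hypothetical.

```shell
# Hypothetical helper: start the container if it is stopped, then open a shell in it.
enter_container() {
    name="${1:-jetson_alg}"   # container name used in this post
    docker start "$name"
    docker exec -it "$name" /bin/bash
}
```

Calling `enter_container` with no argument targets `jetson_alg`; pass another container name to override it.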
