Installing TensorFlow GPU with Docker: A Hands-On Guide
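Background
AI projects are springing up everywhere, and DevOps practice keeps maturing. Nearly every high-profile open source project now supports two environments: Docker and Linux. This article explains how to run the TensorFlow Docker image on a CentOS 7 machine that has a GPU installed. In China, only engineers at a handful of large companies have access to this kind of hardware. If you do, it is worth trying out, since AI is one more skill to add to your toolbox.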
Installing nvidia-docker
nvidia-docker is a thin wrapper around docker, provided by NVIDIA, that lets containers use NVIDIA GPUs.
For the installation steps, see:
https://github.com/NVIDIA/nvidia-docker
After installation, the following nvidia commands are available (listed here with shell tab completion):
[root@~]# nvidia-
nvidia-bug-report.sh nvidia-debugdump nvidia-installer nvidia-settings nvidia-xconfig
nvidia-cuda-mps-control nvidia-docker nvidia-modprobe nvidia-smi
nvidia-cuda-mps-server nvidia-docker-plugin nvidia-persistenced nvidia-uninstall
If you see the error below, the nvidia-docker service has not been started:
[root@ourui]# nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu
docker: Error response from daemon: create nvidia_driver_367.48: create nvidia_driver_367.48: Error looking up volume plugin nvidia-docker: legacy plugin: plugin not found.
See 'docker run --help'.
Use the following commands to check whether the nvidia-docker service is running and, if it is not, start it:
[root@ourui]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Docs: https://github.com/NVIDIA/nvidia-docker/wiki
[root@ourui]# systemctl start nvidia-docker
[root@ourui]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2017-03-27 10:39:16 CST; 2s ago
Docs: https://github.com/NVIDIA/nvidia-docker/wiki
Process: 51649 ExecStartPost=/bin/sh -c /bin/echo unix://$SOCK_DIR/nvidia-docker.sock > $SPEC_FILE (code=exited, status=0/SUCCESS)
Process: 51644 ExecStartPost=/bin/sh -c /bin/mkdir -p $( dirname $SPEC_FILE ) (code=exited, status=0/SUCCESS)
Main PID: 51643 (nvidia-docker-p)
Memory: 13.9M
CGroup: /system.slice/nvidia-docker.service
└─51643 /usr/bin/nvidia-docker-plugin -s /var/lib/nvidia-docker
Mar 27 10:39:16 ctum2e1302005.idc.wanda-group.net systemd[1]: Starting NVIDIA Docker plugin...
Mar 27 10:39:16 ctum2e1302005.idc.wanda-group.net systemd[1]: Started NVIDIA Docker plugin.
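At this point the basic nvidia-docker environment is ready. Note that NVIDIA does not ship nvidia-docker builds for the very latest Docker release, so if you need to test the newest Docker release you will have to use another approach.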
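The check-then-start sequence shown in the systemctl session above can be scripted. What follows is a minimal sketch, not part of the original setup: the function name `ensure_service_running` and its injectable `run` parameter are hypothetical, and it assumes a systemd host where `systemctl` is on the PATH and the caller has root privileges.

```python
import subprocess


def ensure_service_running(name, run=subprocess.run):
    """Start the systemd unit `name` unless it is already active.

    Returns True if a start was issued, False if the unit was already active.
    `run` defaults to subprocess.run and can be swapped out for testing.
    """
    # `systemctl is-active --quiet` exits with 0 only when the unit is active.
    status = run(["systemctl", "is-active", "--quiet", name])
    if status.returncode == 0:
        return False
    # check=True makes subprocess.run raise CalledProcessError if the start fails.
    run(["systemctl", "start", name], check=True)
    return True
```

Calling `ensure_service_running("nvidia-docker")` should have the same effect as the manual `systemctl start nvidia-docker` above. The status output also reports the unit as "disabled", which means it will not come back after a reboot; running `systemctl enable nvidia-docker` once is the usual way to change that.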
Downloading the Docker image
The TensorFlow community publishes a set of images on Docker Hub:
https://hub.docker.com/r/tensorflow/tensorflow/
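For reasons we all know, pulling images from Docker Hub inside China is sometimes a problem. It reminds me of the line: it was the best of times, it was the worst of times. With a mortgage to pay, we had better find a workaround!
There are many Docker registries hosted in China that you can use directly, and several providers also offer accelerators, that is, registry mirrors (you know what "acceleration" means here). Below we use the Alibaba Cloud accelerator: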
https://yq.aliyun.com/articles/29941
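An accelerator is normally configured by pointing the Docker daemon at a registry mirror. The sketch below shows the shape of that setting under stated assumptions: it assumes your Docker version reads /etc/docker/daemon.json and honors the "registry-mirrors" key (older setups may pass the `--registry-mirror` daemon flag instead), the helper name `with_registry_mirror` is hypothetical, and the mirror address is a placeholder, not a real Alibaba Cloud endpoint.

```python
import json


def with_registry_mirror(config, mirror_url):
    """Return a copy of a Docker daemon config dict with mirror_url listed
    under "registry-mirrors". The input dict is left unmodified."""
    updated = dict(config)
    mirrors = list(updated.get("registry-mirrors", []))
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    updated["registry-mirrors"] = mirrors
    return updated


# Placeholder address (assumption): replace it with the mirror URL
# assigned to your own account by the accelerator provider.
EXAMPLE_MIRROR = "https://your-id.mirror.example.com"

# Print the JSON that would go into the daemon configuration file.
print(json.dumps(with_registry_mirror({}, EXAMPLE_MIRROR), indent=2))
```

The printed JSON is what the daemon configuration file would contain in the simplest case. The Docker daemon has to be restarted before the mirror takes effect.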
Once the accelerator is configured, you can pull the Docker image directly:
nvidia-docker pull tensorflow/tensorflow:latest-gpu
Starting the container
[root@ourui]# nvidia-docker run -it -d -p 8888:8888 tensorflow/tensorflow:latest-gpu
69fede4460082f3e4aa847fc34ac0f58e797dc44b10d65643a70d2a1e7e4ba03
[root@ourui]#
[root@ourui]# nvidia-docker logs 69fede4460082f3e4aa847fc34ac0f58e797dc44b10d65643a70d2a1e7e4ba03
[I 02:45:08.016 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[W 02:45:08.031 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 02:45:08.037 NotebookApp] Serving notebooks from local directory: /notebooks
[I 02:45:08.037 NotebookApp] 0 active kernels
[I 02:45:08.037 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8888/?token=f1d1717e2fdbf8c1807f5017315396be05a6b95310d87cb9
[I 02:45:08.038 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 02:45:08.038 NotebookApp]
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://localhost:8888/?token=f1d1717e2fdbf8c1807f5017315396be05a6b95310d87cb9
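The login URL in the log points at localhost, so to reach the notebook from another machine the host part has to be swapped for the server's address. Here is a minimal sketch of pulling the token out of the log text. The function name `notebook_url` is hypothetical, and the pattern only assumes the log format shown above.

```python
import re

# Matches the login URL printed by the notebook server and captures
# the port and the hexadecimal token.
TOKEN_URL = re.compile(r"http://localhost:(\d+)/\?token=([0-9a-f]+)")


def notebook_url(log_text, host):
    """Build the notebook URL for `host` from `docker logs` output.

    Returns None if no login URL is found in the text.
    """
    match = TOKEN_URL.search(log_text)
    if match is None:
        return None
    # The port in the log is the container port; this is only the right
    # host port when it is published unchanged, as with -p 8888:8888.
    port, token = match.groups()
    return "http://{}:{}/?token={}".format(host, port, token)
```

When capturing the log programmatically, it is safest to collect both stdout and stderr, since the notebook server may write these lines to stderr.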
Testing
Open the notebook in a browser, replacing ip with the address of the host:
http://ip:8888/?token=f1d1717e2fdbf8c1807f5017315396be05a6b95310d87cb9
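Once the notebook opens, it is worth confirming that TensorFlow inside the container actually sees the GPU. The cell below is a sketch under assumptions: the helper `gpu_device_names` is hypothetical, and it relies on `device_lib.list_local_devices()` from the TensorFlow package, whose entries carry `name` and `device_type` fields.

```python
def gpu_device_names(devices):
    """Return the names of the entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


try:
    # device_lib ships with TensorFlow and lists the devices it can use.
    from tensorflow.python.client import device_lib
except ImportError:
    print("TensorFlow is not importable here; run this in the container's notebook.")
else:
    # An empty list means TensorFlow found no GPU in this environment.
    print(gpu_device_names(device_lib.list_local_devices()))
```

An empty list suggests that the container was started without GPU access, for example with plain docker instead of nvidia-docker, or that the driver is not visible inside the container.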