Ubuntu 18: Fixing "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver."
After the server was rebooted, running nvidia-smi produced the following error:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Running nvcc -V gives the following output:
k8s@master:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
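A working nvcc only shows that the CUDA toolkit is installed; nvidia-smi additionally needs the nvidia kernel module to be loaded. The following is a minimal sketch for checking that, assuming the loaded modules are listed in /proc/modules with the module name at the start of each line. The helper name nvidia_module_loaded is hypothetical and not part of any NVIDIA tool.

```bash
#!/usr/bin/env bash
# Succeed if a module list in /proc/modules format contains the nvidia module.
# nvidia_module_loaded is a hypothetical helper name used for illustration.
nvidia_module_loaded() {
    local modules_file="${1:-/proc/modules}"
    # Each line starts with the module name followed by a space, so this
    # matches "nvidia" but not "nvidia_uvm" or "nvidia_modeset".
    grep -q '^nvidia ' "$modules_file" 2>/dev/null
}

if nvidia_module_loaded; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is not loaded"
fi
```

If the module is reported as not loaded, the driver was probably not built or not loaded for the running kernel, which is the situation the steps below address.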
Solution:
- Install dkms:
sudo apt-get install dkms
- List /usr/src/ to find the installed NVIDIA driver version (nvidia-410.48 on the last line):
k8s@master:~$ ll /usr/src/
total 36
drwxr-xr-x  9 root root 4096 Dec 14 06:40 ./
drwxr-xr-x 12 root root 4096 Dec 27 15:46 ../
drwxr-xr-x 27 root root 4096 Feb 26  2019 linux-headers-4.15.0-45/
drwxr-xr-x  8 root root 4096 Feb 26  2019 linux-headers-4.15.0-45-generic/
drwxr-xr-x 27 root root 4096 Apr  3  2019 linux-headers-4.15.0-47/
drwxr-xr-x  8 root root 4096 Apr  3  2019 linux-headers-4.15.0-47-generic/
drwxr-xr-x 25 root root 4096 Dec 13 06:15 linux-headers-4.15.0-72/
drwxr-xr-x  8 root root 4096 Dec 13 06:15 linux-headers-4.15.0-72-generic/
drwxr-xr-x  7 root root 4096 Feb 26  2019 nvidia-410.48/
- Build and install the driver module with dkms (replace the value after -v with your own NVIDIA driver version):
sudo dkms install -m nvidia -v 410.48
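The version passed to -v can also be read from the directory name instead of being typed by hand. This is a minimal sketch, assuming the driver sources live in a directory named nvidia-<version> under /usr/src as in the listing above. The helper name nvidia_src_version is hypothetical, and the script only prints the dkms command rather than running it.

```bash
#!/usr/bin/env bash
# Print the version suffix of the nvidia-<version> directory under a source root.
# nvidia_src_version is a hypothetical helper name, not part of dkms.
nvidia_src_version() {
    local src_root="${1:-/usr/src}"
    local dir
    # If several nvidia-* directories exist, sort -V puts the highest version last.
    dir=$(find "$src_root" -maxdepth 1 -type d -name 'nvidia-*' 2>/dev/null | sort -V | tail -n 1)
    [ -n "$dir" ] || return 1
    # Strip the leading "nvidia-" so only the version remains.
    basename "$dir" | sed 's/^nvidia-//'
}

if version=$(nvidia_src_version /usr/src); then
    # Only print the command; run it yourself after checking the version.
    echo "sudo dkms install -m nvidia -v $version"
else
    echo "no nvidia-* directory found under /usr/src" >&2
fi
```

Printing the command instead of executing it keeps the sketch safe to run on any machine; check that the detected version matches the driver you expect before running the printed line.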
The problem is now resolved. Running nvidia-smi gives the following output:
k8s@master:~$ nvidia-smi
Sun Jan 5 21:10:18 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:0B:00.0 Off |                    0 |
| N/A   33C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:0C:00.0 Off |                    0 |
| N/A   25C    P8    30W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   30C    P8    25W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:8B:00.0 Off |                    0 |
| N/A   25C    P8    29W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
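To confirm that the module is registered for the kernel that is actually running, the output of dkms status can be compared with uname -r. One plausible reason for the original error (an assumption; the post does not state the cause) is that the reboot brought up a kernel for which the module had not been built. The sketch below assumes dkms status prints one line per module and kernel pair, containing the kernel release and the word "installed". The helper name dkms_has_nvidia_for is hypothetical.

```bash
#!/usr/bin/env bash
# Succeed if the given `dkms status` text lists an installed nvidia module
# for the given kernel release.
# dkms_has_nvidia_for is a hypothetical helper name used for illustration.
dkms_has_nvidia_for() {
    local kernel="$1" status_text="$2"
    # Keep nvidia lines, then lines for this kernel, then require "installed".
    printf '%s\n' "$status_text" | grep -i 'nvidia' | grep -F -- "$kernel" | grep -q 'installed'
}

kernel=$(uname -r)
# If dkms is not installed, treat its output as empty instead of failing.
status=$(dkms status 2>/dev/null || true)
if dkms_has_nvidia_for "$kernel" "$status"; then
    echo "nvidia module is installed for kernel $kernel"
else
    echo "no installed nvidia module reported for kernel $kernel"
fi
```

If no installed module is reported for the running kernel, repeating the dkms install step above for that kernel is the natural next thing to try.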