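I. What you need:

1. An Nvidia graphics card (the author used a Quadro P1000 4G)

2. A computer with CentOS 8 installed (other versions are untested, but probably behave much the same)

3. A working network connection, since several packages have to be downloaded

4. Do not attempt this in a virtual machine.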
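
Before going further it can help to confirm that the system actually sees the card. The following is a minimal sketch, assuming the pciutils package (which provides lspci) is installed; the function name has_nvidia_gpu is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the lspci-style text on stdin
# mentions an NVIDIA device (case-insensitive match).
has_nvidia_gpu() {
    grep -qi 'nvidia'
}

# Only query the hardware when lspci is available on this machine.
if command -v lspci >/dev/null 2>&1; then
    if lspci | has_nvidia_gpu; then
        echo "NVIDIA device found"
    else
        echo "No NVIDIA device is visible to this system"
    fi
fi
```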

 

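II. Preparation

1. Check the machine and install the required packages. The ones the author found necessary are listed below. Installing software normally requires root privileges, so su to the root account first.

2. Use "nvidia-detect" to check the driver situation on this machine

# nvidia-detect -v

If the system reports that the command does not exist, install it with the following command: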

rpm -Uvh http://mirror.rackspace.com/elrepo/elrepo/el7/x86_64/RPMS/nvidia-detect-440.36-1.el7.elrepo.x86_64.rpm

If

3. Required software and development packages

<1> kernel-devel

yum install kernel-devel

<2> elfutils-libelf-devel

yum install elfutils-libelf-devel

<3> gcc is required

yum install gcc
yum install gcc-c++
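
The NVIDIA installer compiles a kernel module, so the installed kernel-devel headers need to belong to the kernel that is actually running. The sketch below checks for that, assuming the headers live under /usr/src/kernels/<release>, which is where CentOS's kernel-devel package places them; the function name has_kernel_headers is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if kernel headers for release $1 exist
# under directory $2 (default: /usr/src/kernels, assumed CentOS layout).
has_kernel_headers() {
    release=$1
    base=${2:-/usr/src/kernels}
    [ -d "$base/$release" ]
}

if has_kernel_headers "$(uname -r)"; then
    echo "kernel-devel matches the running kernel $(uname -r)"
else
    echo "No headers found for $(uname -r)"
fi
```

If the check fails, one likely cause is that yum installed kernel-devel for a newer kernel than the one currently booted; updating the kernel and rebooting, or installing the kernel-devel build for the running release, should bring the two back in line.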
 

4. Download the driver to install. The author chose the NVIDIA-Linux-x86_64-418.88.run installer.

wget https://cn.download.nvidia.cn/XFree86/Linux-x86_64/418.88/NVIDIA-Linux-x86_64-418.88.run

chmod a+x ./NVIDIA-Linux-x86_64-418.88.run
./NVIDIA-Linux-x86_64-418.88.run

reboot now

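One point deserves emphasis: the 418.88 in NVIDIA-Linux-x86_64-418.88.run is the driver version, and it has to be a release that supports your running kernel. The number is not derived from the kernel version (418 resembling kernel 4.18 is a coincidence); what matters is that the driver is recent enough to build against your kernel, as the 390.87 failure in section III shows.

For reference, the author's kernel packages are as follows: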

# yum list | grep kernel
kernel.x86_64                                              4.18.0-80.el8                                           @anaconda
kernel.x86_64                                              4.18.0-147.5.1.el8_1                               @BaseOS   
kernel-core.x86_64                                    4.18.0-80.el8                                            @anaconda
kernel-core.x86_64                                    4.18.0-147.5.1.el8_1                               @BaseOS   
kernel-devel.x86_64                                  4.18.0-147.5.1.el8_1                               @BaseOS   

 

After the reboot, check whether the installation succeeded.

# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.88       Driver Version: 418.88       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P1000        Off  | 00000000:01:00.0  On |                  N/A |
| 34%   50C    P0    N/A /  N/A |    159MiB /  4039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2866      G   /usr/libexec/Xorg                             64MiB |
|    0      2974      G   /usr/bin/gnome-shell                          90MiB |
|    0      4014      G   /usr/lib64/firefox/firefox                     1MiB |
+-----------------------------------------------------------------------------+
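
The same check can be scripted. The sketch below pulls the version out of the installer file name and compares it with what nvidia-smi reports; the function name driver_version_from_installer is made up for illustration, and the file name is the one used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the driver version embedded in an installer
# file name, e.g. NVIDIA-Linux-x86_64-418.88.run -> 418.88
driver_version_from_installer() {
    name=${1##*/}                       # drop any leading directory
    name=${name#NVIDIA-Linux-x86_64-}   # drop the fixed prefix
    printf '%s\n' "${name%.run}"        # drop the .run suffix
}

expected=$(driver_version_from_installer ./NVIDIA-Linux-x86_64-418.88.run)

# Compare against the loaded driver only when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    actual=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ "$actual" = "$expected" ]; then
        echo "Driver $actual is loaded"
    else
        echo "Expected driver $expected, but nvidia-smi reports: $actual"
    fi
fi
```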

 

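III. Possible problems and how to handle them:

1. ERROR: You appear to be running an X server;

This means the driver cannot be installed while the graphical interface is running. Solution: change the default boot mode so that the system boots to a text console instead of the graphical interface.

!!Note!! CentOS 8 no longer supports editing /etc/inittab to define the boot mode; use the following commands instead

systemctl get-default  --> shows the current default boot target; this step is optional

systemctl set-default multi-user.target  --> sets the system to boot into text mode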

reboot now

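After the reboot the system comes up in text mode. Once the driver is installed, switch the default back with the following command.

systemctl set-default graphical.target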
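
Since the target names are easy to mistype, the two switches can be wrapped in a small helper. This is only a sketch: target_for_mode and set_boot_mode are hypothetical names, and set_boot_mode has to be run as root.

```shell
#!/bin/sh
# Hypothetical helper: map a mode name to the systemd target used above.
target_for_mode() {
    case $1 in
        text)      echo multi-user.target ;;
        graphical) echo graphical.target ;;
        *)         echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: set the default boot target for the given mode.
# Nothing is changed unless this function is called explicitly.
set_boot_mode() {
    target=$(target_for_mode "$1") || return 1
    systemctl set-default "$target"
}
```

With these defined, "set_boot_mode text" followed by a reboot prepares for the installation, and "set_boot_mode graphical" restores the desktop afterwards.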

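2. ERROR: The Nouveau kernel driver is currently in use by your system.

This means the system is already using the Nouveau graphics driver, which has to be disabled or removed.

A. Edit /etc/modprobe.d/blacklist.conf
vi /etc/modprobe.d/blacklist.conf

The file may not exist yet; that is fine, just add
blacklist nouveau

B. Back up and rebuild the initramfs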

mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak

dracut -v /boot/initramfs-$(uname -r).img $(uname -r)
Reboot the system

reboot now

You will notice that the console text has become larger; that is fine, and it returns to normal once the driver is installed.
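
After this reboot it is worth confirming that nouveau really is gone before running the installer again. A minimal sketch, assuming the usual /proc/modules format in which each line starts with the module name; the function name nouveau_loaded is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the module list in file $1
# (default: /proc/modules) contains a module named exactly "nouveau".
nouveau_loaded() {
    modules_file=${1:-/proc/modules}
    grep -q '^nouveau ' "$modules_file"
}

if [ -r /proc/modules ]; then
    if nouveau_loaded; then
        echo "nouveau is still loaded; recheck the blacklist and the rebuilt initramfs"
    else
        echo "nouveau is not loaded"
    fi
fi
```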

3.ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details

This error message is maddening; the only option is to read the log. The author's log looked like this:

   In file included from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-linux.h:136,
                    from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/nvidia/nv-chrdev.c:15:
   /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-list-helpers.h:94:19: error: redefinition of ‘list_is_first’
      94 | static inline int list_is_first(const struct list_head *list,
         |                   ^~~~~~~~~~~~~
   In file included from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-linux.h:136,
                    from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/nvidia/nv-acpi.c:15:
   /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-list-helpers.h:94:19: error: redefinition of ‘list_is_first’
      94 | static inline int list_is_first(const struct list_head *list,
         |                   ^~~~~~~~~~~~~
   In file included from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-linux.h:136,
                    from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/nvidia/nv-frontend.c:13:
   /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-list-helpers.h:94:19: error: redefinition of ‘list_is_first’
      94 | static inline int list_is_first(const struct list_head *list,
         |                   ^~~~~~~~~~~~~
   In file included from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-linux.h:136,
                    from /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/nvidia/nv-instance.c:13:
   /tmp/selfgz1186/NVIDIA-Linux-x86_64-390.87/kernel/common/inc/nv-list-helpers.h:94:19: error: redefinition of ‘list_is_first’
      94 | static inline int list_is_first(const struct list_head *list,
         |                   ^~~~~~~~~~~~~

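The cause is a mismatch between the system kernel and the driver's kernel module sources: the kernel headers already define list_is_first, and the older 390.87 driver defines it again, so the module fails to compile. The fix is to use a driver release that supports your kernel, such as the 418.88 release installed above. Note that the driver version number is not derived from the kernel version.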
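
Because the same compiler error repeats once per source file, the log is easier to read when the duplicates are collapsed. The sketch below does that with grep and sort; the function name distinct_build_errors is hypothetical, and the log path is the one named in the installer's error message.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct compiler error message
# found in log file $1, one per line.
distinct_build_errors() {
    grep -o 'error: .*' "$1" | sort -u
}

log=/var/log/nvidia-installer.log
if [ -r "$log" ]; then
    distinct_build_errors "$log"
fi
```

For the log shown above this would reduce the output to the single redefinition error for list_is_first, which points directly at an incompatibility between the driver sources and the kernel headers.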

 

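IV. Key takeaway

After all of the above, the key point is this: the driver release must be compatible with your running kernel. The digits at the end of the NVIDIA package name are the driver version, not a kernel version, so pick a release that is recent enough to support your kernel (418.88 worked for the author on kernel 4.18.0, while 390.87 failed to compile).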

 
