Prepare the NVIDIA driver and CUDA

  1. Download the NVIDIA driver and CUDA
    NVIDIA driver: https://www.nvidia.cn/Download/index.aspx?lang=cn
    CUDA: https://developer.nvidia.com/cuda-toolkit-archive

  2. Create the nvidia folder and copy the installers into it
    sudo mkdir /work
    sudo chown -R casia:casia /work/
    cd /work/
    sudo apt-get update
    sudo apt-get install -y gcc make python3-pip
    mkdir nvidia
    cd nvidia/
    Copy the downloaded NVIDIA driver and CUDA installers into this folder

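  3. Install the NVIDIA driver and CUDA
    sudo sh NVIDIA-Linux-x86_64-450.102.04.run
    Press Enter three times at the installer prompts
    sudo sh cuda_11.0.2_450.51.05_linux.run
    Type accept and press Enter, then select Install and press Enter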

  4. Verify the installation
    nvidia-smi
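
The verification step can be extended to cover the CUDA toolkit as well. The following is a minimal sketch, assuming the CUDA runfile installed the toolkit under its default prefix /usr/local/cuda (adjust CUDA_HOME if you chose another location). The helper names check_driver and check_cuda are illustrative, not part of any NVIDIA tool.

```shell
#!/bin/sh
# Assumed default install prefix of the CUDA runfile; change it if you picked another path.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Print GPU name and driver version, or a hint if the driver tools are missing.
check_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
    else
        echo "nvidia-smi not found: the driver is not installed or not on PATH"
        return 1
    fi
}

# Print the CUDA compiler version, or a hint if nvcc is missing.
check_cuda() {
    if [ -x "$CUDA_HOME/bin/nvcc" ]; then
        "$CUDA_HOME/bin/nvcc" --version
    else
        echo "nvcc not found under $CUDA_HOME/bin"
        return 1
    fi
}

check_driver || true
check_cuda || true
```

If nvcc is found this way but not by name, adding $CUDA_HOME/bin to PATH in your shell profile makes it available in new sessions.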

Install nvidia-docker

Before using Docker containers with a CUDA environment, the nvidia-docker component must be installed first.

Install Docker

sudo apt-get update
sudo apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository \
    "deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ \
    $(lsb_release -cs) \
    stable"

sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io

sudo gpasswd -a ${USER} docker
sudo service docker restart
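
Adding the user to the docker group only takes effect in new login sessions, so docker commands may still fail with a permission error right after the step above. A minimal sketch for checking the current session, where session_has_group is a hypothetical helper built on the standard id command:

```shell
#!/bin/sh
# Return success if the current session already has group $1 among its groups.
session_has_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if session_has_group docker; then
    echo "this session is in the docker group"
else
    echo "this session is not in the docker group yet"
fi
```

If the group is missing, log out and back in, or start a shell with newgrp docker.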

Add the nvidia-docker repository

sudo apt-get install curl

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu18.04/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

Install nvidia-docker2

After installing nvidia-docker2, restart Docker so that it takes effect.
sudo apt-get install -y nvidia-docker2 vim
sudo systemctl restart docker

Configure nvidia-docker

Edit /etc/docker/daemon.json so that it contains the following:
sudo vim /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

sudo systemctl daemon-reload
sudo systemctl restart docker
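
Docker refuses to start when daemon.json is not valid JSON, which easily happens when the file is pasted with typographic quotes instead of straight ones. A small sketch for checking the file before restarting; it assumes python3 is available (it comes along with the python3-pip package installed earlier), and validate_json is a hypothetical helper name.

```shell
#!/bin/sh
# Return success if file $1 parses as JSON; the parser error goes to stderr otherwise.
validate_json() {
    python3 -m json.tool "$1" >/dev/null
}

# Path to check; defaults to the Docker daemon configuration.
config="${1:-/etc/docker/daemon.json}"
if validate_json "$config"; then
    echo "$config: valid JSON"
else
    echo "$config: invalid JSON, fix it before restarting docker"
fi
```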

Disable swap

sudo swapoff -a

sudo vim /etc/fstab

Comment out the /swapfile line so that swap stays off after a reboot
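
Commenting out the swap entry can also be scripted. This is a sketch under the assumption that the entry has swap as a whitespace-delimited field, as in a stock Ubuntu fstab. comment_out_swap is a hypothetical helper, and the sample file is made up for the demonstration. The helper prints the result instead of modifying anything.

```shell
#!/bin/sh
# Print file $1 with every active swap entry commented out (the file is not modified).
comment_out_swap() {
    sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Demonstration on a made-up fstab copy rather than the real /etc/fstab.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
UUID=0000-example / ext4 errors=remount-ro 0 1
/swapfile none swap sw 0 0
EOF
comment_out_swap "$sample"
rm -f "$sample"
```

The demonstration prints the root entry unchanged and the swap entry as #/swapfile none swap sw 0 0. To apply the change for real, the same expression can be passed to sudo sed -i.bak -E on /etc/fstab, which keeps a backup as /etc/fstab.bak.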

Install k8s

sudo apt-get update && sudo apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update

# List the available versions
apt-cache madison kubelet
sudo apt-get install -y kubelet=1.20.5-00 kubeadm=1.20.5-00 kubectl=1.20.5-00

sudo systemctl enable kubelet
sudo systemctl start kubelet
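
apt-cache madison prints one line per available version in the form package | version | source. A small sketch for narrowing that list to one minor release before pinning a version; list_versions is a hypothetical helper, and the sample lines are illustrative input rather than real repository output.

```shell
#!/bin/sh
# Read `apt-cache madison` output on stdin and print the versions starting with prefix $1.
list_versions() {
    awk -F'|' -v prefix="$1" '
        { gsub(/[ \t]/, "", $2) }            # strip padding around the version column
        index($2, prefix) == 1 { print $2 }  # keep versions that begin with the prefix
    '
}

# Illustrative input; on a real node use: apt-cache madison kubelet | list_versions 1.20.
printf '%s\n' \
    '   kubelet | 1.21.0-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages' \
    '   kubelet | 1.20.5-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages' \
    '   kubelet | 1.20.4-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages' \
    | list_versions 1.20.
```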

Edit the Docker daemon.json

sudo vim /etc/docker/daemon.json

# Add "exec-opts": ["native.cgroupdriver=systemd"] as a top-level key, next to "default-runtime"

Restart Docker
sudo systemctl restart docker
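
With both changes in place, daemon.json carries the nvidia runtime settings and the cgroup driver option side by side. A sketch of the combined file, written to a scratch path (daemon.json.preview is an arbitrary name chosen here) so it can be reviewed before copying it into place:

```shell
#!/bin/sh
# Write the combined configuration to a scratch file for review.
preview="${TMPDIR:-/tmp}/daemon.json.preview"
cat > "$preview" <<'EOF'
{
    "default-runtime": "nvidia",
    "exec-opts": ["native.cgroupdriver=systemd"],
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
cat "$preview"
```

After review, the file can be copied to /etc/docker/daemon.json with sudo cp and Docker restarted. docker info should then report Cgroup Driver: systemd.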

Add nodes

Run on the master
Get a token and print the join command

kubeadm token list
kubeadm token create --print-join-command

# kubeadm join 192.168.1.2:6443 --token 6yex72.30fxcz9l7ps0zuap     --discovery-token-ca-cert-hash sha256:7b97fc66dad88395c35f0164e3c8dcb172476494043381fe5b35acd697f5ad1

Run on each worker, using the join command printed by your own master (the token and hash here are examples)
sudo kubeadm join 192.168.1.2:6443 --token 6yex72.30fxcz9l7ps0zuap --discovery-token-ca-cert-hash sha256:7b97fc66dad88395c35f064e3c8dcb172476b494043381fe5b35acd697f5ad1
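
A join command copied by hand is easy to corrupt. As a sanity check, kubeadm bootstrap tokens consist of 6 characters, a dot, and 16 characters from [a-z0-9], and a sha256 CA certificate hash is sha256: followed by 64 hexadecimal digits. The helpers below are a hypothetical sketch of such a check. The function names and the sample hash are made up for illustration, and the check assumes lowercase hex as kubeadm prints it.

```shell
#!/bin/sh
# Succeed if $1 looks like a kubeadm bootstrap token: 6 chars, a dot, 16 chars.
is_valid_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Succeed if $1 looks like "sha256:" followed by 64 lowercase hex digits.
is_valid_ca_hash() {
    printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

token="6yex72.30fxcz9l7ps0zuap"
# Made-up hash used only to exercise the check.
hash="sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"

if is_valid_token "$token"; then echo "token format ok"; else echo "token format looks wrong"; fi
if is_valid_ca_hash "$hash"; then echo "hash format ok"; else echo "hash format looks wrong"; fi
```

A value that fails either check was most likely truncated or mistyped. Generating a fresh command with kubeadm token create --print-join-command on the master avoids retyping it.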

Install the NFS client

sudo apt install -y nfs-common
