Resources

Node name    IP            Spec             OS
node3        10.2.20.17    4 cores, 8 GB    Ubuntu Server 22.04 LTS 64bit
node2        10.2.24.4     4 cores, 8 GB    Ubuntu Server 22.04 LTS 64bit
node1        10.2.20.13    4 cores, 8 GB    Ubuntu Server 22.04 LTS 64bit
master       10.2.24.10    4 cores, 8 GB    Ubuntu Server 22.04 LTS 64bit

Note: run all commands as root where possible.

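1. Set the node names, reserving one machine as the master

# Replace [newhostname] with the node name. Run this on every server; plan the names in advance, because they are shown in K8s.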
sudo hostnamectl set-hostname [newhostname]
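
Because the hostname becomes the Kubernetes node name, it can help to validate it before renaming the host: node names must be lowercase DNS-style names. A minimal sketch, where is_valid_node_name is a hypothetical helper (not part of MicroK8s) that checks a single DNS label only:

```shell
# Hypothetical helper: succeeds if $1 is a valid lowercase DNS label
# (letters, digits and '-', 1 to 63 characters, alphanumeric at both ends).
is_valid_node_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$'
}

# Usage sketch: rename the host only when the name passes the check.
# is_valid_node_name "node1" && sudo hostnamectl set-hostname "node1"
```

Names such as Node_1 are rejected by this check, which avoids finding out about the problem only when the node joins the cluster.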

2. Disable swap, SELinux, and the firewall

# Temporarily disable swap
sudo swapoff -a
# Permanently disable swap: open the file and comment out the uncommented swap entry
sudo vi /etc/fstab
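
The manual fstab edit can also be scripted. A minimal sketch, assuming swap entries are ordinary fstab lines whose type field is swap; comment_out_swap is a hypothetical helper name, and a .bak backup of the file is kept:

```shell
# Hypothetical helper: comment out every active swap entry in the given fstab file.
# Lines that are already comments are left untouched; a backup is saved as <file>.bak.
comment_out_swap() {
  file="${1:-/etc/fstab}"
  sed -E -i.bak '/^[[:space:]]*#/! { /[[:space:]]swap[[:space:]]/ s/^/#/; }' "$file"
}

# Usage sketch (needs root for the real file):
# comment_out_swap /etc/fstab
```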


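# Temporarily disable SELinux (stock Ubuntu uses AppArmor and normally has no SELinux; if setenforce is not found, skip the SELinux steps)
sudo setenforce 0
# Permanently disable SELinux
sudo vi /etc/selinux/config
# Change SELINUX=enforcing to SELINUX=disabled and save
# Disable the firewall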
sudo systemctl stop ufw
sudo systemctl disable ufw

3. Install MicroK8s on every server

# Install the latest version (choose one of the two commands; do not run both)
sudo snap install microk8s --classic
# Install a specific version
sudo snap install microk8s --classic --channel=1.24/stable

After a successful installation, output similar to the following appears:

microk8s (1.25/stable) v1.25.2 from Canonical✓ installed

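4. Set up K8s command aliases (on the master only)

Aliases for frequently used commands make them more convenient to type.
# Alias microk8s.kubectl to kubectl
sudo snap alias microk8s.kubectl kubectl
# The following output indicates that the alias was created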
Added:
  - microk8s.kubectl as kubectl
# Alias microk8s.ctr to ctr
sudo snap alias microk8s.ctr ctr
Note: other frequently used commands can be aliased in the same way.

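5. Check the microk8s status and fix startup problems

Note: skip this step if MicroK8s is already running.

# Query the status
sudo microk8s status
# The following output means MicroK8s is not running
microk8s is not running. Use microk8s inspect for a deeper inspection.

Run microk8s.inspect to look for error messages. If it reports none, use kubectl to narrow down where the problem is: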

sudo kubectl get pods --all-namespaces

# The command prints output similar to the following:
NAMESPACE     NAME                                       READY   STATUS     RESTARTS   AGE
kube-system   calico-node-nnshm                          0/1     Init:0/2   0          9m
kube-system   calico-kube-controllers-67774c44db-mcmvx   0/1     Pending    0          9m

Use kubectl describe pod to inspect the pod whose STATUS is Init:

sudo kubectl describe pod calico-node-nnshm -n kube-system

# The output contains an event like this:
Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image 
"registry.k8s.io/pause:3.7": failed to pull image "registry.k8s.io/pause:3.7": failed to pull and unpack 
image "registry.k8s.io/pause:3.7": failed to resolve reference "registry.k8s.io/pause:3.7": failed to do 
request: Head "https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.7": dial 
tcp 142.250.157.82:443: i/o timeout
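
When this check is scripted, the name of the image that could not be pulled can be extracted from the event text. A minimal sketch, assuming the message has the format shown above; failed_image is a hypothetical helper that reads the describe output on stdin:

```shell
# Hypothetical helper: print the first image reference named in a
# 'failed to pull image "<ref>"' message read from stdin.
failed_image() {
  grep -o 'failed to pull image "[^"]*"' | head -n 1 | cut -d'"' -f2
}

# Usage sketch:
# sudo kubectl describe pod calico-node-nnshm -n kube-system | failed_image
```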

Note: if you see the error above, pull the image from the Alibaba Cloud mirror instead:

sudo ctr image pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7

# A successful pull prints output like this:
registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7:                    resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:bb6ed397957e9ca7c65ada0db5c5d1c707c9c8afc80a94acbe69f3ae76988f0c:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:f81611a21cf91214c1ea751c5b525931a0e2ebabe62b3937b6158039ff6f922d: done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:7582c2cc65ef30105b84c1c6812f71c8012663c6352b01fe2f483238313ab0ed:    done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:221177c6082a88ea4f6240ab2450d540955ac6f4d5454f0e15751b653ebda165:   done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 1.8 s                                                                    total:  304.0  (168.8 KiB/s)                                     
unpacking linux/amd64 sha256:bb6ed397957e9ca7c65ada0db5c5d1c707c9c8afc80a94acbe69f3ae76988f0c...
done: 26.431318ms

Then retag the image so that registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7 is also available as registry.k8s.io/pause:3.7:

sudo ctr image tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7 registry.k8s.io/pause:3.7
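
The pull and retag steps can be combined for any image that is blocked. A minimal sketch, assuming the mirror stores each image under the same name and tag as in the pause example; mirror_ref_for, pull_via_mirror, MIRROR_PREFIX and CTR are hypothetical names introduced here:

```shell
# Assumption: the mirror keeps the same "<name>:<tag>" under google_containers.
MIRROR_PREFIX="registry.cn-hangzhou.aliyuncs.com/google_containers"
# Illustrative override for the ctr command, e.g. CTR="sudo microk8s ctr".
CTR="${CTR:-microk8s ctr}"

# Map registry.k8s.io/pause:3.7 to <MIRROR_PREFIX>/pause:3.7.
mirror_ref_for() {
  printf '%s/%s\n' "$MIRROR_PREFIX" "${1##*/}"
}

# Pull the image from the mirror, then tag it with the name Kubernetes expects.
pull_via_mirror() {
  target="$1"
  mirror="$(mirror_ref_for "$target")"
  $CTR image pull "$mirror" && $CTR image tag "$mirror" "$target"
}

# Usage sketch:
# pull_via_mirror registry.k8s.io/pause:3.7
```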

# Restart microk8s
sudo microk8s stop && sudo microk8s start

# After the restart, check the microk8s status again
sudo microk8s status

microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
  disabled:
    cert-manager         # (core) Cloud native certificate management
    community            # (core) The community addons repository
    dashboard            # (core) The Kubernetes dashboard
    dns                  # (core) CoreDNS
    gpu                  # (core) Automatic enablement of Nvidia CUDA
    host-access          # (core) Allow Pods connecting to Host services smoothly
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    ingress              # (core) Ingress controller for external access
    kube-ovn             # (core) An advanced network fabric for Kubernetes
    mayastor             # (core) OpenEBS MayaStor
    metallb              # (core) Loadbalancer for your Kubernetes cluster
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
    minio                # (core) MinIO object storage
    observability        # (core) A lightweight observability stack for logs, traces and metrics
    prometheus           # (core) Prometheus operator for monitoring and logging
    rbac                 # (core) Role-Based Access Control for authorisation
    registry             # (core) Private image registry exposed on localhost:32000
    storage              # (core) Alias to hostpath-storage add-on, deprecated
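
In scripts, repeatedly running microk8s status by hand can be replaced with its --wait-ready flag, which blocks until the node is ready. A minimal sketch; wait_for_microk8s and MICROK8S_BIN are hypothetical names introduced here:

```shell
# Illustrative override for the microk8s command, e.g. MICROK8S_BIN="sudo microk8s".
MICROK8S_BIN="${MICROK8S_BIN:-microk8s}"

# Hypothetical wrapper: block until MicroK8s reports ready, or fail after a timeout.
# $1: timeout in seconds (default 120).
wait_for_microk8s() {
  $MICROK8S_BIN status --wait-ready --timeout "${1:-120}" >/dev/null
}

# Usage sketch:
# wait_for_microk8s 300 || echo "MicroK8s did not become ready; run microk8s inspect"
```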

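If MicroK8s still does not start, check the pod status again. If the pods report that the Calico images cannot be pulled, download them manually from GitHub: https://github.com/projectcalico/calico
Download the release package that matches your version (3.25.1 is used here), then upload its images directory to the server and import the required images.
The package is also available from Baidu Netdisk: https://pan.baidu.com/s/1tEodKVwFEtfQwYz2UW3MQg (extraction code: 2336)
After uploading to any directory, change into that directory and run the following commands: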

# Extract the archive
sudo tar -xzvf images.tgz
# Import the images
sudo ctr images import calico-kube-controllers.tar && sudo ctr images import calico-cni.tar && sudo ctr images import calico-node.tar
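
If the release package contains more tarballs than the three listed, a loop avoids typing each import. A minimal sketch; import_image_tars and CTR are hypothetical names introduced here, and the function stops at the first failed import:

```shell
# Illustrative override for the ctr command, e.g. CTR="sudo microk8s ctr".
CTR="${CTR:-microk8s ctr}"

# Hypothetical helper: import every *.tar file found in a directory (default: current one).
import_image_tars() {
  dir="${1:-.}"
  for tarball in "$dir"/*.tar; do
    [ -e "$tarball" ] || continue   # no tarballs present: the glob stays unexpanded
    $CTR images import "$tarball" || return 1
  done
}

# Usage sketch:
# CTR="sudo microk8s ctr" import_image_tars ./images
```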

A successful import prints a confirmation for each image (the original screenshot is not available).

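6. Pull the CoreDNS image from a mirror in China

sudo ctr images pull registry.aliyuncs.com/google_containers/coredns:1.10.1
# Edit the image used by the coredns deployment (the deployment exists only once the dns add-on is enabled, see 7.6)
sudo kubectl edit deploy coredns -n kube-system
# Change the image to registry.aliyuncs.com/google_containers/coredns:1.10.1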
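
The same change can be made without opening an editor by using kubectl set image. A minimal sketch, assuming the container inside the coredns deployment is itself named coredns (verify with kubectl describe before relying on it); set_coredns_image and KUBECTL are hypothetical names introduced here:

```shell
# Illustrative override for the kubectl command, e.g. KUBECTL="sudo microk8s kubectl".
KUBECTL="${KUBECTL:-microk8s kubectl}"
COREDNS_IMAGE="registry.aliyuncs.com/google_containers/coredns:1.10.1"

# Hypothetical helper: point the coredns deployment at the mirrored image.
# Assumption: the container in that deployment is named "coredns".
set_coredns_image() {
  $KUBECTL -n kube-system set image deployment/coredns "coredns=${1:-$COREDNS_IMAGE}"
}

# Usage sketch:
# set_coredns_image
```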

7. Configure the K8s cluster
7.1 First edit the hosts file on the master server

sudo vi /etc/hosts

# Add the IP and hostname of the master and of every node
127.0.1.1 localhost.localdomain VM-0-8-ubuntu
127.0.0.1 localhost
10.2.24.10 master       # IP and hostname of the master
10.2.20.13 node1        # IP and hostname of each node
10.2.24.4 node2
10.2.20.17 node3

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
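
With several machines, appending the entries from a script is less error-prone than editing each file by hand. A minimal sketch; add_host_entry is a hypothetical helper that appends an entry only when the hostname is not already mapped in the file:

```shell
# Hypothetical helper: append "<ip> <name>" to a hosts file unless <name> is already mapped.
# $1: IP address, $2: hostname, $3: hosts file (default /etc/hosts, which needs root).
add_host_entry() {
  ip="$1"; name="$2"; file="${3:-/etc/hosts}"
  # An uncommented line that already contains the hostname as a whole field counts as mapped.
  if grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage sketch, run as root on the master:
# add_host_entry 10.2.24.10 master
# add_host_entry 10.2.20.13 node1
# add_host_entry 10.2.24.4  node2
# add_host_entry 10.2.20.17 node3
```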

7.2 Edit the hosts file on every node server as well

sudo vi /etc/hosts

# Add the IP and hostname of the current node
127.0.1.1 localhost.localdomain VM-0-6-ubuntu
127.0.0.1 localhost
10.2.20.13 node1

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

7.3 Run on the master server

sudo microk8s add-node
This generates output like the following:
From the node you wish to join to this cluster, run the following:
microk8s join 10.2.24.10:25000/89f1088fa37a9e79da47556830a4ce77/f619bfa7e77c

Use the '--worker' flag to join a node as a worker not running the control plane, eg:
microk8s join 10.2.24.10:25000/89f1088fa37a9e79da47556830a4ce77/f619bfa7e77c --worker   
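# Copy this command and run it on the node server. The generated command is single-use, so to add several worker nodes, run microk8s add-node on the master again for each one to get a command with a different token.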

If the node you are adding is not reachable through the default interface you can use one of the following:
microk8s join 10.2.24.10:25000/89f1088fa37a9e79da47556830a4ce77/f619bfa7e77c
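
When joining several nodes, the worker join command can be extracted from this output instead of being copied by hand. A minimal sketch, assuming the output format shown above; worker_join_command is a hypothetical helper that reads the add-node output on stdin:

```shell
# Hypothetical helper: print the first "microk8s join ... --worker" line from stdin,
# with trailing whitespace removed.
worker_join_command() {
  grep -E '^microk8s join .* --worker' | head -n 1 | sed 's/[[:space:]]*$//'
}

# Usage sketch, on the master (run once per node, since each token is single-use):
# sudo microk8s add-node | worker_join_command
```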

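7.4 Run on the node server

# Copy this command from the output of 7.3. Each node runs it only once. Port 25000 must be open on the cloud servers, otherwise the master is unreachable.
# The command takes about 20 seconds to succeed.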
sudo microk8s join 10.2.24.10:25000/ebbe369047950c6e4262e82e14d4d0ef/f619bfa7e77c --worker
Contacting cluster at 172.27.0.8

The node has joined the cluster and will appear in the nodes list in a few seconds.

This worker node gets automatically configured with the API server endpoints.
If the API servers are behind a loadbalancer please set the '--refresh-interval' to '0s' in:
    /var/snap/microk8s/current/args/apiserver-proxy
and replace the API server endpoints with the one provided by the loadbalancer in:
    /var/snap/microk8s/current/args/traefik/provider.yaml

7.5 Check the status of all nodes from the master

sudo kubectl get pod -o=custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName --all-namespaces
NAME                                       STATUS    NODE
calico-kube-controllers-67774c44db-mcmvx   Running   potato
calico-node-sgj2r                          Running   potato
calico-node-9pvcz                          Running   spud
calico-node-45kcb                          Running   murphy

7.6 Enable MicroK8s add-ons

# Enable the dns add-on
microk8s enable dns
# Enable the rbac add-on
microk8s enable rbac

7.7 Install kuboard-v3 (ports 30080 and 10081 must be open)

# Install Docker: install the dependencies first
sudo apt install apt-transport-https ca-certificates curl gnupg lsb-release
# Add the GPG key of the Alibaba Cloud Docker mirror
sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Add the Alibaba Cloud mirror as an apt source
sudo echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Update apt
sudo apt update
# Install Docker
sudo apt install docker-ce
# Install Kuboard (replace the IP in KUBOARD_ENDPOINT with the IP of the host running this container)
sudo docker run -d \
  --restart=unless-stopped \
  --name=kuboard \
  -p 30080:80/tcp \
  -p 10081:10081/tcp \
  -e KUBOARD_ENDPOINT="http://10.0.0.42:30080" \
  -e KUBOARD_AGENT_SERVER_TCP_PORT="10081" \
  -v /root/kuboard-data:/data \
  swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3

7.8 Configure NFS

# Install the NFS server on the master server
sudo apt install nfs-kernel-server
# Open /etc/exports and declare the shared directory and its options, for example: /mnt  *(rw,sync,no_root_squash)
sudo vi /etc/exports
# Restart the NFS server
sudo service nfs-kernel-server restart
# Install the NFS client on every node server
sudo apt install nfs-common

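7.9 Taint the master so that workload containers are not scheduled onto it (skip this if the taint already exists, or if the master is allowed to run workloads)

# Add the taint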
sudo kubectl taint nodes master node-role.kubernetes.io/master=:NoSchedule

8. Install Python 2

# Update apt
sudo apt-get update
# Install python2
sudo apt-get install python2
# Register both interpreters as alternatives for python
update-alternatives --install /usr/bin/python python /usr/bin/python2 1
update-alternatives --install /usr/bin/python python /usr/bin/python3 2
# Check that both alternatives are registered
update-alternatives --list python
# Switch the Python version: enter the number shown for python2
update-alternatives --config python
# Check that the switch worked
python --version

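The K8s cluster is now set up. To open Kuboard, browse to the master node's public IP on port 30080, the port published by the docker run command in step 7.7:
http://ip:30080/kuboard/cluster
Default username and password: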
admin
Kuboard123

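Common K8s problems
1. Requests to a K8s node port are very slow, but return quickly when issued on the node itself:
Turn off TX checksum offloading on the Calico VXLAN interface: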

ethtool -K vxlan.calico tx-checksum-ip-generic off

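2. MicroK8s reports: deployments.apps is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "deployments" in API group "apps" in the namespace "kube-system"

# <current-user-name> is the account to check; for the error above that is system:serviceaccount:kube-system:default
kubectl auth can-i list deployments --namespace kube-system --as <current-user-name>

If the answer is no, grant the permission with the following command (for the error above, the service account name is default):

kubectl create clusterrolebinding my-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:<current-user-name>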

3. When installing Kuboard v3, run watch kubectl get pods -n kuboard. If no etcd pod appears, label a node manually so that an etcd pod is scheduled on it

# By default, add one etcd on the master
kubectl label nodes master node-role.kubernetes.io/master=
kubectl label nodes master node.kubernetes.io/microk8s-controlplane=microk8s-controlplane

# For more than one etcd, label additional nodes
kubectl label nodes <your-node-name> k8s.kuboard.cn/role=etcd
Note: an odd number of etcd instances is recommended.

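4. Renewing certificates

  • 4.1 Renew the certificates automatically every six months (the schedule is up to you)
# Check the certificate expiry dates
microk8s.refresh-certs -c
# Renew the CA certificate
microk8s.refresh-certs -e ca.crt
# Renew the server certificate
microk8s.refresh-certs -e server.crt
# Renew the front-proxy-client certificate
microk8s.refresh-certs -e front-proxy-client.crt

Put the renewal commands above into a shell script, for example cert.sh

# Make the script executable
chmod +x cert.sh
# Add the scheduled job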
crontab -e
0 0 1 */6 * /mnt/cert.sh >> /mnt/cert.log
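
One possible shape for cert.sh is sketched below. It only wraps the three renewal commands from this section; refresh_all_certs and REFRESH_CMD are hypothetical names introduced here. Because cron runs with a minimal PATH, the full path /snap/bin/microk8s.refresh-certs may be needed as the value of REFRESH_CMD.

```shell
#!/bin/sh
# Sketch of cert.sh. REFRESH_CMD is an illustrative override for the renewal command.
REFRESH_CMD="${REFRESH_CMD:-microk8s.refresh-certs}"

# Renew each certificate in turn, logging a timestamp; stop at the first failure.
refresh_all_certs() {
  for cert in ca.crt server.crt front-proxy-client.crt; do
    echo "$(date '+%F %T') refreshing $cert"
    $REFRESH_CMD -e "$cert" || return 1
  done
}

# In the real script, end with a call to the function:
# refresh_all_certs
```

The timestamps end up in /mnt/cert.log through the redirection in the crontab entry, which makes it easier to confirm that the job actually ran.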
  • 4.2 Disable snap automatic refreshes
# Hold automatic refreshes
snap refresh --hold
# Re-enable automatic refreshes
snap refresh --unhold