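1. Introduction

I won't cover the basics here; if you want background, read up on what K8S (Kubernetes) is first.

2. Environment requirements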

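  • Two or more machines running 64-bit CentOS 7.7
  • Hardware: 2 GB of RAM or more, 2 or more CPUs, and 30 GB of disk or more per machine
  • Full network connectivity between all machines in the cluster
  • Internet access, which is needed to pull images
  • Swap disabled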
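
The hardware minimums in the list above can be checked with a small script before going any further. This is only a sketch: `meets_minimum` is a hypothetical helper, and the 1700 MB floor is my assumption, chosen because a "2 GB" virtual machine reports somewhat less than 2048 MB (the `free -m` output later in this post shows 1980).

```shell
# meets_minimum MEM_MB CPUS
# Succeeds when the machine has at least 2 CPUs and roughly 2 GB of RAM.
# The 1700 MB floor is an assumption, not a value from the original post.
meets_minimum() {
  [ "$1" -ge 1700 ] && [ "$2" -ge 2 ]
}

# Example on a real Linux host:
#   mem_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
#   meets_minimum "$mem_mb" "$(nproc)" && echo "hardware OK"
```

Taking the values as arguments keeps the check separate from how they are collected, so the same function works for a local host or for numbers gathered over ssh.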

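3. Deployment preparation (I use virtual machines here; cloud servers work as well)

4. Deployment

4.1 Install Docker

I prepared two machines, 172.168.200.130 (master) and 172.168.200.131 (node1), and confirmed that they can reach each other over the internal network.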

[root@localhost ~]# ping 172.168.200.130  # node1 pings the master node
PING 172.168.200.130 (172.168.200.130) 56(84) bytes of data.
64 bytes from 172.168.200.130: icmp_seq=1 ttl=64 time=0.243 ms
64 bytes from 172.168.200.130: icmp_seq=2 ttl=64 time=0.142 ms
64 bytes from 172.168.200.130: icmp_seq=3 ttl=64 time=0.192 ms
64 bytes from 172.168.200.130: icmp_seq=4 ttl=64 time=0.224 ms

[root@localhost ~]# ping 172.168.200.131  # master pings node1
PING 172.168.200.131 (172.168.200.131) 56(84) bytes of data.
64 bytes from 172.168.200.131: icmp_seq=1 ttl=64 time=0.021 ms
64 bytes from 172.168.200.131: icmp_seq=2 ttl=64 time=0.110 ms
64 bytes from 172.168.200.131: icmp_seq=3 ttl=64 time=0.035 ms
64 bytes from 172.168.200.131: icmp_seq=4 ttl=64 time=0.042 ms

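Docker must be installed on every machine.

# Check whether Docker is already installed
rpm -qa | grep docker

# Remove old Docker versions (quoted so the shell does not expand the glob)
sudo yum remove 'docker*'

# Install the yum utilities
sudo yum install -y yum-utils device-mapper-persistent-data lvm2

# Add the Docker yum repository
sudo yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# Build the yum cache
sudo yum makecache

# List the available Docker versions
yum list docker-ce --showduplicates | sort -r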

Choose version 19.03.9 to install.

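# Install the chosen Docker version
sudo yum install -y docker-ce-19.03.9-3.el7 docker-ce-cli-19.03.9-3.el7 containerd.io

# Enable Docker at boot and start it now
systemctl enable docker --now

# Create the Docker configuration directory
sudo mkdir -p /etc/docker

# Configure the registry mirror, the systemd cgroup driver, and logging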
sudo tee /etc/docker/daemon.json <<-EOF
{
  "registry-mirrors": ["https://82m9ar63.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

# Reload the systemd configuration
sudo systemctl daemon-reload
# Restart Docker
sudo systemctl restart docker

## Check the Docker version to confirm the installation succeeded
[root@localhost ~]# docker version
Client: Docker Engine - Community
 Version:           19.03.9
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        9d988398e7
 Built:             Fri May 15 00:25:27 2020
 OS/Arch:           linux/amd64
 Experimental:      false
......

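4.2 Install Kubernetes

Give every machine its own hostname (it must not be localhost). I set 172.168.200.130 to master and 172.168.200.131 to node1.

hostnamectl set-hostname master  # run on 172.168.200.130
hostnamectl set-hostname node1   # run on 172.168.200.131

On every machine, swap must be turned off (a non-zero Swap total in the free -m output means it is still on), SELinux must be disabled, and iptables must be allowed to see bridged traffic (as described in the official Kubernetes documentation).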

[root@localhost ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1980         174        1474           9         331        1655
Swap:          3071           0        3071

## Turn off swap now, and comment out the swap entry in /etc/fstab so it stays off after a reboot
swapoff -a  
sed -ri 's/.*swap.*/#&/' /etc/fstab 
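
Two things are worth checking around this step: that swap is really off, and that the fstab edit does not keep adding `#` characters when the step is repeated. The sketch below is an illustration of my own, not part of the original procedure; `swap_total_mb` and `comment_out_swap` are hypothetical helper names, and both read text on stdin so they can be tried without touching the system.

```shell
# Print the total swap in MB from `free -m` output read on stdin.
swap_total_mb() {
  awk '$1 == "Swap:" { print $2 }'
}

# Comment out swap entries in fstab-style text read on stdin.
# Lines that are already commented are left alone, so re-running is safe.
comment_out_swap() {
  sed -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/'
}

# Example on a real host:
#   free -m | swap_total_mb          # prints 0 once swap is off
#   comment_out_swap < /etc/fstab    # preview the edited file
```

Unlike the one-line `sed` above, this variant only touches lines where `swap` appears as a separate field and skips lines that already start with `#`.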

## Set SELinux to permissive mode (effectively disabling it)
sudo setenforce 0 
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config 

## Allow iptables to see bridged traffic
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
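
The file in /etc/modules-load.d only takes effect at boot, so if br_netfilter is not loaded yet, the two bridge settings may not apply until `sudo modprobe br_netfilter` has been run and `sysctl --system` repeated. A minimal sketch for verifying the result, assuming the usual `key = value` format printed by `sysctl`; `bridge_sysctls_ok` is a hypothetical helper name.

```shell
# Read `sysctl` output ("key = value" lines) on stdin and succeed only
# when both bridge netfilter settings are present and set to 1.
bridge_sysctls_ok() {
  awk -F' *= *' '
    $1 == "net.bridge.bridge-nf-call-iptables"  && $2 == 1 { v4 = 1 }
    $1 == "net.bridge.bridge-nf-call-ip6tables" && $2 == 1 { v6 = 1 }
    END { exit !(v4 && v6) }
  '
}

# Example on a real host:
#   sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables \
#     | bridge_sysctls_ok && echo "bridge sysctls OK"
```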

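4.3 Install kubelet, kubeadm, and kubectl

On every machine, configure the Kubernetes yum repository, then install and start kubelet.

# Configure the Kubernetes yum repository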
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
   http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# Install kubelet, kubeadm, and kubectl
sudo yum install -y kubelet-1.20.9 kubeadm-1.20.9 kubectl-1.20.9

# Enable and start kubelet
sudo systemctl enable --now kubelet

# On every machine, map the master hostname to its IP
echo "172.168.200.130  master" >> /etc/hosts
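
Running the `echo` above a second time appends a duplicate line. A small idempotent variant is sketched below; `ensure_hosts_entry` is a hypothetical helper, and the optional third argument exists only so it can be tried on a scratch file instead of /etc/hosts.

```shell
# ensure_hosts_entry IP NAME [FILE]
# Append "IP  NAME" to FILE (default /etc/hosts) unless NAME is already
# mapped on a non-comment line. The body runs in a subshell so the
# variables stay local.
ensure_hosts_entry() (
  ip=$1 name=$2 file=${3:-/etc/hosts}
  if ! awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null
  then
    printf '%s  %s\n' "$ip" "$name" >> "$file"
  fi
)

# Example (as root, on every machine):
#   ensure_hosts_entry 172.168.200.130 master
#   ensure_hosts_entry 172.168.200.131 node1
```

Adding an entry for node1 as well should also address the `hostname "node1" could not be reached` warning that kubeadm prints in step 4.6, though that is my reading of the warning rather than something tested in this post.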

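4.4 Initialize the master node

I use 172.168.200.130 as the master. --apiserver-advertise-address is the master's IP, --control-plane-endpoint is the master's hostname, --image-repository is the image registry to pull from, --kubernetes-version pins the Kubernetes version, --service-cidr sets the Service network range, and --pod-network-cidr sets the Pod network range. See the kubeadm init reference for the remaining parameters.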

kubeadm init \
--apiserver-advertise-address=172.168.200.130 \
--control-plane-endpoint=master \
--image-repository registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images \
--kubernetes-version v1.20.9 \
--service-cidr=10.96.0.0/16 \
--pod-network-cidr=192.168.0.0/16 
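
The Pod and Service ranges passed above must not overlap with the network the machines themselves sit on. With hosts on 172.168.200.x that is not a problem here, but on a LAN that already uses 192.168.x.x it would be. The sketch below shows one way to check this in plain shell arithmetic; `ip_to_int` and `cidr_contains` are hypothetical helpers, and only IPv4 is handled.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  ( IFS=.; set -- $1; echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 )) )
}

# cidr_contains CIDR IP: succeed when IP falls inside CIDR.
cidr_contains() (
  net=${1%/*}
  bits=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$net") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
)

# Example: warn if a node address lies inside the Pod range.
#   cidr_contains 192.168.0.0/16 172.168.200.130 && echo "overlap, pick another CIDR"
```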

After initialization finishes, save the following output; it is needed in later steps.

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join master:6443 --token rk6er7.7pmgpxdkq12t45r7 \
    --discovery-token-ca-cert-hash sha256:7045e72a64def658c8ce1ebebdc6e149326c0c7fa4815b387a8edfc7e2123f97 \
    --control-plane 

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join master:6443 --token rk6er7.7pmgpxdkq12t45r7 \
    --discovery-token-ca-cert-hash sha256:7045e72a64def658c8ce1ebebdc6e149326c0c7fa4815b387a8edfc7e2123f97 
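
Since the token and CA hash are needed again in step 4.6, it can help to save the init output to a file and pull the two values out of it. A minimal sketch, assuming the output was saved as text; `join_token` and `join_ca_hash` are hypothetical helpers that read that text on stdin.

```shell
# Print the first bootstrap token found in kubeadm output read on stdin.
# Tokens have the form <6 chars>.<16 chars> of lowercase letters and digits.
join_token() {
  grep -oE -- '--token [a-z0-9]{6}\.[a-z0-9]{16}' | head -n 1 | cut -d' ' -f2
}

# Print the first CA certificate hash (sha256:<64 hex digits>) found on stdin.
join_ca_hash() {
  grep -oE 'sha256:[0-9a-f]{64}' | head -n 1
}

# Example, with init-output.txt being a file you saved yourself:
#   join_token   < init-output.txt
#   join_ca_hash < init-output.txt
```

Bootstrap tokens expire after 24 hours by default; `kubeadm token create --print-join-command` on the master prints a fresh join command when that happens.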

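Run the commands from the output above on the master, then check its status. Two pods named coredns-* are not running yet, which is why a network plugin has to be installed next.

## Run on the master node
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config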

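4.5 Install the network plugin

As a prerequisite, I do not use iptables (firewall) rules for forwarding here; I switch kube-proxy to IPVS instead (I assume Kubernetes 1.18 or later).

## Enable IPVS on every machine. First check whether the required modules (ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh, nf_conntrack_ipv4) are loaded; a module missing from the output is not loaded.
lsmod | grep -e ip_vs -e nf_conntrack_ipv4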

## Load the modules IPVS needs
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
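
`modprobe` only loads the modules for the current boot. To see which of the five are still missing, the list can be compared with `lsmod` output; the sketch below does that with a hypothetical helper, `missing_ipvs_modules`, which reads `lsmod` output on stdin.

```shell
# Read `lsmod` output on stdin and print each required IPVS module that is
# not loaded, one per line. Prints nothing when all five are present.
missing_ipvs_modules() (
  loaded=$(awk 'NR > 1 { print $1 }')   # skip the header, keep module names
  for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
    printf '%s\n' "$loaded" | grep -qx -- "$m" || echo "$m"
  done
)

# Example on a real host:
#   lsmod | missing_ipvs_modules
```

To have the modules loaded again after a reboot, their names can be listed one per line in a file under /etc/modules-load.d, the same mechanism used for br_netfilter in 4.2 (for example /etc/modules-load.d/ipvs.conf, a file name I made up).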

## Switch kube-proxy to IPVS mode
[root@master ~]# kubectl edit configMap kube-proxy -n kube-system
 ipvs:
  ......
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: "ipvs" # set to ipvs; when left empty, kube-proxy defaults to iptables

## Restart all kube-proxy pods
[root@master ~]# kubectl get pod -A | grep kube-proxy | awk '{system("kubectl delete pod "$2" -n kube-system")}'
pod "kube-proxy-689h8" deleted

## Check the pods on the master node
[root@master ~]# kubectl get pod -A
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-5897cd56c4-56mmj         0/1     Pending   0          8m38s
kube-system   coredns-5897cd56c4-mgfmh         0/1     Pending   0          8m38s
kube-system   etcd-master                      1/1     Running   0          8m51s
kube-system   kube-apiserver-master            1/1     Running   0          8m51s
kube-system   kube-controller-manager-master   1/1     Running   0          8m51s
kube-system   kube-proxy-l6946                 1/1     Running   0          8m38s
kube-system   kube-scheduler-master            1/1     Running   0          8m51s

## Check whether kube-proxy is running in IPVS mode; the log shows it has switched to ipvs
[root@master ~]# kubectl logs kube-proxy-l6946 -n kube-system
......
I0623 09:03:38.347008       1 server_others.go:258] Using ipvs Proxier.


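Install the network plugin. I use Calico here. This step must be done before any worker node joins.

## Download the Calico manifest (latest)
curl -O https://docs.projectcalico.org/manifests/calico.yaml

## Apply the Calico manifest. It fails with a version problem: the latest manifest uses the policy/v1 PodDisruptionBudget API, which Kubernetes 1.20 does not serve. Downloading v3.20 instead fixes it.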
[root@master ~]# kubectl apply -f calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
......
error: unable to recognize "calico.yaml": no matches for kind "PodDisruptionBudget" in version "policy/v1"

## Download the v3.20 Calico manifest instead; the error goes away.
[root@master ~]# curl -O https://docs.projectcalico.org/v3.20/manifests/calico.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  198k  100  198k    0     0   229k      0 --:--:-- --:--:-- --:--:--  229k
[root@master ~]# 
[root@master ~]# kubectl apply -f calico.yaml  # to uninstall, use kubectl delete -f calico.yaml
configmap/calico-config unchanged
......
[root@master ~]# 
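
The `no matches for kind "PodDisruptionBudget" in version "policy/v1"` error comes from an API version gap: policy/v1 became available in Kubernetes 1.21, while 1.20 only serves policy/v1beta1. The sketch below turns that rule into a quick check before a manifest is picked; `pdb_api_version` is a hypothetical helper, not a kubectl feature.

```shell
# pdb_api_version VERSION
# Print the newest PodDisruptionBudget API version that a cluster of the
# given Kubernetes version (for example v1.20.9) serves.
pdb_api_version() (
  minor=$(printf '%s\n' "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*/\1/')
  if [ "$minor" -ge 21 ]; then
    echo policy/v1
  else
    echo policy/v1beta1
  fi
)

# Example:
#   pdb_api_version v1.20.9
```

On a running cluster, `kubectl api-versions | grep policy` answers the same question directly.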

## Check the master again: the coredns-* pods are now running too. The master node is fully configured at this point.
[root@master ~]# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-6d9cdcd744-49vrr   1/1     Running   0          12m
kube-system   calico-node-8gpm2                          1/1     Running   0          13m
kube-system   coredns-5897cd56c4-56mmj                   1/1     Running   0          28m
kube-system   coredns-5897cd56c4-mgfmh                   1/1     Running   0          28m
kube-system   etcd-master                                1/1     Running   0          28m
kube-system   kube-apiserver-master                      1/1     Running   0          28m
kube-system   kube-controller-manager-master             1/1     Running   0          28m
kube-system   kube-proxy-l6946                           1/1     Running   0          119s
kube-system   kube-scheduler-master                      1/1     Running   0          28m

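4.6 Join the worker node to the master

Remember the output from initializing the master in step 4.4? The command for joining a worker is shown below; run it on the non-master machines.

## The join does not go through. The firewall is enabled, so turn it off (do not do this in production; open the required ports instead).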
[root@node1 ~]# kubeadm join master:6443 --token rk6er7.7pmgpxdkq12t45r7     --discovery-token-ca-cert-hash sha256:7045e72a64def658c8ce1ebebdc6e149326c0c7fa4815b387a8edfc7e2123f97
[preflight] Running pre-flight checks
	[WARNING Hostname]: hostname "node1" could not be reached
	[WARNING Hostname]: hostname "node1": lookup node1 on 114.114.114.114:53: no such host

## Check the firewall status
[root@master ~]# service firewalld status
Redirecting to /bin/systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2022-06-23 15:16:19 CST; 1h 32min ago
     Docs: man:firewalld(1)
 Main PID: 591 (firewalld)
    Tasks: 2
   Memory: 284.0K
   CGroup: /system.slice/firewalld.service
           └─591 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid
           
## Stop and permanently disable the firewall on every machine (never in production; this is for learning only)
[root@master ~]# sudo service firewalld stop && systemctl disable firewalld
Redirecting to /bin/systemctl stop firewalld.service
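
Where the firewall has to stay on, the alternative is to open the needed ports with firewall-cmd. The sketch below only prints the commands so they can be reviewed first; `open_port_cmds` is a hypothetical helper, and the ports in the example (6443 for the API server, 2379-2380 for etcd, 10250 for the kubelet) are taken from the Kubernetes ports documentation and are not a complete list for this setup, since Calico needs ports of its own.

```shell
# open_port_cmds PORT...
# Print (do not run) the firewall-cmd calls that would permanently open
# each TCP port or port range, followed by a reload.
open_port_cmds() {
  for port in "$@"; do
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "firewall-cmd --reload"
}

# Example for a control-plane node:
#   open_port_cmds 6443 2379-2380 10250
# After reviewing the output, it can be piped to: sudo sh
```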

## Run the join command again. It fails, this time because Kubernetes and Docker use different cgroup drivers; Docker is set to systemd.
[root@node1 ~]# kubeadm join master:6443 --token rk6er7.7pmgpxdkq12t45r7     --discovery-token-ca-cert-hash sha256:7045e72a64def658c8ce1ebebdc6e149326c0c7fa4815b387a8edfc7e2123f97
......
[kubelet-check] It seems like the kubelet is not running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

## Check the kubelet's cgroup driver: it does not match the one configured for Docker
[root@master ~]# cat /var/lib/kubelet/config.yaml |grep group
cgroupDriver: cgroupfs

## Change the kubelet's cgroup driver (set cgroupDriver to systemd to match Docker)
[root@master ~]# kubectl edit cm kubelet-config-1.20 -n kube-system
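
The mismatch can be confirmed by putting the two driver names side by side. A minimal sketch, assuming the kubelet configuration is the YAML file shown above; `kubelet_cgroup_driver` and `drivers_match` are hypothetical helpers.

```shell
# Print the cgroupDriver value from kubelet config YAML read on stdin.
kubelet_cgroup_driver() {
  awk '$1 == "cgroupDriver:" { print $2; exit }'
}

# drivers_match A B: succeed only when both names are non-empty and equal.
drivers_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example on a node:
#   docker_driver=$(docker info --format '{{.CgroupDriver}}')
#   kubelet_driver=$(kubelet_cgroup_driver < /var/lib/kubelet/config.yaml)
#   drivers_match "$docker_driver" "$kubelet_driver" || echo "cgroup driver mismatch"
```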

## Join the node again; this time it basically succeeds.
[root@node1 ~]# kubeadm join master:6443 --token 35h05t.96u192eube0wy9vk     --discovery-token-ca-cert-hash sha256:6173250bbe1e64ff86ba0502cddff67de573ce75c2fb0054afcb64d8769c1a0b
......
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

On the master, check the worker node. node1 is not ready; the steps below resolve it:

[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES                  AGE    VERSION
master   Ready      control-plane,master   46m    v1.20.9
node1    NotReady   <none>                 5m5s   v1.20.9

## Check the pods: one calico-node pod has failed to start
[root@master ~]# kubectl get pod -A 
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS   AGE
kube-system   calico-kube-controllers-6d9cdcd744-h4tzc   1/1     Running                 0          43m
kube-system   calico-node-hmzdn                          0/1     Init:ImagePullBackOff   0          6m10s
kube-system   calico-node-ntrcr                          1/1     Running                 0          43m
kube-system   coredns-5897cd56c4-s84wj                   1/1     Running                 0          47m
kube-system   coredns-5897cd56c4-t7tx7                   1/1     Running                 0          47m
kube-system   etcd-master                                1/1     Running                 0          47m
kube-system   kube-apiserver-master                      1/1     Running                 0          47m
kube-system   kube-controller-manager-master             1/1     Running                 0          47m
kube-system   kube-proxy-hgh7j                           1/1     Running                 0          44m
kube-system   kube-proxy-v2mqr                           1/1     Running                 0          6m10s
kube-system   kube-scheduler-master                      1/1     Running                 0          47m

## Describe the pod: node1 failed to pull the cni image
[root@master ~]# kubectl describe pod calico-node-hmzdn -n kube-system
......
  Warning  Failed     2m4s                   kubelet            Failed to pull image "docker.io/calico/cni:v3.20.5": rpc error: code = Unknown desc = Get https://registry-1.docker.io/v2/calico/cni/manifests/sha256:8fb230289086a9962799e055c93bc51c74d16158ba09ab23c619af509419f90d: read tcp 172.168.200.131:45062->34.237.244.67:443: read: connection reset by peer
  Normal   BackOff    80s (x7 over 5m58s)    kubelet            Back-off pulling image "docker.io/calico/cni:v3.20.5"
  Warning  Failed     80s (x7 over 5m58s)    kubelet            Error: ImagePullBackOff
  
## On node1, the cni:v3.20.5 image has since been downloaded, so go back to the master and delete calico-node-hmzdn so that it is recreated
[root@node1 ~]# docker images|grep cni
calico/cni                                                    v3.20.5             73dc08b04b58        2 months ago        138MB
[root@node1 ~]# 

## Delete the old pod; Kubernetes creates a new one automatically, and after a short wait it is running normally.
[root@master ~]# kubectl delete pod calico-node-hmzdn -n kube-system
pod "calico-node-hmzdn" deleted
[root@master ~]# kubectl get pod -A 
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-6d9cdcd744-h4tzc   1/1     Running   0          46m
kube-system   calico-node-ntrcr                          1/1     Running   0          46m
kube-system   calico-node-qbpf5                          1/1     Running   0          6s
kube-system   coredns-5897cd56c4-s84wj                   1/1     Running   0          50m
kube-system   coredns-5897cd56c4-t7tx7                   1/1     Running   0          50m
kube-system   etcd-master                                1/1     Running   0          50m
kube-system   kube-apiserver-master                      1/1     Running   0          50m
kube-system   kube-controller-manager-master             1/1     Running   0          50m
kube-system   kube-proxy-hgh7j                           1/1     Running   0          48m
kube-system   kube-proxy-v2mqr                           1/1     Running   0          9m19s
kube-system   kube-scheduler-master                      1/1     Running   0          50m
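
An ImagePullBackOff like the one above can often be avoided by pulling the Calico images on each node before it joins. The sketch below lists the images a manifest refers to; `manifest_images` is a hypothetical helper that reads the YAML on stdin and assumes unquoted `image:` values.

```shell
# Print the unique container images referenced by a manifest read on stdin.
# Handles both "image: X" and "- image: X" lines; quoted values are not
# unquoted by this sketch.
manifest_images() {
  awk '
    $1 == "image:"                { print $2 }
    $1 == "-" && $2 == "image:"   { print $3 }
  ' | sort -u
}

# Example on a worker node, with calico.yaml copied over first:
#   manifest_images < calico.yaml | xargs -n 1 docker pull
```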

## Run kubectl get nodes again: node1 has joined the cluster successfully.
[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   53m   v1.20.9
node1    Ready    <none>                 12m   v1.20.9
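
With more than a couple of nodes, reading the STATUS column by eye gets tedious. The sketch below filters `kubectl get nodes` output down to the nodes that still need attention; `not_ready_nodes` is a hypothetical helper that reads that output on stdin.

```shell
# Read `kubectl get nodes` output on stdin and print the name of every node
# whose STATUS is anything other than exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example on the master:
#   kubectl get nodes | not_ready_nodes
```

A cordoned node reports `Ready,SchedulingDisabled` and is therefore listed too, which seems the safer default for a post-install check.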

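4.7 When master init goes wrong

This section resets the master; skip it if you do not need a reset.

Force-reset the Kubernetes node
[root@master ~]# kubeadm reset -f

Remove the leftover data
[root@master ~]# rm -rf /etc/cni /etc/kubernetes /var/lib/dockershim /var/lib/etcd /var/lib/kubelet /var/run/kubernetes ~/.kube/*

If you cannot pull the Kubernetes images listed below, you can download them here.
Link: https://pan.baidu.com/s/1LFDY4gEMvOKinyhq08cClQ?pwd=0rqy
Extraction code: 0rqy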

[root@master ~]# kubeadm config images list --kubernetes-version v1.20.9
kube-apiserver:v1.20.9
kube-controller-manager:v1.20.9
kube-scheduler:v1.20.9
kube-proxy:v1.20.9
pause:3.2
etcd:3.4.13-0
coredns:1.7.0

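For hands-on follow-up, see the article 【K8S实战】-超详细教程(一) ("K8S in Practice: A Very Detailed Tutorial, Part 1").

Note: if you keep the firewall enabled, you need to open the ports that Kubernetes uses; the Kubernetes documentation lists which ones.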
