k8s (v1.19.4) High-Availability Cluster Setup
I. Server Planning
1. Environment overview:
Hostname | IP | docker | kubelet | kubeadm | kubectl | Notes |
k8s-master1 | 192.168.209.132 | 19.03.13 | 1.19.4 | 1.19.4 | 1.19.4 | master1 |
k8s-master2 | 192.168.209.133 | 19.03.13 | 1.19.4 | 1.19.4 | 1.19.4 | master2 |
k8s-node | 192.168.209.139 | 19.03.13 | 1.19.4 | 1.19.4 | 1.19.4 | worker node |
k8s-vip | 192.168.209.140 | - | - | - | - | VIP (virtual IP, no separate host) |
2. System image
[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
3. Environment requirements
- One or more machines running CentOS 7
- Hardware: at least 2 GB of RAM and 2 CPU cores
- All machines in the cluster can reach each other over the network
- All machines can reach the Internet (required to pull images)
- Swap must be disabled (a quick check sketch follows this list)
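The sketch below is a hypothetical quick pre-flight check against these requirements; the thresholds in the comments and the ping target are illustrative, not part of the original procedure.
# hypothetical pre-flight check; adjust to your environment
echo "CPU cores : $(nproc)"                                      # expect >= 2
free -h | awk '/^Mem:/{print "Memory    : " $2}'                 # expect >= 2G
swapon --show | grep -q . && echo "Swap      : ON (disable it)" || echo "Swap      : off"
ping -c1 -W1 mirrors.aliyun.com >/dev/null 2>&1 && echo "Internet  : OK" || echo "Internet  : unreachable"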
II. Environment Configuration
1. Disable the firewall
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl disable firewalld
2. Disable SELinux
# permanently
[root@localhost ~]# sed -i 's/enforcing/disabled/' /etc/selinux/config
# temporarily (takes effect immediately)
[root@localhost ~]# setenforce 0
3. Disable swap
Kubernetes requires swap to be disabled to ensure predictable performance (by default the kubelet will not run with swap enabled).
# permanently
[root@localhost ~]# sed -ri 's/.*swap.*/#&/' /etc/fstab
# temporarily (takes effect immediately)
[root@localhost ~]# swapoff -a
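To confirm swap is fully off, two standard checks:
free -h          # the Swap line should show 0B
swapon --show    # should print nothing once swap is off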
4. Configure /etc/hosts
Do this on every machine.
vim /etc/hosts
192.168.209.132 k8s-master1
192.168.209.133 k8s-master2
192.168.209.136 k8s-master3
192.168.209.139 k8s-node
192.168.209.140 k8s-vip
5. Set the hostname
Do this on every machine, using that machine's own name (the example below is for master1).
[root@localhost ~]# hostnamectl set-hostname k8s-master1
# check
[root@localhost ~]# more /etc/hostname
6. Adjust bridge network parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# apply the settings
sysctl --system
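If sysctl --system reports that these keys do not exist, the br_netfilter module is most likely not loaded yet. A minimal sketch to load it now and at boot (the modules-load.d file name is an assumption):
modprobe br_netfilter                                        # load the bridge netfilter module
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf    # assumed file name; loads the module at boot
sysctl net.bridge.bridge-nf-call-iptables                    # should now print "= 1"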
7. Synchronize time
yum install ntpdate -y
ntpdate time.windows.com
III. Install keepalived and HAProxy
Install on all master nodes.
1. Install dependencies and keepalived
yum install -y conntrack-tools libseccomp libtool-ltdl
yum install -y keepalived
2. Configure the master nodes
master1 configuration:
cat > /etc/keepalived/keepalived.conf <<EOF
! Configuration File for keepalived
global_defs {
    router_id k8s
}
vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 250
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ceb1b3ec013d66163d6ab
    }
    virtual_ipaddress {
        192.168.209.140
    }
    track_script {
        check_haproxy
    }
}
EOF
master2 configuration:
cat > /etc/keepalived/keepalived.conf <<EOF
! Configuration File for keepalived
global_defs {
    router_id k8s
}
vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 200
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ceb1b3ec013d66163d6ab
    }
    virtual_ipaddress {
        192.168.209.140
    }
    track_script {
        check_haproxy
    }
}
EOF
priority: the VRRP priority; give each subsequent master a lower value (master1 highest).
virtual_ipaddress: the VIP address for the cluster.
3. Start and verify
Run on all master nodes.
# start keepalived
systemctl start keepalived.service
# enable at boot
systemctl enable keepalived.service
# check status
systemctl status keepalived.service
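To confirm the VIP actually landed on master1, check the interface named in the keepalived config (ens33) for 192.168.209.140; stopping keepalived on master1 should move the address to master2:
# on master1 the VIP should appear as a secondary address on ens33
ip addr show ens33 | grep 192.168.209.140
# optional failover test: systemctl stop keepalived on master1, then run the same grep on master2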
4. Deploy HAProxy
On all master nodes.
4.1 Install
yum install -y haproxy
4.2 Configure
The configuration is identical on both masters. It declares the two master apiservers as backends and binds HAProxy to port 16443, so port 16443 becomes the cluster entry point.
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*    /var/log/haproxy.log
    #
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
    mode                 tcp
    bind                 *:16443
    option               tcplog
    default_backend      kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
    mode        tcp
    balance     roundrobin
    server      k8s-master1 192.168.209.132:6443 check
    server      k8s-master2 192.168.209.133:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
    bind                 *:1080
    stats auth           admin:awesomePassword
    stats refresh        5s
    stats realm          HAProxy\ Statistics
    stats uri            /admin?stats
EOF
4.3 Start and verify
Start on both masters.
# enable at boot
systemctl enable haproxy
# start haproxy
systemctl start haproxy
# check status
systemctl status haproxy
Check the port:
netstat -lntup|grep haproxy
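Beyond the port check, the stats page declared above (listen stats on port 1080, credentials from "stats auth") gives a quick view of both apiserver backends; they will show as DOWN until the apiservers actually exist, which is expected at this stage:
ss -lnt | grep 16443                                               # 16443 should be listening on both masters
curl -u admin:awesomePassword 'http://127.0.0.1:1080/admin?stats'  # HAProxy statistics page (HTML)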
IV. Install Docker
Run on all nodes.
1. Add the Docker yum repository
# optional: install wget if it is not already present
yum install wget -y
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
2. Install Docker
yum install docker-ce-19.03.13 -y
3. Enable Docker at boot
systemctl enable docker.service
4. Configure a registry mirror (accelerator)
Create the daemon.json file:
[root@localhost ~]# mkdir -p /etc/docker
[root@localhost ~]# tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://v16stybc.mirror.aliyuncs.com"]
}
EOF
Restart the service:
[root@localhost ~]# systemctl daemon-reload
[root@localhost ~]# systemctl restart docker
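To verify the mirror was picked up (the exact output layout varies by Docker version):
docker info | grep -A1 -i "registry mirrors"    # should list the aliyuncs mirror configured above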
5. Useful Docker commands
Check Docker status:
systemctl status docker.service
List downloaded images:
docker images
Pull an image:
docker pull hello-world
Run an image:
[root@localhost ~]# docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
If you see this output, Docker is installed and working.
V. Install Kubernetes
1. Add the Aliyun Kubernetes yum repository
Run on all nodes.
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2. Install kubeadm, kubelet, and kubectl
Run on all nodes.
[root@localhost ~]# yum install kubelet-1.19.4 kubeadm-1.19.4 kubectl-1.19.4 -y
3. Enable kubelet at boot
Run on all nodes.
[root@localhost ~]# systemctl enable kubelet.service
4. Verify the installation
Run on all nodes.
yum list installed | grep kubelet
yum list installed | grep kubeadm
yum list installed | grep kubectl
5. Change the Docker cgroup driver
Run on all nodes.
Edit daemon.json and add "exec-opts": ["native.cgroupdriver=systemd"]:
[root@localhost ~]# vim /etc/docker/daemon.json
{
"registry-mirrors": ["https://v16stybc.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
Reload Docker:
[root@localhost ~]# systemctl daemon-reload
[root@localhost ~]# systemctl restart docker
Changing the cgroup driver removes this kubeadm init warning:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
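To confirm Docker is now using the systemd driver:
docker info | grep -i "cgroup driver"    # should print: Cgroup Driver: systemd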
6. Configure kubectl access and command completion
Run on all nodes.
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
# apply
source /etc/profile
Without this step, kubectl get nodes fails with:
[root@localhost ~]# kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?
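For the command-completion part of this step, kubectl can generate its own bash completion script (requires the bash-completion package); a minimal sketch:
yum install -y bash-completion
echo 'source <(kubectl completion bash)' >> ~/.bashrc
source ~/.bashrc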
7. Deploy the Kubernetes control plane (master)
7.1 Create the kubeadm configuration file
# check which node currently holds the VIP
[root@master01 ~]# ip a
The VIP is on master1, so perform the following steps on the master that holds the VIP (master1 here).
mkdir /usr/local/kubernetes/manifests -p
cd /usr/local/kubernetes/manifests/
vi kubeadm-config.yaml
apiServer:
  certSANs:
    - k8s-master1
    - k8s-master2
    - k8s-vip
    - 192.168.209.132
    - 192.168.209.133
    - 192.168.209.140
    - 127.0.0.1
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "k8s-vip:16443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.19.4
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.1.0.0/16
scheduler: {}
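Optionally, the control-plane images can be listed and pre-pulled with the same config before running init; these are standard kubeadm subcommands:
kubeadm config images list --config kubeadm-config.yaml   # show the images init will need
kubeadm config images pull --config kubeadm-config.yaml   # pre-pull them from the Aliyun mirror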
7.2 Run on master1
kubeadm init --config kubeadm-config.yaml --v=2
Following the prompts in the init output, set up the environment so kubectl can be used:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ kubectl get nodes
$ kubectl get pods -n kube-system
Save the join commands printed by init; they are needed shortly.
# join as a master (control-plane) node
kubeadm join 192.168.209.140:16443 --token anidea.x1tfsqqumxx1kp5a \
--discovery-token-ca-cert-hash sha256:a1429172ba9feb4516e056cb973dc5bd157e405e3376aa7ba938cbf46a4e7680 \
--control-plane --v=2
# join as a worker node
kubeadm join 192.168.209.140:16443 --token anidea.x1tfsqqumxx1kp5a \
--discovery-token-ca-cert-hash sha256:a1429172ba9feb4516e056cb973dc5bd157e405e3376aa7ba938cbf46a4e7680 --v=2
Check the cluster status:
kubectl get cs
kubectl get pods -n kube-system
8. Install the cluster network (flannel)
Run on the master node that holds the VIP.
[root@localhost ~]# vim kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.13.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.13.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
# apply the manifest
[root@localhost ~]# kubectl apply -f kube-flannel.yml
Wait a few minutes after applying, then check again with kubectl get nodes (sample output below; node names and worker entries will match your own cluster once they have joined):
[root@localhost ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 32m v1.19.4
k8s-node1 Ready <none> 27m v1.19.4
k8s-node2 Ready <none> 8m23s v1.19.4
Check:
kubectl get pods -n kube-system
9. Join master2 to the cluster
9.1 Copy certificates and related files
Copy the certificates and admin.conf from master1 to master2 (an alternative using kubeadm's certificate upload is sketched after these commands):
# ssh root@192.168.209.133 mkdir -p /etc/kubernetes/pki/etcd
# scp /etc/kubernetes/admin.conf root@192.168.209.133:/etc/kubernetes
# scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@192.168.209.133:/etc/kubernetes/pki
# scp /etc/kubernetes/pki/etcd/ca.* root@192.168.209.133:/etc/kubernetes/pki/etcd
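As an alternative to copying the pki files by hand, kubeadm can distribute the control-plane certificates itself via a one-time certificate key; this is a standard kubeadm feature, and the placeholders below must be replaced with your own token, hash, and key:
# on master1: re-upload the control-plane certs and print a certificate key
kubeadm init phase upload-certs --upload-certs
# on master2: join with the printed key instead of scp-ing the pki files
# kubeadm join 192.168.209.140:16443 --token <token> \
#   --discovery-token-ca-cert-hash sha256:<hash> \
#   --control-plane --certificate-key <key-from-above>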
9.2 Join master2
Run the join command printed by kubeadm init on master1; the `--control-plane` flag marks this node as an additional control-plane (master) node.
kubeadm join 192.168.209.140:16443 --token anidea.x1tfsqqumxx1kp5a \
--discovery-token-ca-cert-hash sha256:a1429172ba9feb4516e056cb973dc5bd157e405e3376aa7ba938cbf46a4e7680 \
--control-plane --v=2
Check the status:
kubectl get node
kubectl get pods --all-namespaces
10. Join the worker node
Run on the worker node (k8s-node).
To add a new node to the cluster, run the kubeadm join command printed by kubeadm init:
kubeadm join 192.168.209.140:16443 --token anidea.x1tfsqqumxx1kp5a --discovery-token-ca-cert-hash sha256:a1429172ba9feb4516e056cb973dc5bd157e405e3376aa7ba938cbf46a4e7680 --v=2
If the join fails with an error like the following:
I0210 16:55:53.000952 80891 token.go:215] [discovery] Failed to request cluster-info, will try again: Get "https://192.168.209.140:16443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": dial tcp 192.168.209.140:16443: connect: connection refused
reboot the node and run the join command again.
11. Reinstall the cluster network (because new nodes were added)
kubectl delete -f kube-flannel.yml
kubectl apply -f kube-flannel.yml
Check the status:
kubectl get node
kubectl get pods --all-namespaces
12. Test the Kubernetes cluster
With the cluster up, pull an nginx image to test it.
# create a deployment
[root@localhost ~]# kubectl create deployment nginx --image=nginx
# expose the port
[root@localhost ~]# kubectl expose deployment nginx --port=80 --target-port=80 --type=NodePort
# check the assigned port
kubectl get pod,svc    # or: kubectl get service
[root@localhost ~]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 33m
nginx NodePort 10.104.117.63 <none> 80:32297/TCP 9s
32297 is the externally reachable NodePort.
Access URL: http://NodeIP:Port
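A quick reachability test from any machine that can reach the nodes (32297 comes from the example output above and will differ in your cluster):
curl http://192.168.209.132:32297    # should return the default nginx welcome page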