K8S High-Availability Cluster Deployment

1. Environment Preparation

Cluster plan: three etcd servers, three master servers, and a number of worker nodes.

Configuration details:

Node configuration table (CPU core count and memory size were placeholders in the original and are kept as x / xG):

Role          Hostname                   IP           OS        CPU cores  Memory
master, etcd  k8smaster01.xxx.xxx.com    10.66.10.11  CentOS 7  x          xG
master, etcd  k8smaster02.xxx.xxx.com    10.66.10.12  CentOS 7  x          xG
master, etcd  k8smaster03.xxx.xxx.com    10.66.10.13  CentOS 7  x          xG
node          k8snode01.xxx.xxx.com      10.66.10.21  CentOS 7  x          xG
node          k8snode02.xxx.xxx.com      10.66.10.22  CentOS 7  x          xG
node          k8snode03.xxx.xxx.com      10.66.10.23  CentOS 7  x          xG

Load balancer: 10.66.10.10, forwarding to the three masters (in a private environment you can use keepalived instead; a sketch is shown below).
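If keepalived is used to float the 10.66.10.10 VIP across the masters, a minimal /etc/keepalived/keepalived.conf might look roughly like the following sketch; the interface name eth0, the virtual_router_id and the priorities are assumptions, only the VIP comes from the table above:

! keepalived VRRP instance holding the Kubernetes API VIP
vrrp_instance K8S_API_VIP {
    state MASTER                 # use BACKUP on the other two masters
    interface eth0               # adjust to the actual NIC name
    virtual_router_id 51
    priority 100                 # lower values (e.g. 90, 80) on the backups
    advert_int 1
    virtual_ipaddress {
        10.66.10.10
    }
}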

Harbor image registry: 10.66.10.09

Note: this walkthrough uses Alibaba Cloud servers as the example; for convenience, the masters and etcd are co-located on the same machines.

If etcd runs on separate servers in your environment, adjust the configuration used below accordingly.

If you do not have a private image registry, download the required images to every node in advance and modify the configuration files below to match.

 

 

2. Basic System Configuration (all nodes)

Edit the /etc/hosts file and add the node entries from the configuration table:

10.66.10.11 k8smaster01.xxx.xxx.com
10.66.10.12 k8smaster02.xxx.xxx.com
10.66.10.13 k8smaster03.xxx.xxx.com
10.66.10.10 k8svip.xxx.xxx.com

10.66.10.21 k8snode01.xxx.xxx.com
10.66.10.22 k8snode02.xxx.xxx.com
10.66.10.23 k8snode03.xxx.xxx.com

10.66.10.09 hub.xxx.xxx.com

 

Disable the firewall:

systemctl stop firewalld && systemctl disable firewalld

 

Disable SELinux

Edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled (takes effect after reboot), and/or run setenforce 0 to switch to permissive mode immediately.
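Both steps can be combined into one line, for example:

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config && setenforce 0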

 

Disable swap

echo "swapoff -a" >> /etc/rc.local
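The line above only turns swap off at boot time; disable it on the running system as well, and optionally comment out the swap entry in /etc/fstab (the sed pattern below is a common convention, adjust it if your fstab differs):

swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab   # optional: persist the change via fstab instead of rc.local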

Modify the /etc/sysctl.conf file:

sed -i '/ip_forward/ s/0/1/g' /etc/sysctl.conf
sed -i '/tables/ s/0/1/g' /etc/sysctl.conf
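The intent of the two sed commands is to end up with the following entries set to 1 (add them manually if they are not already present in your sysctl.conf), then load them; note that the bridge settings require the br_netfilter kernel module:

net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1

modprobe br_netfilter && sysctl -p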

 

3. Component Configuration (all nodes)

Install Docker

Docker version 17.06.2-ce (installation steps omitted; installing via yum is sufficient).

 

Add the Kubernetes yum repository

Run:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.cloud.aliyuncs.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.cloud.aliyuncs.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.cloud.aliyuncs.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Install the Kubernetes deployment components

yum install kubeadm-1.10.5-0 kubectl-1.10.5 kubelet-1.10.5 kubernetes-cni socat -y

 

Modify the kubelet configuration file

vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

Add --pod-infra-container-image=hub.xxx.xxx.com/library/pause-amd64:3.1 to KUBELET_SYSTEM_PODS_ARGS (skip this if you have no private registry).

Check the --cgroup-driver value in KUBELET_CGROUP_ARGS; it defaults to systemd. Verify that it matches Docker's cgroup driver (run docker info | grep Cgroup); if they differ, change it to --cgroup-driver=cgroupfs.

Add --authentication-token-webhook; this enables webhook token authentication on the kubelet and is needed later for Prometheus to authenticate when scraping kubelet metrics.

Add --enable-controller-attach-detach=false; this is required for the Alibaba Cloud OSS FlexVolume plugin.

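For reference, after the edits above the relevant lines of 10-kubeadm.conf might look roughly like the following sketch (it assumes a private registry at hub.xxx.xxx.com and the cgroupfs driver; --pod-manifest-path and --allow-privileged are the defaults shipped with this kubeadm version, and the rest of the file is left untouched):

Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --pod-infra-container-image=hub.xxx.xxx.com/library/pause-amd64:3.1"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
Environment="KUBELET_EXTRA_ARGS=--authentication-token-webhook --enable-controller-attach-detach=false"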

Then run:

systemctl daemon-reload && systemctl enable kubelet && systemctl start kubelet

 

4. Deploy the etcd Cluster (all master nodes)

This example installs etcd via yum: run yum install etcd -y

Generate the etcd certificates, using cfssl as the example tool. Run the following:

cd /usr/local/bin

wget -O cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget -O cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget -O cfssl-certinfo https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64

chmod +x /usr/local/bin/cfssl*

The cfssl commands can now be used to generate the certificates.

mkdir -p /data/ssl

cd /data/ssl

cfssl print-defaults config > ca-config.json
cfssl print-defaults csr > ca-csr.json

 

Edit ca-config.json (the certificate expiry can be adjusted as needed); a sketch of typical content is shown below.
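A typical ca-config.json for this purpose looks roughly like the following; the kubernetes profile name must match the -profile flag used further down, and the 87600h expiry is an assumed example:

{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "87600h"
      }
    }
  }
}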

Edit ca-csr.json; a sketch is shown below.
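A minimal ca-csr.json sketch (the CN and the names block are assumed examples and can be adapted):

{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "O": "k8s",
      "OU": "System"
    }
  ]
}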

 

Then run:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca

Edit kubernetes-csr.json as follows.

Note: the hosts field must list the addresses of your own etcd/master nodes; a sketch based on the configuration table above is shown below.
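A kubernetes-csr.json sketch using this guide's addresses (the names block is an assumed example; extend the hosts list if the certificate will also be used by other components):

{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "10.66.10.11",
    "10.66.10.12",
    "10.66.10.13",
    "k8smaster01.xxx.xxx.com",
    "k8smaster02.xxx.xxx.com",
    "k8smaster03.xxx.xxx.com"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "O": "k8s",
      "OU": "System"
    }
  ]
}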

Then run:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes

At this point certificate generation is complete. Create a /data/ssl directory on the other etcd nodes and copy the *.pem files from this directory to /data/ssl on the other masters.

 

Edit /etc/etcd/etcd.conf as follows:

ETCD_DATA_DIR="/data/apps/etcd"
ETCD_LISTEN_PEER_URLS="https://10.66.10.11:2380"

ETCD_LISTEN_CLIENT_URLS="https://10.66.10.11:2379"

ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.66.10.11:2380"

ETCD_ADVERTISE_CLIENT_URLS="https://10.66.10.11:2379"

ETCD_NAME="etcd01"
ETCD_INITIAL_CLUSTER="etcd01=https://10.66.10.11:2380,etcd02=https://10.66.10.12:2380,etcd03=https://10.66.10.13:2380"
ETCD_INITIAL_CLUSTER_STATE="new"

ETCD_CERT_FILE="/data/ssl/kubernetes.pem"
ETCD_KEY_FILE="/data/ssl/kubernetes-key.pem"
ETCD_TRUSTED_CA_FILE="/data/ssl/ca.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_PEER_CERT_FILE="/data/ssl/kubernetes.pem"
ETCD_PEER_KEY_FILE="/data/ssl/kubernetes-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/data/ssl/ca.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"

ETCD_AUTO_TLS="true"
ETCD_PEER_AUTO_TLS="true"

ETCD_DEBUG="true"
ETCD_LOG_PACKAGE_LEVELS="etcdserver=WARNING,security=DEBUG"

Note: the etcd parameters must be adjusted to each node's own values; in particular ETCD_NAME, ETCD_LISTEN_PEER_URLS, ETCD_LISTEN_CLIENT_URLS, ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_ADVERTISE_CLIENT_URLS change from node to node.
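For example, following the naming above, on the second node (10.66.10.12) those lines become:

ETCD_NAME="etcd02"
ETCD_LISTEN_PEER_URLS="https://10.66.10.12:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.66.10.12:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.66.10.12:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.66.10.12:2379"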

 


Then run:

mkdir -p /data/apps/etcd
chown etcd:etcd -R /data/apps/etcd
chmod 755 /data/ssl/*.pem

Finally, restart the etcd service:

systemctl restart etcd 

Repeat the steps above on the other etcd nodes (skipping the certificate generation).

To verify that the cluster started correctly, run:

etcdctl --ca-file=/data/ssl/ca.pem --cert-file=/data/ssl/kubernetes.pem --key-file=/data/ssl/kubernetes-key.pem --endpoints=https://10.66.10.11:2379,https://10.66.10.12:2379,https://10.66.10.13:2379 cluster-health

 

5. Deploy the Masters

Start with a single master server; the following uses 10.66.10.11 as the example.

Create the kubeadm-config.yaml file as follows:

apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
imageRepository: "hub.xxx.xxx.com/library"
featureGates:
    CoreDNS: true
networking:
    podSubnet: "10.244.0.0/16"
kubernetesVersion: "v1.10.5"
apiServerCertSANs:
  - "k8svip.xxx.xxx.com"
  - "k8smaster01.xxx.xxx.com"
  - "k8smaster02.xxx.xxx.com"
  - "k8smaster03.xxx.xxx.com"
  - "10.66.10.11"
  - "10.66.10.12"
  - "10.66.10.13"
  - "10.66.10.10"
api:
    controlPlaneEndpoint: "10.66.10.10:6443"
etcd:
    endpoints:
      - https://10.66.10.11:2379
      - https://10.66.10.12:2379
      - https://10.66.10.13:2379
    caFile: /data/ssl/ca.pem
    certFile: /data/ssl/kubernetes.pem
    keyFile: /data/ssl/kubernetes-key.pem
    dataDir: /data/apps/etcd

Note: if you have no private registry, omit the imageRepository parameter; adjust the apiServerCertSANs and etcd endpoints to match your own cluster's addresses.

kubeadm init --config kubeadm-config.yaml

The output of kubeadm init ends with a kubeadm join command containing the bootstrap token; record it, since it is needed when joining the worker nodes in section 6.

 

At this point the root user cannot yet use kubectl to manage the cluster; the KUBECONFIG environment variable must be set first.

Run:

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile    &&   source ~/.bash_profile  

Verify with kubectl that the pods are starting properly; run:

kubectl get pod --all-namespaces

Until flannel is deployed, it is normal for coredns not to start.

Next, create kube-flannel.yaml with the following content:

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "type": "flannel",
      "delegate": {
        "isDefaultGateway": true
      }
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/arch: amd64
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: hub.xxx.xxx.com/library/flannel:v0.9.1-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conf
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image:  hub.xxx.xxx.com/library/flannel:v0.9.1-amd64
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr" ]
        securityContext:
          privileged: true
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg

Run kubectl apply -f kube-flannel.yaml

Check the pod status again (roughly a minute after running the command, the pods should all be Running).

 

Note: the following images are required:

gcr.io/google_containers/kube-apiserver-amd64:v1.10.5

gcr.io/google_containers/kube-proxy-amd64:v1.10.5

gcr.io/google_containers/kube-scheduler-amd64:v1.10.5

gcr.io/google_containers/kube-controller-manager-amd64:v1.10.5

k8s.gcr.io/pause-amd64:3.1

quay.io/coreos/flannel:v0.9.1-amd64

coredns/coredns:v1.0.6 (there is no configuration parameter for this image, so when a private registry is used the master will not pull it from that registry automatically; pull it manually and re-tag it as coredns/coredns:v1.0.6, as shown below)
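For example, assuming the image has been pushed to the private registry under library/coredns (a hypothetical path), something like:

docker pull hub.xxx.xxx.com/library/coredns:v1.0.6
docker tag hub.xxx.xxx.com/library/coredns:v1.0.6 coredns/coredns:v1.0.6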

 

At this point the master on 10.66.10.11 is fully deployed.

Copy all files under /etc/kubernetes/pki/ on this machine to the same directory on the other two masters; a sketch is given below.
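A minimal scp sketch, assuming root SSH access between the masters:

ssh root@10.66.10.12 "mkdir -p /etc/kubernetes/pki"
scp -r /etc/kubernetes/pki/* root@10.66.10.12:/etc/kubernetes/pki/
ssh root@10.66.10.13 "mkdir -p /etc/kubernetes/pki"
scp -r /etc/kubernetes/pki/* root@10.66.10.13:/etc/kubernetes/pki/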

Then repeat the same kubeadm steps on the other masters.

The master cluster is now deployed.

 

6. Deploy the Worker Nodes

Using the token generated earlier on any of the three masters, run:

kubeadm join <master address> <token options printed by kubeadm init>

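A sketch of the typical form of this command for the setup in this guide (the token and CA-cert hash are the values printed by kubeadm init; 10.66.10.10:6443 is the load-balanced API endpoint assumed above):

kubeadm join 10.66.10.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>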

Once it finishes, run the same command on the remaining nodes.
