Preface

Goal: make a two-machine Kubernetes setup as highly available as possible (the cluster should keep working if one machine goes down).
The official stacked and external-etcd topologies both require at least three machines, and with the external etcd topology, kubeadm init --config config.yaml cannot be run on a node where etcd is already deployed.

Related discussion:

https://github.com/kubernetes/kubeadm/issues/1107

The plan: install etcd from the binary release on both machines (without forming a cluster), share one set of SSL certificates, and make both nodes masters. Right after installation, use etcd1 and periodically sync etcd1's data to etcd2; when node 1 goes down, node 2 switches over to etcd2, and after recovery…

This approach only achieves best-effort high availability, and it needs supporting scripts for backup, health checks, and so on. There are many details to get right, such as verifying backup files and deciding what to do when a backup turns out corrupt. There is no way around it: the real-world requirement is high availability with only two nodes. A sketch of such a sync script follows.
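
One way to implement the periodic sync with backup verification is etcdctl snapshots. A minimal sketch, assuming the certificate paths and endpoint configured later in this note; the backup directory, retention count, and the script itself are illustrative assumptions, not a tested implementation:

#!/usr/bin/env bash
# Hypothetical periodic backup script, run on node 2 (141) via cron.
set -euo pipefail

ENDPOINT=https://192.168.44.140:2379
BACKUP_DIR=/data/paas/etcd-backup            # assumed location
SNAP=${BACKUP_DIR}/snapshot-$(date +%Y%m%d%H%M%S).db

mkdir -p "${BACKUP_DIR}"

# take a snapshot of the active etcd (etcd1 on 140)
etcdctl --cacert=/etc/kubernetes/pki/ca.crt \
        --cert=/etc/etcd/pki/etcd_client.crt \
        --key=/etc/etcd/pki/etcd_client.key \
        --endpoints=${ENDPOINT} snapshot save "${SNAP}"

# verify the snapshot; discard it if the integrity check fails
if ! etcdutl snapshot status "${SNAP}" >/dev/null 2>&1; then
    echo "snapshot ${SNAP} failed verification, discarding" >&2
    rm -f "${SNAP}"
    exit 1
fi

# keep only the 24 most recent snapshots
ls -1t "${BACKUP_DIR}"/snapshot-*.db | tail -n +25 | xargs -r rm -f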

Installing etcd

etcd install script
ETCD_VER=v3.5.13

# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://mirror.ghproxy.com/https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}

rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test

curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd-download-test --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz

/tmp/etcd-download-test/etcd --version
/tmp/etcd-download-test/etcdctl version
/tmp/etcd-download-test/etcdutl version
Move etcd, etcdctl, and etcdutl to /usr/bin:
mv /tmp/etcd-download-test/etcd /usr/bin/
mv /tmp/etcd-download-test/etcdctl /usr/bin/
mv /tmp/etcd-download-test/etcdutl /usr/bin/
Deploy etcd as a systemd service:
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
EnvironmentFile=/etc/etcd/etcd.conf
ExecStart=/usr/bin/etcd
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
# enable at boot
systemctl enable etcd
Configure the etcd certificates

The official docs recommend cfssl, but our OS ships with openssl by default, so openssl is used here.

Create the root certificate

Since our test environment has no unified CA, we use self-signed certificates for the security configuration. etcd and Kubernetes both derive their own certificates from this root certificate, which acts as the issuing authority.

mkdir -p  /etc/kubernetes/pki

cd  /etc/kubernetes/pki
openssl genrsa -out ca.key 2048

openssl req -x509 -new -nodes -key ca.key -subj "/CN=192.168.44.140" -days 36500 -out ca.crt
Create the etcd server certificate

All certificates are created on master01 and then copied to the corresponding directories on master02.

mkdir -p /root/etcd/
cd  /root/etcd/

cat > etcd_ssl.cnf <<EOF
[req]
req_extensions=v3_req
distinguished_name=req_distinguished_name
[req_distinguished_name]
[v3_req]
basicConstraints=CA:FALSE
keyUsage=nonRepudiation, digitalSignature, keyEncipherment
subjectAltName=@alt_names
[alt_names]
IP.1=192.168.44.140
IP.2=192.168.44.141
EOF

Now create the etcd server certificate from the config file above:

cd /root/etcd/
openssl genrsa -out etcd_server.key 2048
openssl req -new -key etcd_server.key -config etcd_ssl.cnf -subj "/CN=etcd-server" -out etcd_server.csr

openssl x509 -req -in etcd_server.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_server.crt

mkdir -p /etc/etcd/pki
cp etcd_server.crt /etc/etcd/pki/
cp etcd_server.key /etc/etcd/pki/
Create the etcd client certificate

All certificates are created on master01 and then copied to the corresponding directories on master02.

This one is mainly for kube-apiserver to connect to etcd later.

openssl genrsa -out etcd_client.key 2048
openssl req -new -key etcd_client.key -config etcd_ssl.cnf -subj "/CN=etcd-client" -out etcd_client.csr
openssl x509 -req -in etcd_client.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_client.crt
cp etcd_client.key /etc/etcd/pki/
cp etcd_client.crt /etc/etcd/pki/
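A quick sanity check that both certificates chain back to the root CA:

openssl verify -CAfile /etc/kubernetes/pki/ca.crt /etc/etcd/pki/etcd_server.crt /etc/etcd/pki/etcd_client.crt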
etcd configuration files
  1. /etc/etcd/etcd.conf on 192.168.44.140
ETCD_NAME=etcd1
ETCD_DATA_DIR=/data/paas/etcd

ETCD_CERT_FILE=/etc/etcd/pki/etcd_server.crt
ETCD_KEY_FILE=/etc/etcd/pki/etcd_server.key
ETCD_TRUSTED_CA_FILE=/etc/kubernetes/pki/ca.crt
ETCD_CLIENT_CERT_AUTH=true
ETCD_LISTEN_CLIENT_URLS=https://192.168.44.140:2379
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.44.140:2379
ETCD_PEER_CERT_FILE=/etc/etcd/pki/etcd_server.crt
ETCD_PEER_KEY_FILE=/etc/etcd/pki/etcd_server.key
ETCD_PEER_TRUSTED_CA_FILE=/etc/kubernetes/pki/ca.crt
ETCD_LISTEN_PEER_URLS=https://192.168.44.140:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.44.140:2380

ETCD_INITIAL_CLUSTER_TOKEN=etcd-single-1
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.44.140:2380"
ETCD_INITIAL_CLUSTER_STATE=new
systemctl restart etcd 
  2. /etc/etcd/etcd.conf on 192.168.44.141
# copy the certificates from 140 first (-r is required for directories)
mkdir -p /etc/kubernetes
scp -r root@192.168.44.140:/etc/kubernetes/pki /etc/kubernetes/
mkdir -p /etc/etcd
scp -r root@192.168.44.140:/etc/etcd/pki /etc/etcd/
ETCD_NAME=etcd2
ETCD_DATA_DIR=/data/paas/etcd

ETCD_CERT_FILE=/etc/etcd/pki/etcd_server.crt
ETCD_KEY_FILE=/etc/etcd/pki/etcd_server.key
ETCD_TRUSTED_CA_FILE=/etc/kubernetes/pki/ca.crt
ETCD_CLIENT_CERT_AUTH=true
ETCD_LISTEN_CLIENT_URLS=https://192.168.44.141:2379
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.44.141:2379
ETCD_PEER_CERT_FILE=/etc/etcd/pki/etcd_server.crt
ETCD_PEER_KEY_FILE=/etc/etcd/pki/etcd_server.key
ETCD_PEER_TRUSTED_CA_FILE=/etc/kubernetes/pki/ca.crt
ETCD_LISTEN_PEER_URLS=https://192.168.44.141:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.44.141:2380

ETCD_INITIAL_CLUSTER_TOKEN=etcd-single-2
ETCD_INITIAL_CLUSTER="etcd2=https://192.168.44.141:2380"
ETCD_INITIAL_CLUSTER_STATE=new
systemctl restart etcd
Check endpoint health
etcdctl --cacert=/etc/kubernetes/pki/ca.crt --cert=/etc/etcd/pki/etcd_client.crt --key=/etc/etcd/pki/etcd_client.key --endpoints=https://192.168.44.140:2379,https://192.168.44.141:2379 endpoint health
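
Each instance is an independent single-member "cluster", so each endpoint should report healthy on its own. To list the (single) member of one instance:

etcdctl --cacert=/etc/kubernetes/pki/ca.crt --cert=/etc/etcd/pki/etcd_client.crt --key=/etc/etcd/pki/etcd_client.key --endpoints=https://192.168.44.140:2379 member list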

Preparing the dual-master Kubernetes deployment

Configure the VIP with a kube-vip static pod
# load the image locally
docker load -i ......

export KVVERSION=v0.6.3

alias kube-vip="docker run --network host --rm ghcr.io/kube-vip/kube-vip:$KVVERSION"

export VIP=192.168.44.100

export INTERFACE=ens33

mkdir -p /etc/kubernetes/manifests


kube-vip manifest pod \
    --interface $INTERFACE \
    --address $VIP \
    --controlplane \
    --services \
    --arp \
    --leaderElection | tee /etc/kubernetes/manifests/kube-vip.yaml

You can change kube-vip's imagePullPolicy in the generated manifest from Always to IfNotPresent so the locally loaded image is used:

image: ghcr.io/kube-vip/kube-vip:v0.6.3
imagePullPolicy: IfNotPresent  # changed from Always
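
The same edit as a one-liner (assuming the manifest path generated above):

sed -i 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/' /etc/kubernetes/manifests/kube-vip.yaml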


kubeadm init config
vim external-etcd-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.44.140
  bindPort: 6443
nodeRegistration: 
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent 
  taints: null
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  timeoutForControlPlane: 4m0s
kubernetesVersion: 1.27.3
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
controlPlaneEndpoint: "192.168.44.100:6443"
etcd:
  external:
    endpoints:
      - https://192.168.44.140:2379
    caFile: /etc/kubernetes/pki/ca.crt
    certFile: /etc/etcd/pki/etcd_client.crt
    keyFile: /etc/etcd/pki/etcd_client.key
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
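Recent kubeadm releases (v1.26 and later, which includes the v1.27.3 used here) can sanity-check the file before running init:

kubeadm config validate --config ./external-etcd-config.yaml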
kubeadm init
kubeadm init --config ./external-etcd-config.yaml --upload-certs  --v=5
kubeadm join
kubeadm join 192.168.44.100:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:c5d4a6fbfa6b1ce1fe21e9bd5c1516e04b1735cd1d1a033720d4d8a783e9287e \
--control-plane --certificate-key e72e1d3b1986a1ee2a404e3e07e7602e858f4c84767d0fb655a6754c990303f7 \
--cri-socket unix:///var/run/cri-dockerd.sock \
--v=5
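
The token and the uploaded certificates expire (24h and 2h by default, respectively), so if a control-plane node joins later, fresh values can be generated on the first master:

# print a fresh join command with a new token
kubeadm token create --print-join-command
# re-upload the control-plane certs and print a new --certificate-key
kubeadm init phase upload-certs --upload-certs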
Install a network plugin (steps skipped here)

I went with Antrea.

Stop etcd on 141 and check whether the apiserver is still available
systemctl stop etcd
kubectl get pod -A

Test passed.

Deploy nginx as a test
  1. Import the image

  2. kubectl create namespace my-namespace

  3. Create nginxConfig.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  generation: 2
  labels:
    app: nginx1
  name: nginx1
  namespace: my-namespace
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx1
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      containers:
      - image: nginx:stable-perl
        imagePullPolicy: IfNotPresent
        name: nginx1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
        # allow scheduling on control-plane (master) nodes
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
  4. kubectl apply -f nginxConfig.yaml

  5. Expose the port

    vim nodeport.yaml
    
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
      namespace: my-namespace
    spec:
      selector:
        app: nginx1  # must match the Pod labels in the Deployment
      ports:
        - nodePort: 18033   # host port (see the range note below)
          protocol: TCP
          port: 80          # Service port
          targetPort: 80    # container port of the nginx Pods
      type: NodePort  # NodePort is used here; choose the Service type that fits your needs
    

    kubectl apply -f nodeport.yaml

  6. Browse to 192.168.44.140 (or the VIP 192.168.44.100) on port 18033

    Works.
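
Note: nodePort 18033 is outside the default NodePort range (30000-32767), so the apiserver will reject this Service unless the range has been extended. A sketch of the flag, added under the command: list in /etc/kubernetes/manifests/kube-apiserver.yaml (the exact range value is an example):

    - --service-node-port-range=18000-32767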

Testing the etcd switchover after one instance goes down

  1. Sync 140's data to 141

rsync -av root@192.168.44.140:/data/paas/etcd/ /data/paas/etcd/
  2. Stop etcd on 140
systemctl stop etcd

# the apiserver is now unavailable
kubectl get pod -A
  3. Edit the kube-apiserver static pod manifest
vim /etc/kubernetes/manifests/kube-apiserver.yaml
    # change - --etcd-servers=https://192.168.44.140:2379 to https://192.168.44.141:2379
  4. Restart kubelet
systemctl restart kubelet
  5. Check that the apiserver is available again
kubectl get pod -A

Works.
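
The manual switchover above is exactly what an external watchdog script could automate. A rough sketch for node 2 (141); the script, its retry threshold, and sleep interval are hypothetical, while the paths and endpoints match this note:

#!/usr/bin/env bash
# Hypothetical failover watchdog: if etcd1 on 140 stays unhealthy,
# start the local etcd2 and repoint the apiserver at it.
set -euo pipefail

PRIMARY=https://192.168.44.140:2379
LOCAL=https://192.168.44.141:2379
MANIFEST=/etc/kubernetes/manifests/kube-apiserver.yaml

healthy() {
    etcdctl --cacert=/etc/kubernetes/pki/ca.crt \
            --cert=/etc/etcd/pki/etcd_client.crt \
            --key=/etc/etcd/pki/etcd_client.key \
            --endpoints="$1" endpoint health >/dev/null 2>&1
}

# require three consecutive failures before failing over
for _ in 1 2 3; do
    healthy "${PRIMARY}" && exit 0
    sleep 5
done

systemctl start etcd                     # bring up the local etcd2
sed -i "s#--etcd-servers=${PRIMARY}#--etcd-servers=${LOCAL}#" "${MANIFEST}"
systemctl restart kubelet                # kubelet recreates the apiserver pod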

Afterword

After a series of tests (only a few are shown here), this approach proves workable. But with only two nodes it still falls short of a three-master stacked or external-etcd topology (see my earlier notes for those setups; they need at least three nodes), and it depends on a set of supporting external scripts.
