概述

公司因项目需要使用K8S部署ETCD集群供其他业务调用,网上搜索了解了下,一般K8S搭建ETCD集群大部分都是使用Etcd Operator来搭建。但是公司的项目运行在离线ARM架构平台,直接使用网上Etcd Operator代码,他们提供的镜像都是x86_64架构,经过Opeartor编译等尝试,最后都以失败告终。最后在Github上面找到一位大佬的开源代码,经过一通折腾,总算成功部署上了,故而博文记录,用于备忘

测试环境

minikube
Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.3

yaml配置

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: etcdsrv
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
    - name: etcd-client-port
      protocol: TCP
      port: 2379
      targetPort: 2379
  selector:
    name: etcd-operator

cluster.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd0
spec:
  replicas: 1
  selector:
    matchLabels:
      name: etcd-operator
  serviceName: etcdsrv
  template:
    metadata:
      labels:
        name: etcd-operator
    spec:
      containers:
      - name: app
        image: quay.io/coreos/etcd:v3.5.9
        imagePullPolicy: Always
        volumeMounts:
        - mountPath: /data/etcd_data
          name: etcd-volume
        command:
          - /usr/local/bin/etcd
          - --data-dir
          - /data/etcd_data
          - --auto-compaction-retention
          - '1'
          - --quota-backend-bytes
          - '8589934592'
          - --listen-client-urls
          - http://0.0.0.0:2379
          - --advertise-client-urls
          - http://etcd0-0.etcdsrv:2379
          - --listen-peer-urls
          - http://0.0.0.0:2380
          - --initial-advertise-peer-urls
          - http://etcd0-0.etcdsrv:2380
          - --initial-cluster-token
          - etcd-cluster
          - --initial-cluster
          - etcd0=http://etcd0-0.etcdsrv:2380,etcd1=http://etcd1-0.etcdsrv:2380,etcd2=http://etcd2-0.etcdsrv:2380
          - --initial-cluster-state
          - new
          - --enable-pprof
          - --election-timeout
          - '5000'
          - --heartbeat-interval
          - '250'
          - --name
          - etcd0
          - --logger
          - zap
      volumes:
      - name: etcd-volume
        hostPath:
          path: /var/tmp/etcd0
          type: Directory
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd1
spec:
  replicas: 1
  selector:
    matchLabels:
      name: etcd-operator
  serviceName: etcdsrv #该名字需要和Service的name字段一致
  template:
    metadata:
      labels:
        name: etcd-operator
    spec:
      containers:
        - name: app
          image: quay.io/coreos/etcd:v3.5.9
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: /data/etcd_data #该目录名称注意要对应修改
              name: etcd-volume
          command:
            - /usr/local/bin/etcd
            - --data-dir
            - /data/etcd_data 
            - --auto-compaction-retention
            - '1'
            - --quota-backend-bytes
            - '8589934592'
            - --listen-client-urls
            - http://0.0.0.0:2379
            - --advertise-client-urls
            - http://etcd1-0.etcdsrv:2379
            - --listen-peer-urls
            - http://0.0.0.0:2380
            - --initial-advertise-peer-urls
            - http://etcd1-0.etcdsrv:2380
            - --initial-cluster-token
            - etcd-cluster
            - --initial-cluster
            - etcd0=http://etcd0-0.etcdsrv:2380,etcd1=http://etcd1-0.etcdsrv:2380,etcd2=http://etcd2-0.etcdsrv:2380
            - --initial-cluster-state
            - new
            - --enable-pprof
            - --election-timeout
            - '5000'
            - --heartbeat-interval
            - '250'
            - --name
            - etcd1
            - --logger
            - zap
      volumes:
        - name: etcd-volume
          hostPath:
            path: /var/tmp/etcd1
            type: Directory
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd2
spec:
  replicas: 1
  selector:
    matchLabels:
      name: etcd-operator
  serviceName: etcdsrv
  template:
    metadata:
      labels:
        name: etcd-operator
    spec:
      containers:
        - name: app
          image: quay.io/coreos/etcd:v3.5.9
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: /data/etcd_data
              name: etcd-volume
          command:
            - /usr/local/bin/etcd
            - --data-dir
            - /data/etcd_data
            - --auto-compaction-retention
            - '1'
            - --quota-backend-bytes
            - '8589934592'
            - --listen-client-urls
            - http://0.0.0.0:2379
            - --advertise-client-urls
            - http://etcd2-0.etcdsrv:2379
            - --listen-peer-urls
            - http://0.0.0.0:2380
            - --initial-advertise-peer-urls
            - http://etcd2-0.etcdsrv:2380
            - --initial-cluster-token
            - etcd-cluster
            - --initial-cluster
            - etcd0=http://etcd0-0.etcdsrv:2380,etcd1=http://etcd1-0.etcdsrv:2380,etcd2=http://etcd2-0.etcdsrv:2380
            - --initial-cluster-state
            - new
            - --enable-pprof
            - --election-timeout
            - '5000'
            - --heartbeat-interval
            - '250'
            - --name
            - etcd2
            - --logger
            - zap
      volumes:
        - name: etcd-volume
          hostPath:
            path: /var/tmp/etcd2
            type: Directory
            

Q&A

Q: 第一次因为不熟悉,在大佬的基础上修改了service.yaml的name字段,然后部署后,各个Pod找不到对应主机导致集群失败
A:
Statefulset的serviceName要和service.yaml文件中的name字段一致才会生成对应规则的pod域名。Statefulset Pod域名的规则如下

$(statefulset-name)-$(ordinal-index).$(service-name).$(namespace).svc.cluster.local

按照规则域名,要访问etcd0的Pod,全域名需要使用
etcd0-0.etcdsrv.default.svc.cluster.local

因为k8s内部的/etc/resolv.conf一般会包含对应的search域,所以同一命名空间的访问域名可以简写成"etcd0-0.etcdsrv";不同命名空间的需要带上对应的命名空间,也就是"etcd0-0.etcdsrv.命名空间"

下面是其中一个Pod的/etc/resolv.conf

#default.svc.cluster.local中就是default就是当前命名空间,其他命名空间同理
root@ubuntutest:/# cat /etc/resolv.conf 
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local localdomain
options ndots:5
root@ubuntutest:/# 

Q: 部署好服务后,执行kubctl get pod 发现看不到对应的服务信息
A: 因为kubectl get pod默认是获取default命名空间的服务信息,因为在实际部署的时候,yaml文件里面指定了命名空间,所以执行kubectl的时候,需要带上命名空间
所以正确的命令应该是

[root@192 ~]#kubectl -n 命名空间 get pod

Q: 修改了cluster.yaml文件的数据存储路径,然后导致容器启动失败,启动失败的原因是找不到目录
A: 因为我是使用minikube 部署的,minikube是启动了一个k8s的容器做节点,部署时候的路径映射,是映射minikube 容器里的路径,而不是物理机,故而需要执行下面操作

minikube ssh #进入Minikube容器
mkdir -p /{etcd0,etcd1,etcd2} #创建目录

Q: minikube start 因为镜像拉取超时导致启动失败
A: 对于国内用户,执行minikube start命令的时候,加上–image-mirror-country=‘cn’,不然默认会拉取国外的仓库,从而导致启动失败,完整命令如下

minikube start --image-mirror-country='cn'

参考链接

k8s-club/etcd-operator
k8s-StatefulSet
minikube拉取镜像失败、安装慢、无法启动apiserver,一条命令解决

Logo

K8S/Kubernetes社区为您提供最前沿的新闻资讯和知识内容

更多推荐