What is etcd?

etcd is a distributed, consistent key-value store used for shared configuration and service discovery.

  • Simple: a well-defined, user-facing API (gRPC)

  • Secure: automatic TLS with optional client certificate authentication

  • Fast: benchmarked at 10,000 writes per second

  • Reliable: uses the Raft algorithm to keep the distributed system's data available and consistent
    Source: etcd_github, etcd_io

Using etcd

The etcd project ships a command-line client, etcdctl, which can be used to interact with etcd directly.

etcdctl usage

WARNING:
   Environment variable ETCDCTL_API is not set; defaults to etcdctl v2.
   Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API.

#  k8s records its data through etcd's v3 API, while etcdctl defaults to the v2 API and therefore cannot see the v3 data. Setting the environment variable ETCDCTL_API=3 fixes this
# set the environment variable: export ETCDCTL_API=3


USAGE:
   etcdctl [global options] command [command options] [arguments...]
   
VERSION:
   3.2.18
   
COMMANDS:
     backup          backup an etcd directory
     cluster-health  check the health of the etcd cluster
     mk              make a new key with a given value
     mkdir           make a new directory
     rm              remove a key or a directory
     rmdir           removes the key if it is an empty directory or a key-value pair
     get             retrieve the value of a key
     ls              retrieve a directory
     set             set the value of a key
     setdir          create a new directory or update an existing directory TTL
     update          update an existing key with a given value
     updatedir       update an existing directory
     watch           watch a key for changes
     exec-watch      watch a key for changes and exec an executable
     member          member add, remove and list subcommands
     user            user add, grant and revoke subcommands
     role            role add, grant and revoke subcommands
     auth            overall auth controls
     help, h         Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --debug                          output cURL commands which can be used to reproduce the request
   --no-sync                        don't synchronize cluster information before sending request
   --output simple, -o simple       output response in the given format (simple, `extended` or `json`) (default: "simple")
   --discovery-srv value, -D value  domain name to query for SRV records describing cluster endpoints
   --insecure-discovery             accept insecure SRV records describing cluster endpoints
   --peers value, -C value          DEPRECATED - "--endpoints" should be used instead
   --endpoint value                 DEPRECATED - "--endpoints" should be used instead
   --endpoints value                a comma-delimited list of machine addresses in the cluster (default: "http://127.0.0.1:2379,http://127.0.0.1:4001")
   --cert-file value                identify HTTPS client using this SSL certificate file
   --key-file value                 identify HTTPS client using this SSL key file
   --ca-file value                  verify certificates of HTTPS-enabled servers using this CA bundle
   --username value, -u value       provide username[:password] and prompt if password is not supplied.
   --timeout value                  connection timeout per request (default: 2s)
   --total-timeout value            timeout for the command execution (except watch) (default: 5s)
   --help, -h                       show help
   --version, -v                    print the version

Distributed consensus

In etcd, distributed consensus is provided by the Raft consensus algorithm.
A brief introduction:
A key reason the Raft algorithm works is that any two quorums always share at least one member: as long as any single quorum survives, it is guaranteed to contain a node holding every entry the cluster has committed. For example, in a 5-node cluster a quorum is 3 nodes, and any two groups of 3 must overlap, since 3 + 3 > 5. Building on this property, Raft defines a data-synchronization mechanism that, after a leader change, re-replicates everything committed by the previous quorum, so the cluster's data stays consistent as its state moves forward.

etcd's role in Kubernetes:

In Kubernetes it is mainly used to store all data that needs to be persisted.

etcd use cases:

  • Service discovery
    My understanding: service discovery is a common problem in distributed systems: how do processes and services in the same distributed system find each other and connect? etcd stores each process's IP and listening port, so peers can be looked up and connected to by name.
  • Configuration center
    My understanding: Kubernetes stores its various configurations in etcd, which is why versioning comes up; in the data queried later in this article you can see that every entry carries a version.
    The watch mechanism pushes configuration changes in real time.
    The Raft algorithm keeps the system's data CP, with strong consistency.
    The gRPC proxy coalesces watchers on the same key.
    Access control and a namespace mechanism are provided so different departments can share the cluster.
  • Load balancing
    To keep data highly available and consistent, it is replicated to multiple nodes, so losing one does not affect service. I suspect this hurts write performance, but reads can now be load-balanced, and user traffic can be spread across different machines.
  • Distributed locks
    The Raft algorithm provides the strong consistency a lock needs.
  • Cluster monitoring and leader election
    With watch, when a node disappears or configuration changes, the watcher notices immediately and informs the user.
    TTL keys can also be used, for example as a heartbeat protocol indicating whether a node is still alive, to track each node's health (see the lease sketch after this list).
    Leader election: the cluster elects one leader node as the primary.
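
To make the TTL/heartbeat idea above concrete, here is a minimal Go sketch using the clientv3 lease API (the endpoint is the one used later in this article, TLS options are omitted for brevity, and the /heartbeat/node1 key name is made up for illustration). The process keeps its key alive by renewing a lease; if the process dies, the lease expires, the key disappears, and any watcher notices. Leader election can be built on the same primitives (the clientv3 concurrency package also ships an Election helper).

// import path is "github.com/coreos/etcd/clientv3" on older etcd releases
package main

import (
	"context"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://192.168.1.100:2379"}, // assumed endpoint; add TLS config in practice
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// grant a 10-second lease and attach it to this node's heartbeat key
	lease, err := cli.Grant(context.TODO(), 10)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := cli.Put(context.TODO(), "/heartbeat/node1", "alive", clientv3.WithLease(lease.ID)); err != nil {
		log.Fatal(err)
	}

	// renew the lease for as long as this process stays healthy
	ch, err := cli.KeepAlive(context.TODO(), lease.ID)
	if err != nil {
		log.Fatal(err)
	}
	for resp := range ch {
		log.Printf("lease %x renewed, TTL=%d", resp.ID, resp.TTL)
	}
}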

Election algorithm:

  • On initial startup, a node is in the follower state with an election timeout. If it receives no heartbeat from a leader within that period, it starts an election: it switches itself to candidate and asks the other follower nodes in the cluster to vote for it as leader.

  • Once a node receives accepting votes from more than half of the cluster, it becomes leader, starts accepting and persisting client data, and replicates its log to the other followers. If no majority is reached, each candidate waits a random interval (150ms ~ 300ms) and starts another round of voting; the candidate accepted by more than half of the followers becomes leader.

  • The leader keeps its position by periodically sending heartbeats to the followers.

  • At any time, if a follower receives no heartbeat from the leader within its election timeout, it likewise switches to candidate and starts an election. Each successful election gives the new leader a term (Term) one greater than the previous leader's.
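
As a toy illustration of the randomized wait above (this is only a sketch of the idea, not etcd's own implementation), a follower could arm its election timer like this:

// Toy sketch: a follower waits a random 150ms~300ms; if no heartbeat arrives
// before the timer fires, it switches to candidate and requests votes.
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func electionTimeout() time.Duration {
	return time.Duration(150+rand.Intn(150)) * time.Millisecond // uniform in [150ms, 300ms)
}

func main() {
	rand.Seed(time.Now().UnixNano())
	heartbeat := make(chan struct{}) // a real node would receive leader heartbeats here
	select {
	case <-time.After(electionTimeout()):
		fmt.Println("election timeout: switch to candidate and request votes")
	case <-heartbeat:
		fmt.Println("heartbeat received: stay follower and re-arm the timer")
	}
}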

I originally planned to exec into the container; this turned out to be unnecessary. My initial assumption was that the queries had to be run from inside the container, but it was later confirmed that with the certificates and etcdctl you can query directly. See the comment below.

List all namespaces

kubectl get ns

[root@master ~]# kubectl get ns
NAME                           STATUS   AGE
default                        Active   44h
istio-system                   Active   44h
kube-node-lease                Active   44h
kube-public                    Active   44h
kube-system                    Active   44h
kubesphere-alerting-system     Active   44h
kubesphere-controls-system     Active   44h
kubesphere-devops-system       Active   44h
kubesphere-logging-system      Active   44h
kubesphere-monitoring-system   Active   44h
kubesphere-system              Active   44h
openpitrix-system              Active   44h

List all pods in kubesphere-system

kubectl get -n kubesphere-system pods

[root@master ~]# kubectl get -n kubesphere-system pods
NAME                                     READY   STATUS    RESTARTS   AGE
etcd-854fb66c64-bj956                    1/1     Running   0          44h
ks-account-7f6795b8f7-ztnpt              1/1     Running   0          44h
ks-apigateway-8575c79746-x9p42           1/1     Running   0          44h
ks-apiserver-85f667fdc5-8qqnc            1/1     Running   0          44h
ks-console-6cbdd667cc-nnqk2              1/1     Running   0          44h
ks-controller-manager-7d55f876d8-rdw4n   1/1     Running   0          44h
ks-installer-6f87ffb44d-dt9pp            1/1     Running   0          44h
minio-8677cb6765-2mp9z                   1/1     Running   0          44h
mysql-647546d968-r4rld                   1/1     Running   0          44h
openldap-0                               1/1     Running   0          44h
redis-7bbcc855d5-8v9qs                   1/1     Running   0          44h

Enter the etcd pod

kubectl exec -it -n kubesphere-system etcd-854fb66c64-bj956 sh

Found the following script on the master node:

/opt/etcd_back/etcd_back.sh

which led to etcd's certificates; only with the certificates can you query data using the etcdctl v3 API.

ETCDCTL_PATH='/usr/local/bin/etcdctl'
ENDPOINTS='192.168.1.100:2379'
ETCD_DATA_DIR="/var/lib/etcd"
BACKUP_DIR="/var/backups/kube_etcd/etcd-$(date +%Y-%m-%d_%H:%M:%S)"

ETCDCTL_CERT="/etc/ssl/etcd/ssl/admin-master.pem"
ETCDCTL_KEY="/etc/ssl/etcd/ssl/admin-master-key.pem"
ETCDCTL_CA_FILE="/etc/ssl/etcd/ssl/ca.pem"


[ ! -d $BACKUP_DIR ] && mkdir -p $BACKUP_DIR


# v2-style backup of the etcd data directory
export ETCDCTL_API=2;$ETCDCTL_PATH backup --data-dir $ETCD_DATA_DIR --backup-dir $BACKUP_DIR

sleep 3

# v3 snapshot of the full keyspace, authenticated with the certificates above
{
export ETCDCTL_API=3;$ETCDCTL_PATH --endpoints="$ENDPOINTS" snapshot save $BACKUP_DIR/snapshot.db \
                                   --cacert="$ETCDCTL_CA_FILE" \
                                   --cert="$ETCDCTL_CERT" \
                                   --key="$ETCDCTL_KEY"
} > /dev/null

sleep 3

# keep only the 5 most recent backup directories; delete anything older
cd $BACKUP_DIR/../;ls -lt |awk '{if(NR>(5+1)){print "rm -rf "$9}}'|sh

Because etcd is served over HTTPS, etcdctl commands must specify the certificates; the general formats are shown below.
Of course, if that feels tedious, just set the environment variables.

etcd has two API versions, selected with export ETCDCTL_API=3 or export ETCDCTL_API=2. The two APIs have different call interfaces and organize their data differently: with the v2 API both keys and values are kept in memory, while with the v3 API keys are kept in memory and values are stored on disk.

# v2 usage format
etcdctl --endpoints=https://192.168.1.100:2379 --cert-file=/etc/ssl/etcd/ssl/admin-master.pem --key-file=/etc/ssl/etcd/ssl/admin-master-key.pem  --ca-file=/etc/ssl/etcd/ssl/ca.pem

# v3 usage format
etcdctl --endpoints=https://192.168.1.100:2379 --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem
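
The same certificates can also be used from the Go client. Below is a minimal sketch with clientv3 and its transport helper, using the certificate paths and endpoint from this article (the import paths shown are for recent etcd releases and may differ for the 3.2 client):

package main

import (
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
	"go.etcd.io/etcd/pkg/transport"
)

func main() {
	// same certificate files passed to etcdctl above
	tlsInfo := transport.TLSInfo{
		CertFile:      "/etc/ssl/etcd/ssl/admin-master.pem",
		KeyFile:       "/etc/ssl/etcd/ssl/admin-master-key.pem",
		TrustedCAFile: "/etc/ssl/etcd/ssl/ca.pem",
	}
	tlsConfig, err := tlsInfo.ClientConfig()
	if err != nil {
		log.Fatal(err)
	}

	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://192.168.1.100:2379"},
		DialTimeout: 5 * time.Second,
		TLS:         tlsConfig,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()
	log.Println("connected to etcd over TLS")
}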

# put / get / del commands (v3)
# set the environment variable before use; the commands below use the v3 API
# since etcd is a key-value store, usage is much like redis

export ETCDCTL_API=3

# list the etcd cluster members (table format); JSON (-w json) or the plain output (member list) also work
# -w table can be used with other commands too: it formats the output as a table, and the json format works the same way
[root@master home]# etcdctl --endpoints=https://192.168.1.100:2379 --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem -w table member list
+------------------+---------+-------+----------------------------+----------------------------+
|        ID        | STATUS  | NAME  |         PEER ADDRS         |        CLIENT ADDRS        |
+------------------+---------+-------+----------------------------+----------------------------+
| a11bcce1d2585b60 | started | etcd1 | https://192.168.1.100:2380 | https://192.168.1.100:2379 |
+------------------+---------+-------+----------------------------+----------------------------+

# check the node's health
etcdctl --endpoints=https://192.168.1.100:2379 --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem endpoint health

https://192.168.1.100:2379 is healthy: successfully committed proposal: took = 1.044424ms



# put
[root@master home]# etcdctl --endpoints=https://192.168.1.100:2379 --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem put foo bar
OK

# get
[root@master home]# etcdctl --endpoints=https://192.168.1.100:2379 --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem get foo
foo
bar

# del
[root@master home]# etcdctl --endpoints=https://192.168.1.100:2379 --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem del foo
1

For more operations, and as the reference for this article, see etcdctl-github.

This directory holds the etcd snapshots that KubeSphere backs up automatically:

/var/backups

Open any of the folders:

[root@master var]# cd backups/
[root@master backups]# ls
etcd-2020-04-17_18:34:18  etcd-2020-05-06_19:33:14  etcd-2020-05-20_11:45:59  etcd-2020-05-23_16:16:37
etcd-2020-04-20_10:00:54  etcd-2020-05-07_09:41:31  etcd-2020-05-20_13:58:18  etcd-2020-05-23_16:35:07
etcd-2020-04-20_10:51:33  etcd-2020-05-07_10:16:01  etcd-2020-05-20_14:51:01  kube_etcd
etcd-2020-04-20_11:08:18  etcd-2020-05-19_17:07:46  etcd-2020-05-20_15:34:43

[root@master etcd-2020-04-17_18:34:18]# ls
member  snapshot.db
[root@master etcd-2020-04-17_18:34:18]# cd member/
[root@master member]# ls
snap  wal

snap # holds snapshot data: snapshots that etcd takes to keep the WAL from growing too large; they store the state of etcd's data

wal # holds the write-ahead log; its main purpose is to record the complete history of data changes. In etcd, every data modification must be written to the WAL before it is committed

etcd-watch

A watch in etcd monitors a key, or a set of keys, and emits a message whenever they change; as I understand it, this is essentially a publish/subscribe model.
Kubernetes controllers use the watch interface to learn about changes to the resources they care about, and then drive their control logic from the difference between a resource object's desired state and its current state. In essence, watch delivers the events it observes to the controllers that care about them.
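
A minimal Go sketch of the watch mechanism with clientv3 (the endpoint is the one used in this article; in practice the TLS setup shown earlier would be added). It watches every key under /registry/ and prints each change event:

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://192.168.1.100:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Watch returns a channel that delivers events for every key under the prefix.
	watchCh := cli.Watch(context.Background(), "/registry/", clientv3.WithPrefix())
	for wresp := range watchCh {
		for _, ev := range wresp.Events {
			// ev.Type is PUT or DELETE; this is what controllers react to.
			fmt.Printf("%s %q -> %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}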

Query all data under /registry (only part of it is shown; there is too much):

etcdctl --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem get /registry --prefix --keys-only


/registry/services/specs/openpitrix-system/openpitrix-repo-manager

/registry/services/specs/openpitrix-system/openpitrix-rp-kubernetes

/registry/services/specs/openpitrix-system/openpitrix-rp-manager

/registry/services/specs/openpitrix-system/openpitrix-runtime-manager

/registry/services/specs/openpitrix-system/openpitrix-task-manager

/registry/statefulsets/kubesphere-devops-system/s2ioperator

/registry/statefulsets/kubesphere-logging-system/elasticsearch-logging-data

/registry/statefulsets/kubesphere-logging-system/elasticsearch-logging-discovery

/registry/statefulsets/kubesphere-monitoring-system/prometheus-k8s

/registry/statefulsets/kubesphere-monitoring-system/prometheus-k8s-system

/registry/statefulsets/kubesphere-system/openldap

/registry/storageclasses/local

/registry/tenant.kubesphere.io/workspaces/htwx

/registry/tenant.kubesphere.io/workspaces/system-workspace

/registry/validatingwebhookconfigurations/istio-galley

/registry/validatingwebhookconfigurations/validating-webhook-configuration

Query the data at /registry/storageclasses/local, with JSON output



etcdctl --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem get -w json /registry/storageclasses/local


{
 "header": 
  {
   "cluster_id":13736729561285622628,
   "member_id":11609097734746299232,
   "revision":564863,
   "raft_term":1234: 
  },
 "kvs": 
  [{
    "key":"L3JlZ2lzdHJ5L3N0b3JhZ2VjbGFzc2VzL2xvY2Fs",
    "create_revision":1322,
    "mod_revision":1322,
    "version":1,
  
"value":"azhzAAohChFzdG9yYWdlLms4cy5pby92MRIMU3RvcmFnZUNsYXNzEqgHCvUGCgVsb2NhbBIAGgAiACokMDcxMDhhODktYjAwZC00YjIwLTlmZGMtZTZjNzg3NGNmNTYwMgA4AEIICOnDo/YFEABicQoVY2FzLm9wZW5lYnMuaW8vY29uZmlnElgtIG5hbWU6IFN0b3JhZ2VUeXBlCiAgdmFsdWU6ICJob3N0cGF0aCIKLSBuYW1lOiBCYXNlUGF0aAogIHZhbHVlOiAiL3Zhci9vcGVuZWJzL2xvY2FsLyIKYpwECjBrdWJlY3RsLmt1YmVybmV0ZXMuaW8vbGFzdC1hcHBsaWVkLWNvbmZpZ3VyYXRpb24S5wN7ImFwaVZlcnNpb24iOiJzdG9yYWdlLms4cy5pby92MSIsImtpbmQiOiJTdG9yYWdlQ2xhc3MiLCJtZXRhZGF0YSI6eyJhbm5vdGF0aW9ucyI6eyJjYXMub3BlbmVicy5pby9jb25maWciOiItIG5hbWU6IFN0b3JhZ2VUeXBlXG4gIHZhbHVlOiBcImhvc3RwYXRoXCJcbi0gbmFtZTogQmFzZVBhdGhcbiAgdmFsdWU6IFwiL3Zhci9vcGVuZWJzL2xvY2FsL1wiXG4iLCJvcGVuZWJzLmlvL2Nhcy10eXBlIjoibG9jYWwiLCJzdG9yYWdlY2xhc3MuYmV0YS5rdWJlcm5ldGVzLmlvL2lzLWRlZmF1bHQtY2xhc3MiOiJ0cnVlIiwic3RvcmFnZWNsYXNzLmt1YmVzcGhlcmUuaW8vc3VwcG9ydGVkX2FjY2Vzc19tb2RlcyI6IltcIlJlYWRXcml0ZU9uY2VcIl0ifSwibmFtZSI6ImxvY2FsIn0sInByb3Zpc2lvbmVyIjoib3BlbmVicy5pby9sb2NhbCIsInJlY2xhaW1Qb2xpY3kiOiJEZWxldGUiLCJ2b2x1bWVCaW5kaW5nTW9kZSI6IldhaXRGb3JGaXJzdENvbnN1bWVyIn0KYhwKE29wZW5lYnMuaW8vY2FzLXR5cGUSBWxvY2FsYjgKMHN0b3JhZ2VjbGFzcy5iZXRhLmt1YmVybmV0ZXMuaW8vaXMtZGVmYXVsdC1jbGFzcxIEdHJ1ZWJGCjFzdG9yYWdlY2xhc3Mua3ViZXNwaGVyZS5pby9zdXBwb3J0ZWRfYWNjZXNzX21vZGVzEhFbIlJlYWRXcml0ZU9uY2UiXXoAEhBvcGVuZWJzLmlvL2xvY2FsIgZEZWxldGU6FFdhaXRGb3JGaXJzdENvbnN1bWVyGgAiAA=="
  }],
 "count":100}

etcd's official load-balancing solution: Load Balancing in gRPC

etcd distributed locks (take your time with this)


func ExampleMutex_Lock() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: endpoints})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// create two separate sessions for lock competition
	s1, err := concurrency.NewSession(cli)
	if err != nil {
		log.Fatal(err)
	}
	defer s1.Close()
	m1 := concurrency.NewMutex(s1, "/my-lock/")

	s2, err := concurrency.NewSession(cli)
	if err != nil {
		log.Fatal(err)
	}
	defer s2.Close()
	m2 := concurrency.NewMutex(s2, "/my-lock/")

	// acquire lock for s1
	if err := m1.Lock(context.TODO()); err != nil {
		log.Fatal(err)
	}
	fmt.Println("acquired lock for s1")

	m2Locked := make(chan struct{})
	go func() {
		defer close(m2Locked)
		// blocks here until s1 releases /my-lock/
		if err := m2.Lock(context.TODO()); err != nil {
			log.Fatal(err)
		}
	}()

	if err := m1.Unlock(context.TODO()); err != nil {
		log.Fatal(err)
	}
	fmt.Println("released lock for s1")

	<-m2Locked
	fmt.Println("acquired lock for s2")

	// Output:
	// acquired lock for s1
	// released lock for s1
	// acquired lock for s2
}

Code source: GitHub

etcd transactions (take your time with this)


func ExampleKV_txn() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: dialTimeout,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	kvc := clientv3.NewKV(cli)

	_, err = kvc.Put(context.TODO(), "key", "xyz")
	if err != nil {
		log.Fatal(err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), requestTimeout)
	_, err = kvc.Txn(ctx).
		// txn value comparisons are lexical
		If(clientv3.Compare(clientv3.Value("key"), ">", "abc")).
		// the "Then" runs, since "xyz" > "abc"
		Then(clientv3.OpPut("key", "XYZ")).
		// the "Else" does not run
		Else(clientv3.OpPut("key", "ABC")).
		Commit()
	cancel()
	if err != nil {
		log.Fatal(err)
	}

	gresp, err := kvc.Get(context.TODO(), "key")
	cancel()
	if err != nil {
		log.Fatal(err)
	}
	for _, ev := range gresp.Kvs {
		fmt.Printf("%s : %s\n", ev.Key, ev.Value)
	}
	// Output: key : XYZ
}

Code source: GitHub

The Raft algorithm; if you are interested, read the paper.

Except for /registry/apiregistration.k8s.io, which is stored directly as JSON, resources are by default not stored as JSON but in protobuf format, for performance reasons, unless the apiserver is configured with --storage-media-type=application/json.

See this GitHub issue for reference.
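
You can see this in the -w json output above: the storage class value begins with azhzAA, which is base64 for the magic prefix (the bytes 'k' '8' 's' 0x00) that Kubernetes puts in front of protobuf-encoded objects. A small sketch that tells the two storage formats apart (the truncated base64 string is copied from the output above):

package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"log"
)

// magic header of protobuf-serialized Kubernetes objects: "k8s" followed by a zero byte
var protobufPrefix = []byte{'k', '8', 's', 0x00}

func main() {
	// the first bytes of the "value" field from the storageclass output above
	val, err := base64.StdEncoding.DecodeString("azhzAAohChFzdG9yYWdlLms4cy5pby92MRIM")
	if err != nil {
		log.Fatal(err)
	}
	if bytes.HasPrefix(val, protobufPrefix) {
		fmt.Println("protobuf-encoded Kubernetes object")
	} else {
		fmt.Println("probably plain JSON (e.g. --storage-media-type=application/json)")
	}
}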

Append a | strings command to the query to get the data in a readable format.

/registry/tenant.kubesphere.io/workspaces/htwx

etcdctl --cert=/etc/ssl/etcd/ssl/admin-master.pem --key=/etc/ssl/etcd/ssl/admin-master-key.pem  --cacert=/etc/ssl/etcd/ssl/ca.pem get /registry/tenant.kubesphere.io/workspaces/htwx | strings

{
  "apiVersion":"tenant.kubesphere.io/v1alpha1",
  "kind":"Workspace",
  "metadata":
    {
      "annotations":
        {
          "kubesphere.io/creator":"qqy"
        },
      "creationTimestamp":"2020-05-25T08:16:49Z",
      "finalizers":
        [
          "finalizers.tenant.kubesphere.io"
        ],
      "generation":2,
      "name":"htwx",
      "uid":"0de5a138-eab3-408a-9ac8-ecb512143021"
    },
  "spec":
    {
      "manager":"qqy"
    },
  "status":{}
}

/registry/tenant.kubesphere.io/workspaces/system-workspace


{
  "apiVersion":"tenant.kubesphere.io/v1alpha1",
  "kind":"Workspace",
  "metadata":
    {
      "annotations":
        {
          "kubectl.kubernetes.io/last-applied-configuration":
"{\"apiVersion\":\"tenant.kubesphere.io/v1alpha1\",\"kind\":\"Workspace\",\"metadata\":{\"annotations\":{},\"name\":\"system-workspace\"},\"spec\":{\"manager\":\"admin\"}}\n"
        },
      "creationTimestamp":"2020-05-23T08:45:17Z",
      "finalizers":
        ["finalizers.tenant.kubesphere.io"],
      "generation":2,
      "name":"system-workspace",
      "uid":"ce2e34c3-fd76-4514-a596-f5264d820e79"
    },
  "spec":
    {
      "manager":"admin"
    },
  "status":{}
}