# References
https://stackoverflow.com/questions/57090991/rancher-etcd-inner-db-cannot-clean/57523990

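Symptom: logging in to Rancher fails, while the Kubernetes cluster itself keeps running normally.

The Rancher container logs show the following errors: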

{"log":"E1016 08:56:40.853629       6 leaderelection.go:286] Failed to update lock: etcdserver: mvcc: database space exceeded\n","stream":"stderr","time":"2021-10-16T08:56:40.853715974Z"}
{"log":"E1016 08:56:41.103187       6 controller.go:199] unable to sync kubernetes service: etcdserver: mvcc: database space exceeded\n","stream":"stderr","time":"2021-10-16T08:56:41.103319887Z"}
{"log":"E1016 08:56:42.928857       6 status.go:64] apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0x8, desc:\"etcdserver: mvcc: database space exceeded\"}\n","stream":"stderr","time":"2021-10-16T08:56:42.92896201Z"}
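
To check whether a Rancher instance is hitting the same problem, the container logs can be filtered for the etcd error string. The following is only a minimal sketch: the helper name count_space_errors is made up for illustration, and it simply counts matching lines on standard input.

```shell
# Hypothetical helper: counts log lines that contain the etcd
# "database space exceeded" error. Reads log text from stdin.
count_space_errors() {
  # grep -c prints the number of matching lines; it exits non-zero
  # when the count is 0, so "|| true" keeps the helper from failing.
  grep -c 'mvcc: database space exceeded' || true
}
```

A possible invocation, assuming ab20a5d53a9c is the ID of the Rancher container, is: docker logs ab20a5d53a9c 2>&1 | count_space_errors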

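Cause: Rancher stores its data in an embedded etcd service. Over time the database grows until it reaches etcd's storage quota, at which point etcd raises a NOSPACE alarm and rejects further writes.

Solution:

Start a temporary container that ships the etcd tools and attach it to the Rancher container's network, so that it can act as a client of the embedded etcd (ab20a5d53a9c is the Rancher container ID here; substitute your own):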
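
How close the database is to its limit can be estimated from the dbSize field in the JSON output of etcdctl endpoint status. The sketch below assumes etcd's default backend quota of 2 GiB; the Rancher image may set a different --quota-backend-bytes, in which case QUOTA_BYTES needs adjusting. The function name db_usage_percent is hypothetical.

```shell
# Assumed quota: etcd's default of 2 GiB. Override QUOTA_BYTES if the
# server was started with a different --quota-backend-bytes value.
QUOTA_BYTES=${QUOTA_BYTES:-2147483648}

# Hypothetical helper: reads the JSON printed by
# "etcdctl endpoint status --write-out json" on stdin and prints the
# database size as an integer percentage of QUOTA_BYTES.
db_usage_percent() {
  size=$(grep -o '"dbSize":[0-9]*' | head -n 1 | cut -d: -f2)
  if [ -z "$size" ]; then
    echo "dbSize not found in input" >&2
    return 1
  fi
  echo $(( size * 100 / QUOTA_BYTES ))
}
```

Inside the etcd client container (with ETCDCTL_API=3 exported), it could be used as: etcdctl endpoint status --write-out json | db_usage_percent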

docker run --net=container:ab20a5d53a9c -id --name etcd-utility rancher/rke-tools:v0.1.40
docker exec etcd-utility etcdctl member list

The output looks like this:

8e9e05c52164694d: name=default peerURLs=http://localhost:2380 clientURLs=http://localhost:2379 isLeader=true

Open a shell in the etcd client container:

docker exec -it etcd-utility bash

Run the etcd commands to compact old revisions of the key-value history and defragment the database:

export ETCDCTL_API=3
etcdctl endpoint status --endpoints=$(etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ',') --write-out table
etcdctl compact `etcdctl endpoint status --write-out json | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*'`
etcdctl defrag
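
The compaction and defragmentation steps can also be wrapped in a small script that reads the revision once and stops if it cannot be determined. This is a sketch under the same assumptions as the commands above (it runs inside the etcd client container and uses the default endpoint); the function names current_revision and compact_and_defrag are invented for illustration.

```shell
export ETCDCTL_API=3

# Hypothetical helper: extracts the current revision from the JSON
# printed by "etcdctl endpoint status --write-out json" (read on stdin).
current_revision() {
  grep -o '"revision":[0-9]*' | head -n 1 | cut -d: -f2
}

# Hypothetical helper: compacts history up to the current revision,
# then defragments to release the freed space back to the filesystem.
compact_and_defrag() {
  rev=$(etcdctl endpoint status --write-out json | current_revision)
  if [ -z "$rev" ]; then
    echo "could not determine the current revision" >&2
    return 1
  fi
  # Defragment only if compaction succeeded.
  etcdctl compact "$rev" && etcdctl defrag
}
```

Calling compact_and_defrag then performs both steps in order. Note that defrag takes no revision argument; only compact does.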

Check for and clear the alarm:

etcdctl alarm list
etcdctl alarm disarm
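
If this needs to be scripted, the alarm can be disarmed only when it is actually raised. The sketch below assumes that etcdctl alarm list prints lines of the form memberID:<id> alarm:NOSPACE, which may differ between etcd versions; both function names are hypothetical.

```shell
# Hypothetical helper: reads "etcdctl alarm list" output on stdin and
# succeeds if a NOSPACE alarm is present (assumed output format).
has_nospace_alarm() {
  grep -q 'alarm:NOSPACE'
}

# Hypothetical helper: disarms the alarm only if one is raised.
disarm_if_raised() {
  if etcdctl alarm list | has_nospace_alarm; then
    etcdctl alarm disarm
  else
    echo "no NOSPACE alarm raised"
  fi
}
```

Disarming only helps after space has been freed by compaction and defragmentation; otherwise etcd is likely to raise the alarm again.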

Once everything is back to normal, the temporary etcd container can be stopped:

docker stop etcd-utility