etcd database error: fixing "mvcc: database space exceeded"
Problem: labeling a node in the k8s cluster failed with an error
[root@master1]# kubectl label node 30.4.228.20 env=prod
Error from server: etcdserver: mvcc: database space exceeded
Environment
etcd cluster: 30.4.228.19, 30.4.228.20, 30.4.228.22 (secured with TLS certificates)
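Root cause:
- The etcd service was started without an auto-compaction parameter (--auto-compaction-retention).
- By default etcd does not compact its key history automatically. Compaction has to be enabled through a startup parameter or triggered by command. If keys change frequently, enabling it is recommended; otherwise old revisions waste disk space and memory and eventually cause errors. The default backend quota in etcd v3 is 2 GB. Without compaction, once the boltdb file grows past that limit, etcd reports "Error: etcdserver: mvcc: database space exceeded" and rejects further writes.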
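As a rough sketch of the startup-parameter approach described above, the snippet below assembles etcd flags that enable periodic auto-compaction and raise the backend quota. The 1-hour retention and the 8 GiB quota are illustrative assumptions, not values taken from this incident, and --auto-compaction-mode assumes etcd v3.3 or later.

```shell
#!/bin/sh
# Assumed values for illustration only; tune them for your own cluster.
RETENTION_HOURS=1   # keep roughly one hour of revision history
QUOTA_GIB=8         # backend quota in GiB (etcd's default is 2 GiB)

# --quota-backend-bytes expects a byte count, so convert GiB to bytes.
QUOTA_BYTES=$((QUOTA_GIB * 1024 * 1024 * 1024))

# Flags to append to the existing etcd command line or service unit.
ETCD_EXTRA_FLAGS="--auto-compaction-mode=periodic --auto-compaction-retention=${RETENTION_HOURS} --quota-backend-bytes=${QUOTA_BYTES}"

echo "$ETCD_EXTRA_FLAGS"
```

Raising the quota only postpones the error; the auto-compaction setting is what keeps old revisions from accumulating. Compaction alone does not shrink the file on disk, so defragmentation is still needed to hand space back to the filesystem.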
Procedure:
- 1. Get the current revision:
[root@etcd1]# export ETCDCTL_API=3 # use API version 3
[root@etcd1]# rev=$(/usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
--endpoints="https://127.0.0.1:2379" \
endpoint status --write-out="json" \
| egrep -o '"revision":[0-9]*' \
| egrep -o '[0-9]+')
[root@etcd1]# echo $rev
- 2. Compact old revisions, then defragment each member and disarm the alarm
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.20:2379" compact $rev
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.20:2379" defrag
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.20:2379" alarm list
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.19:2379" compact $rev
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.19:2379" defrag
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.22:2379" compact $rev
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.22:2379" defrag
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.22:2379" alarm list
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.22:2379" alarm disarm
- 3. Check alarms
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.22:2379" alarm list
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.19:2379" alarm list
[root@etcd1]# /usr/local/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://30.4.228.20:2379" alarm list
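The per-member commands above can be condensed into a loop. This is a minimal sketch, not the procedure exactly as it was run: the run and reclaim_space helpers and the DRY_RUN switch are hypothetical names introduced here, and with DRY_RUN=1 (the default) the script only prints the commands. It assumes that a compaction request is replicated to the whole cluster and therefore only needs to be issued once, while defrag acts only on the member it is sent to.

```shell
#!/bin/sh
# Sketch: compact once, defragment every member, then clear the alarm.
# DRY_RUN, run() and reclaim_space() are illustrative, not part of etcdctl.
export ETCDCTL_API=3
DRY_RUN="${DRY_RUN:-1}"
ENDPOINTS="https://30.4.228.19:2379 https://30.4.228.20:2379 https://30.4.228.22:2379"
TLS="--cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# $1 is the revision obtained in step 1.
reclaim_space() {
    rev="$1"
    first="${ENDPOINTS%% *}"   # first endpoint in the list
    # $TLS is intentionally unquoted so that it splits into separate flags.
    run /usr/local/bin/etcdctl $TLS --endpoints="$first" compact "$rev"
    for ep in $ENDPOINTS; do
        run /usr/local/bin/etcdctl $TLS --endpoints="$ep" defrag
    done
    run /usr/local/bin/etcdctl $TLS --endpoints="$first" alarm disarm
    run /usr/local/bin/etcdctl $TLS --endpoints="$first" alarm list
}

# Placeholder revision; in practice pass the value stored in $rev.
reclaim_space "${1:-12345}"
```

Run it once in dry-run mode to review the generated commands, then again with DRY_RUN=0 and the real revision as the first argument.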
Related etcd commands:
1. Set the etcd quota:
# Set a 16 MB quota
etcd --quota-backend-bytes=$((16*1024*1024))
2. Fill the keyspace until the quota is exhausted:
while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 \
| ETCDCTL_API=3 etcdctl put key || break; done
...
Error: rpc error: code = 8 desc = etcdserver: mvcc: database space exceeded
3. Confirm that the database size exceeds the quota:
ETCDCTL_API=3 etcdctl --write-out=table endpoint status
+----------------+------------------+-----------+---------+-----------+-----------+------------+
| ENDPOINT       | ID               | VERSION   | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------+------------------+-----------+---------+-----------+-----------+------------+
| 127.0.0.1:2379 | bf9071f4639c75cc | 2.3.0+git | 18 MB   | true      | 2         | 3332       |
+----------------+------------------+-----------+---------+-----------+-----------+------------+
4. Check alarms:
ETCDCTL_API=3 etcdctl alarm list
5. Compaction and defragmentation:
1. Get the current revision of the etcd data
rev=$(ETCDCTL_API=3 etcdctl --endpoints=:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
2. Compact away old revisions
ETCDCTL_API=3 etcdctl compact $rev
3. Defragment
ETCDCTL_API=3 etcdctl defrag
4. Disarm the alarm
ETCDCTL_API=3 etcdctl alarm disarm
5. Take a backup and inspect the snapshot
ETCDCTL_API=3 etcdctl snapshot save backup.db
ETCDCTL_API=3 etcdctl snapshot status backup.db
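To notice the problem before writes start failing, the database size reported by endpoint status can be compared against the configured quota. The sketch below is only an illustration under stated assumptions: extract_field and usage_percent are hypothetical helpers, the JSON is a trimmed sample rather than real output from this cluster, and the 2 GiB quota is etcd's default.

```shell
#!/bin/sh
# Pull a numeric field (e.g. revision, dbSize) out of `endpoint status` JSON,
# using the same grep-based approach as the commands above.
extract_field() {
    grep -Eo "\"$1\":[0-9]+" | grep -Eo '[0-9]+' | head -n 1
}

# Integer percentage of the quota ($2) occupied by the database ($1).
usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Trimmed sample for illustration; in practice pipe in the output of:
#   ETCDCTL_API=3 etcdctl endpoint status --write-out=json
SAMPLE='[{"Endpoint":"127.0.0.1:2379","Status":{"header":{"revision":3332,"raft_term":2},"dbSize":1610612736}}]'

QUOTA_BYTES=$((2 * 1024 * 1024 * 1024))   # default 2 GiB backend quota
rev=$(echo "$SAMPLE" | extract_field revision)
size=$(echo "$SAMPLE" | extract_field dbSize)
echo "revision=$rev dbSize=$size usage=$(usage_percent "$size" "$QUOTA_BYTES")%"
```

With the sample data this prints revision=3332 dbSize=1610612736 usage=75%. A periodic job could run the same check and warn once usage crosses a threshold of your choosing.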