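After I had successfully built a cluster by following the official Rancher documentation, I used docker to remove some previously exited containers and accidentally deleted the running rancher container as well. I reinstalled rancher and created the cluster again, but the last step kept failing with this error: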

[etcd] Failed to bring up Etcd Plane: etcd cluster is unhealthy: hosts [192.168.100.666] failed to report healthy. Check etcd container logs on each host for more information

The etcd container log on the host showed the following error:

2020-06-19 09:21:16.177823 I | embed: rejected connection from "192.168.100.666:59010" (error "tls: failed to verify client's certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kube-ca\")", ServerName "")

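The root cause is that some state from the previous cluster was never cleaned up, so the old files and the newly generated ones were out of sync. The log above is consistent with this: the client certificate fails verification against kube-ca. I found the following fix on GitHub. The last command did not run successfully for me, but reinstalling the cluster afterwards worked. Run the commands below on the cluster's master node (or on the machine where rancher is installed).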

Solution:

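# Stop every container on this host
docker stop $(docker ps -aq)
# Remove stopped containers, unused networks, dangling images and build cache
docker system prune -f
# Remove the remaining volumes
docker volume rm $(docker volume ls -q)
# Remove the remaining images
docker image rm $(docker image ls -q)
# Delete the state directories left behind by the previous cluster
rm -rf /etc/ceph \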
       /etc/cni \
       /etc/kubernetes \
       /opt/cni \
       /opt/rke \
       /run/secrets/kubernetes.io \
       /run/calico \
       /run/flannel \
       /var/lib/calico \
       /var/lib/etcd \
       /var/lib/cni \
       /var/lib/kubelet \
       /var/lib/rancher/rke/log \
       /var/log/containers \
       /var/log/pods \
       /var/run/calico
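
Since the last command did not complete for me, it may be worth checking which of these directories are still present before reinstalling. The following is only a minimal sketch: the function name list_leftovers and its optional root-prefix argument are my own assumptions for illustration, not part of the GitHub fix. It prints each path from the list above that still exists.

```shell
#!/bin/sh
# Hypothetical helper (not from the original fix): print every path from the
# cleanup list that still exists. An optional first argument is prepended to
# each path, so the check can be pointed at a test tree instead of "/".
list_leftovers() {
    root="${1:-}"
    for p in /etc/ceph /etc/cni /etc/kubernetes /opt/cni /opt/rke \
             /run/secrets/kubernetes.io /run/calico /run/flannel \
             /var/lib/calico /var/lib/etcd /var/lib/cni /var/lib/kubelet \
             /var/lib/rancher/rke/log /var/log/containers /var/log/pods \
             /var/run/calico; do
        # -e is true for files, directories and symlinks that resolve
        if [ -e "${root}${p}" ]; then
            printf '%s\n' "$p"
        fi
    done
}
```

Calling list_leftovers with no argument checks the real filesystem; empty output means none of the listed paths remain.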
