Problem Resolution

1. In a k8s high-availability cluster, the etcd and kube-apiserver pods on the master2 node show CrashLoopBackOff

The problem showed up after nothing more than a reboot, and none of the tutorials I found online led anywhere. Eventually I decided to drain the node from the cluster and then rejoin it.
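Before tearing the node down, it is worth confirming why the containers are crashing. A minimal triage sketch (my own suggestion, using the pod names from the listing below; --previous pulls the logs of the last crashed container):

#Inspect the failing pods and pull logs from their last crashed containers
kubectl -n kube-system describe pod etcd-master2
kubectl -n kube-system logs etcd-master2 --previous
kubectl -n kube-system logs kube-apiserver-master2 --previous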

#Check the pods first
[root@master1 wordpress]# kubectl get pod -n kube-system 
NAME                                       READY   STATUS             RESTARTS   AGE
etcd-master2                               0/1     CrashLoopBackOff   38         2m24s
kube-apiserver-master2                     0/1     CrashLoopBackOff   37         18m
#1. Drain master2 and delete the node:
[root@master1 wordpress]# kubectl drain master2 
[root@master1 wordpress]# kubectl delete nodes master2 

#If the drain above fails, force it:
kubectl drain <node-name> --delete-local-data --ignore-daemonsets --force
#2. On the master2 node:
[root@master2 ~]# kubeadm reset -f
[root@master2 ~]# rm -rf ~/.kube
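#Optional, my own addition: kubeadm reset does not clean up iptables/IPVS rules.
#If stale rules cause trouble when rejoining, clear them manually:
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear   #only if kube-proxy runs in IPVS mode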
#3. Generate a new join token on master1, since tokens are only valid for 24 hours:
[root@master1 wordpress]# kubeadm token create
1fv1fc.jfxp3v0ock17sip2     #This is the new token
[root@master1 wordpress]# kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
1fv1fc.jfxp3v0ock17sip2   23h         2021-10-27T18:18:14+08:00   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token   #Here you can see the newly created token
#4. Generate the CA certificate hash used when joining the cluster:
[root@master1 wordpress]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
1ac341aa5821901febd49ae1345db6a11013816457e662bb7b952c0d1ad74e1d
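#Note (my addition): steps 3 and 4 can usually be combined. `kubeadm token create --print-join-command`
#prints a ready-made worker join command with a fresh token and the CA cert hash filled in;
#for a control-plane node you still append --control-plane yourself.
kubeadm token create --print-join-command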
#5. Set up passwordless SSH and copy the required certificates to master2 (a quick verification follows after the scp commands):
[root@master1 wordpress]# ssh-keygen
[root@master1 wordpress]# ssh-copy-id root@master2

[root@master1 ~]# ssh root@master2 mkdir -p /etc/kubernetes/pki/etcd
[root@master1 ~]# scp -r /etc/kubernetes/admin.conf root@master2:/etc/kubernetes/
[root@master1 ~]# scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master2:/etc/kubernetes/pki
[root@master1 ~]# scp /etc/kubernetes/pki/etcd/ca.* root@master2:/etc/kubernetes/pki/etcd
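As referenced in step 5, a quick sanity check (my own suggestion) is to confirm that the certificates actually landed on master2 before rejoining:

#Verify the CA, service-account and front-proxy material is in place on master2
ssh root@master2 'ls -l /etc/kubernetes/pki /etc/kubernetes/pki/etcd'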
 

2. Another problem comes up at this point, and I ran straight into it myself: even after finishing the steps above, the node still cannot simply be rejoined.

With etcd and kube-apiserver broken on master2, rejoining the node right after draining and deleting it fails with the following error:

W1026 17:51:19.991886 21799 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://192.168.178.39:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
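The root cause is that the old master2 member is still registered in the etcd cluster. To confirm that before editing the member list, a one-shot health check from the healthy etcd-master1 pod (a suggested check, reusing the certificate paths shown below; etcdctl in etcd 3.4 defaults to the v3 API) should show the 192.168.178.39 endpoint failing:

#Check the health of every member registered in the cluster
kubectl -n kube-system exec etcd-master1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health --cluster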

#1. Exec into the etcd-master1 pod and remove the stale master2 member:
[root@master1 wordpress]# kubectl exec -it -n kube-system etcd-master1 -- /bin/sh

# export ETCDCTL_API=3

#  alias etcdctl='etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'

# etcdctl member list
4fcda51ece5420bf, started, master2, https://192.168.178.39:2380, https://192.168.178.39:2379, false
7ea96ffc61ae5b0a, started, master1, https://192.168.178.36:2380, https://192.168.178.36:2379, false
935a2cef042b600c, started, master3, https://192.168.178.40:2380, https://192.168.178.40:2379, false

# etcdctl member remove 4fcda51ece5420bf                        
Member 4fcda51ece5420bf removed from cluster 3762c94a3b4c6fb3

# exit
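To confirm the member is really gone without opening another interactive shell, the same etcdctl call can be run as a one-liner (again just a suggestion):

#List the remaining members; only master1 and master3 should be left
kubectl -n kube-system exec etcd-master1 -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member list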

3. Rejoin master2 to the cluster

[root@master2 ~]# kubeadm join 192.168.178.100:6443 --token m1124m.45sa9l5686v1w0ta \
--discovery-token-ca-cert-hash sha256:1ac341aa5821901febd49ae1345db6a11013816457e662bb7b952c0d1ad74e1d  \
--control-plane
#After the node rejoins, etcd and kube-apiserver recover successfully
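Since ~/.kube on master2 was wiped during the reset, kubectl access there can be restored from the admin.conf copied over in step 5 (a minimal sketch of the standard kubeadm post-join step):

#On master2: restore the kubeconfig for the root user
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config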
[root@master1 wordpress]# kubectl get node
NAME      STATUS   ROLES    AGE   VERSION
master1   Ready    master   9d    v1.17.4
master2   Ready    master   23m   v1.17.4
master3   Ready    master   8d    v1.17.4
node1     Ready    <none>   8d    v1.17.4
node2     Ready    <none>   8d    v1.17.4

[root@master1 wordpress]# kubectl get pod -n kube-system 
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-56dfd8fc57-kdk5v   1/1     Running   0          70m
calico-node-4kpgs                          1/1     Running   0          23m
calico-node-dk6wj                          1/1     Running   4          8d
calico-node-ff7cn                          1/1     Running   3          8d
calico-node-m6mjk                          1/1     Running   2          8d
calico-node-sj6qt                          1/1     Running   5          8d
coredns-6955765f44-hnc72                   1/1     Running   5          9d
coredns-6955765f44-qt552                   1/1     Running   5          9d
etcd-master1                               1/1     Running   5          9d
etcd-master2                               1/1     Running   0          23m
etcd-master3                               1/1     Running   1          8d
kube-apiserver-master1                     1/1     Running   2          3h43m
kube-apiserver-master2                     1/1     Running   0          23m
kube-apiserver-master3                     1/1     Running   1          8d
kube-controller-manager-master1            1/1     Running   13         9d
kube-controller-manager-master2            1/1     Running   0          23m
kube-controller-manager-master3            1/1     Running   13         8d
kube-proxy-7r7nf                           1/1     Running   0          103m
kube-proxy-kg25b                           1/1     Running   0          23m
kube-proxy-pfrwd                           1/1     Running   0          103m
kube-proxy-wcjj9                           1/1     Running   0          103m
kube-proxy-z6hlr                           1/1     Running   0          103m
kube-scheduler-master1                     1/1     Running   17         9d
kube-scheduler-master2                     1/1     Running   0          23m
kube-scheduler-master3                     1/1     Running   8          8d
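As a final, optional check of my own, the etcd cluster itself can be queried to confirm that all three members are healthy and a leader has been elected:

#Show status (leader, DB size, raft term) for every member in a table
kubectl -n kube-system exec etcd-master1 -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint status --cluster -w table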