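The day before yesterday I deployed a highly available Kubernetes cluster with 3 master nodes behind haproxy and keepalived, and accidentally joined the 2 worker nodes as master nodes too. I later ran kubeadm reset on them directly, but by the time I wanted to join them again more than 24 hours had passed, so the original bootstrap token had expired!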

Official documentation

  1. kubeadm token

    Generate a new token

     kubeadm token create [token]
    

    List all tokens

     kubeadm token list [flags]
    
  2. kubeadm join

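    Generate the CA public key hash with the OpenSSL CLI.
    In the documentation, find the section "Token-based discovery with CA pinning"; the sample command is: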

     openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
    

    The documentation also lists 2 example kubeadm join commands

    For worker nodes:

     kubeadm join --discovery-token abcdef.1234567890abcdef --discovery-token-ca-cert-hash sha256:1234..cdef 1.2.3.4:6443
    

    For control-plane nodes:

     kubeadm join --discovery-token abcdef.1234567890abcdef --discovery-token-ca-cert-hash sha256:1234..cdef --control-plane 1.2.3.4:6443
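The two documented join commands differ only in the `--control-plane` flag, so the pieces can be assembled mechanically. Below is a minimal sketch that prints (but does not run) a join command after checking the token shape. The function names `valid_token` and `build_join_cmd` are hypothetical helpers, not part of kubeadm, and the sketch uses `--token` as in the join command recorded later in this post rather than `--discovery-token`.

```shell
# Hypothetical helpers for illustration; they are not kubeadm commands.

# Bootstrap tokens have the form [a-z0-9]{6}.[a-z0-9]{16}.
valid_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Print a kubeadm join command without executing it.
# Usage: build_join_cmd <endpoint> <token> <sha256:hash> [worker|control-plane]
# The body runs in a subshell so its variables do not leak to the caller.
build_join_cmd() (
  endpoint=$1
  token=$2
  hash=$3
  role=${4:-worker}

  if ! valid_token "$token"; then
    echo "invalid token format: $token" >&2
    exit 1
  fi

  cmd="kubeadm join $endpoint --token $token --discovery-token-ca-cert-hash $hash"
  if [ "$role" = "control-plane" ]; then
    cmd="$cmd --control-plane"
  fi
  printf '%s\n' "$cmd"
)
```

Printing the command instead of running it makes it easy to review the role before pasting it on a node, which is exactly the mistake this post started from.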
    

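Below is a record of what I did

Delete the 2 wrongly joined worker nodes from the master1-141 node

Deleting the nodes, and even rebooting, should not really be necessary! But this was the first time I had run into this situation, so I simplified the environment as much as possible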

	[root@master1-141 working]# kubectl get nodes
	NAME          STATUS     ROLES                  AGE   VERSION
	master0-140   Ready      control-plane,master   46h   v1.22.2
	master1-141   Ready      control-plane,master   46h   v1.22.2
	master2-142   Ready      control-plane,master   46h   v1.22.2
	node3-143     NotReady   control-plane,master   46h   v1.22.2
	node4-144     NotReady   control-plane,master   46h   v1.22.2
	
	[root@master1-141 working]# kubectl delete node node3-143 
	node "node3-143" deleted
	
	[root@master1-141 working]# kubectl delete node node4-144
	node "node4-144" deleted
	
	[root@master1-141 working]# kubectl get nodes
	NAME          STATUS   ROLES                  AGE   VERSION
	master0-140   Ready    control-plane,master   46h   v1.22.2
	master1-141   Ready    control-plane,master   46h   v1.22.2
	master2-142   Ready    control-plane,master   46h   v1.22.2

Now only the 3 master nodes are left!
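With more than a couple of broken nodes, picking them out of the table by eye gets tedious. A minimal sketch, assuming the default `kubectl get nodes` column layout shown above; `not_ready_nodes` is a hypothetical helper name:

```shell
# Hypothetical helper: read `kubectl get nodes` table output on stdin and
# print the NAME of every node whose STATUS starts with "NotReady".
not_ready_nodes() {
  awk 'NR > 1 && $2 ~ /^NotReady/ { print $1 }'
}

# Example usage on a control-plane node (commented out so that nothing is
# contacted or deleted by accident):
#   kubectl get nodes | not_ready_nodes
```

The printed names can then be reviewed and passed one at a time to `kubectl delete node`, as in the transcript above.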

Regenerate the token and recompute the CA key hash on a master node

  1. Regenerate the token

    [root@master1-141 working]# kubeadm token create

     ce2dbj.qx3k5py7auj0hcit
    

    [root@master1-141 working]# kubeadm token list

     TOKEN                     TTL         EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
     
     ce2dbj.qx3k5py7auj0hcit   23h         2021-11-14T06:17:23Z   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token
     ewodnv.g64hsqgs49pqkw0b   23h         2021-11-14T06:18:51Z   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token
    
  2. Recompute the CA key hash

    [root@master1-141 working]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

    This is the sample command from the official documentation above, copied as-is.
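The openssl pipeline prints only the bare hex digest, while `--discovery-token-ca-cert-hash` expects the value in `sha256:<hex>` form. A minimal sketch that wraps the documented pipeline and adds the prefix; `ca_cert_hash` is a hypothetical helper name, and the sketch assumes an RSA CA key, as the documented pipeline does:

```shell
# Hypothetical helper: print the CA public key hash as sha256:<hex>.
# Usage: ca_cert_hash [path-to-ca.crt]   (defaults to the kubeadm CA path)
# Assumes an RSA CA key, like the pipeline in the official documentation.
ca_cert_hash() (
  cert=${1:-/etc/kubernetes/pki/ca.crt}

  # Fail early: with a missing or unparsable certificate the pipeline below
  # would otherwise hash empty input and still print a digest.
  if ! openssl x509 -noout -in "$cert" >/dev/null 2>&1; then
    echo "cannot read certificate: $cert" >&2
    exit 1
  fi

  hex=$(openssl x509 -pubkey -noout -in "$cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //')

  printf 'sha256:%s\n' "$hex"
)
```

Run on a control-plane node, `ca_cert_hash` with no argument reads `/etc/kubernetes/pki/ca.crt`, and its output can be pasted directly after `--discovery-token-ca-cert-hash`.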

Rebuild the join command

  1. This is the earlier worker node join command, recovered from the log

     kubeadm join 192.168.0.149:6444 --token abcdef.0123456789abcdef \
     	--discovery-token-ca-cert-hash sha256:0e3deff56a511ffa3fcb66ba0ecd378438ca6332aaa7fb187808118a5640f6f0 
    

    After I replaced the token and CA key hash in it and ran kubeadm reset on the worker node, the join kept failing!

     [root@node3-143 ~]# kubeadm join --discovery-token ewodnv.g64hsqgs49pqkw0b --discovery-token-ca-cert-hash 0b905eab6e0725c7716b7192320c2da5e08bc1b53ec8f95595e863dfe2db1eb5 192.168.0.149:6444
     [preflight] Running pre-flight checks
     error execution phase preflight: couldn't validate the identity of the API Server: invalid discovery token CA certificate hash: invalid hash, expected "format:hex-value". Known format(s) are: sha256
     To see the stack trace of this error execute with --v=5 or higher
    

    Adding --v=5 shows the following error

     ...
     /usr/local/go/src/runtime/asm_amd64.s:1371
     cluster CA found in cluster-info ConfigMap is invalid
    

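    Regenerating the token and CA key hash again still did not help, so I simply rebooted!

    After the reboot I tried again and it worked!
    The exact cause is unclear! It may be related to my earlier mistake of using the master node join command on these machines!

  2. On the other worker node I simply ran reset and then rebooted, and the join succeeded right away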

     [root@node4-144 ~]# kubeadm join 192.168.0.149:6444 --token ce2dbj.qx3k5py7auj0hcit  \
     > --discovery-token-ca-cert-hash sha256:0b905eab6e0725c7716b7192320c2da5e08bc1b53ec8f95595e863dfe2db1eb5
     > 
     [preflight] Running pre-flight checks
     [preflight] Reading configuration from the cluster...
     [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
     [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
     [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
     [kubelet-start] Starting the kubelet
     [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
     
     This node has joined the cluster:
     * Certificate signing request was sent to apiserver and a response was received.
     * The Kubelet was informed of the new secure connection details.
     
     Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
    
  3. Back on a master node, confirm

    [root@master1-141 working]# kubectl get nodes

     NAME          STATUS   ROLES                  AGE   VERSION
     master0-140   Ready    control-plane,master   47h   v1.22.2
     master1-141   Ready    control-plane,master   47h   v1.22.2
     master2-142   Ready    control-plane,master   47h   v1.22.2
     node3-143     Ready    <none>                 24m   v1.22.2
     node4-144     Ready    <none>                 22m   v1.22.2
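One detail from step 1 is worth noting: the failing command passed the hash without the `sha256:` prefix, which is what the `expected "format:hex-value"` message complains about, while the successful command in step 2 includes the prefix. A minimal sketch of a check that could be run before `kubeadm join`; `normalize_ca_hash` is a hypothetical helper name, not a kubeadm feature:

```shell
# Hypothetical helper: accept a bare 64-character hex digest or a value that
# already has the sha256: prefix, and print it in sha256:<hex> form.
# Anything else is rejected with a non-zero exit status.
normalize_ca_hash() (
  value=${1#sha256:}   # strip the prefix if it is present

  if printf '%s\n' "$value" | grep -Eq '^[0-9a-fA-F]{64}$'; then
    printf 'sha256:%s\n' "$value"
  else
    echo "not a SHA-256 hex digest: $1" >&2
    exit 1
  fi
)
```

This only checks the format. It cannot tell whether the digest matches the cluster CA, which is what the `cluster CA found in cluster-info ConfigMap is invalid` message is about.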
    