Kubernetes (k8s): kubelet ends up in an abnormal state after installing and setting up a cluster as described in the official documentation

Environment: CentOS 7.9 (2009), amd64

While setting up the master node, I installed docker, kubelet, kubectl, and kubeadm with the following commands:

yum install -y docker
yum install -y kubeadm-1.23.5
yum install -y kubelet-1.23.5
yum install -y kubectl-1.23.5

Problems encountered:

  • Nothing is listening on port 6443
  • Querying the kubelet service status shows that the master node cannot be found
  • kubeadm init fails
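
The symptoms above can be checked from a shell on the master node. The following is only a rough sketch: the helper names `port_is_listening` and `diagnose` are made up for illustration, and it assumes `ss`, `systemctl`, and `journalctl` are available, as they normally are on CentOS 7.

```shell
#!/usr/bin/env bash

# Succeeds if the `ss -lnt` output read from stdin contains a listener
# on the given port. (Hypothetical helper, not part of any tool.)
port_is_listening() {
  local port="$1"
  # Column 4 of `ss -lnt` is "Local Address:Port", e.g. "*:6443" or "[::]:6443";
  # the port is whatever follows the last colon. NR > 1 skips the header line.
  awk -v p="$port" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END    { exit (found ? 0 : 1) }
  '
}

# Run the checks by hand on the master node.
diagnose() {
  if ss -lnt | port_is_listening 6443; then
    echo "6443: listening"
  else
    echo "6443: not listening"
  fi
  # Service state as seen by systemd.
  systemctl status kubelet --no-pager
  # Last 50 kubelet log lines, where the "node not found" errors show up.
  journalctl -u kubelet --no-pager -n 50
}
```

Calling `diagnose` after a failed `kubeadm init` should make it visible whether the API server port was ever opened and what kubelet is complaining about.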

The kubelet service itself is reported as active (running), but its log shows:

"Error getting node" err="node \"master\" not found"
"Container not found in pod's containers" containerID="b62a3a547c8ec57a7d3ead092eb2327c7cdf048646dbf66a553e3897f0d7305b"
"Failed to get status for pod" podUID=5194f92899c09489015cb06df2025e5a pod="kube-system/etcd-master" err="Get \"https://10.8.126.46:6443/api/v1/namespaces/kube-system/pods/etcd-master\": dial tcp 10.8.126.46:6443: connect: connection refused"
"Error getting node" err="node \"master\" not found"
"Container not found in pod's containers" containerID="7af1254c11eae47db66a215f886c885deb2154ce1dda831eca17849e91320f4f"
"Failed to get status for pod" podUID=c7705c73b73c93db911388082f346570 pod="kube-system/kube-scheduler-master" err="Get \"https://10.8.126.46:6443/api/v1/namespaces/kube-system/pods/kube-scheduler-master\": dial tcp 10.8.126.46:6443: connect: connection refused"
"Error getting node" err="node \"master\" not found"
"Error getting node" err="node \"master\" not found"

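Cause analysis:

The Docker version is incompatible. Installing Docker directly with yum install -y docker gave me version 1.13.1 by default. I suspected that the k8s version was too new and the Docker version too old, so that some features were unsupported and the two were incompatible. After consulting related material and verifying it myself, this turned out to be the problem.

Reference: a related question on Stack Overflow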
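
To see quickly which Docker version yum actually installed and how it compares with the one used in the fix, something like the following could be used. It is a sketch only: `version_lt` and `check_docker_version` are hypothetical helpers, and the threshold 20.10.14 is simply the version installed later in this post, not an official minimum for Kubernetes 1.23.5.

```shell
#!/usr/bin/env bash

# Succeeds if version $1 is strictly older than version $2.
# Relies on GNU `sort -V` (version sort), which ships with CentOS 7.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Compare the running Docker daemon's version with a wanted version.
check_docker_version() {
  # Assumption: 20.10.14 is the version from this post, not a documented minimum.
  local wanted="${1:-20.10.14}"
  local current
  current="$(docker version --format '{{.Server.Version}}')" || return 1
  if version_lt "$current" "$wanted"; then
    echo "docker $current is older than $wanted"
  else
    echo "docker $current is at least $wanted"
  fi
}
```

With the default yum package, `check_docker_version` would be expected to report 1.13.1 as older than 20.10.14.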


Solution:

  • Uninstall the existing docker
systemctl stop docker
yum remove docker docker-common docker-client cockpit-docker

  • Install the newer docker-ce
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum install -y docker-ce-20.10.14-3.el7
systemctl enable docker
systemctl start docker
  • Reset kubeadm and run init again
kubeadm reset
kubeadm init \
  --apiserver-advertise-address=10.8.126.46 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.5 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
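  • Further notes
    After the problem above is fixed, the kubelet service may still fail to start. In my case journalctl -xeu kubelet showed that the cgroup drivers were inconsistent; adjusting them as described in the official documentation resolves it:
    kubelet cgroup configuration
    CRI cgroup configuration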
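
A cgroup driver mismatch can be confirmed by comparing what Docker reports with what kubelet is configured to use. The sketch below uses two hypothetical helpers, `kubelet_cgroup_driver` and `compare_cgroup_drivers`, and assumes the kubelet configuration lives at /var/lib/kubelet/config.yaml, the path kubeadm writes by default.

```shell
#!/usr/bin/env bash

# Print the cgroupDriver value from a KubeletConfiguration YAML file.
kubelet_cgroup_driver() {
  awk '$1 == "cgroupDriver:" { print $2 }' "$1"
}

# Print both drivers and succeed only if they match.
compare_cgroup_drivers() {
  # Assumption: kubeadm's default kubelet config path.
  local kubelet_conf="${1:-/var/lib/kubelet/config.yaml}"
  local docker_driver kubelet_driver
  docker_driver="$(docker info --format '{{.CgroupDriver}}')"
  kubelet_driver="$(kubelet_cgroup_driver "$kubelet_conf")"
  echo "docker=$docker_driver kubelet=$kubelet_driver"
  [ "$docker_driver" = "$kubelet_driver" ]
}
```

If Docker reports cgroupfs while kubelet expects systemd, the official documentation describes setting "exec-opts": ["native.cgroupdriver=systemd"] in /etc/docker/daemon.json and then restarting docker. Whether that applies here depends on what `compare_cgroup_drivers` prints.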