Common K8s Deployment Problems
I. Common diagnostic techniques
The message we most often see during deployment is:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
These messages alone don't pinpoint the problem; the following techniques help dig further:
- Check the service status:
systemctl status kubelet
- View the kubelet logs:
journalctl -xeu kubelet
- Add --v=5 when running join or init
- Check the system log /var/log/messages, for example:
tail /var/log/messages
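The checks above can be bundled into a small helper that captures everything into one file for later inspection (a convenience sketch; the function name and output layout are ours, and commands that are unavailable on a given distro are simply skipped):

```shell
#!/bin/sh
# collect_kubelet_debug OUTFILE: gather the usual kubelet diagnostics into one
# file. Each command's output is written under a "===== cmd =====" banner;
# a command that fails or does not exist on this distro is skipped silently.
collect_kubelet_debug() {
  out="$1"
  : > "$out"
  for cmd in \
      "systemctl status kubelet" \
      "journalctl -xeu kubelet --no-pager" \
      "curl -sS --max-time 2 http://localhost:10248/healthz" \
      "tail -n 50 /var/log/messages"; do
    echo "===== $cmd =====" >> "$out"
    sh -c "$cmd" >> "$out" 2>&1 || true
  done
}
```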
II. Errors and fixes
1. Token expired
Symptom:
The log shows a token-expiry message, for example:
The cluster-info ConfigMap does not yet contain a JWS signature for token ID "42mf2r", will try again
Solution:
- On the master, list the existing tokens:
kubeadm token list
- If nothing is returned, the token has expired and a new one must be created:
kubeadm token create
- Look up the discovery-token-ca-cert-hash:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
- Substitute the new values into the kubeadm join command:
kubeadm join 192.168.0.10:6443 --token vpm4o3.p4bn3co35bplw77m \
    --discovery-token-ca-cert-hash sha256:23d0fe16f3e825ef81d9682d3dc3a706fdad1712d4213ad4243a7d6d7abd5c36
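Note that `kubeadm token create --print-join-command` prints the complete join line (token and hash included) in one step. If you prefer to compute the hash yourself, the openssl pipeline above can be wrapped in a reusable function (a sketch; the function name is ours):

```shell
#!/bin/sh
# ca_cert_hash CA_CRT: compute the value for --discovery-token-ca-cert-hash
# from a CA certificate, e.g. /etc/kubernetes/pki/ca.crt on the master.
# Steps: extract the public key, convert it to DER, SHA-256 it, keep the hex.
ca_cert_hash() {
  openssl x509 -pubkey -in "$1" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}
# Usage on the master:
#   ca_cert_hash /etc/kubernetes/pki/ca.crt
```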
2. Clock out of sync
Symptom:
Error: Failed to request cluster-info, will try again: certificate has expired or is not yet valid
Solution:
ntpdate ntp.aliyun.com
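To confirm that clock skew, rather than a genuinely bad certificate, is the cause, compare the certificate's validity window against the node's clock (a sketch; `cert_window` is our helper name):

```shell
#!/bin/sh
# cert_window CERT: print a certificate's validity window (notBefore/notAfter).
cert_window() {
  openssl x509 -noout -dates -in "$1"
}
# Usage on a kubeadm node:
#   cert_window /etc/kubernetes/pki/ca.crt
#   date -u   # if "now" falls outside notBefore..notAfter, the node clock is
#             # wrong: sync it with ntpdate instead of regenerating certs
```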
3. Docker Cgroup Driver is not systemd
Symptom:
server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different …
Solution:
- Edit the Docker daemon config file /etc/docker/daemon.json and add:
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
Restart the Docker service:
sudo systemctl daemon-reload
sudo systemctl restart docker
Then verify Docker's cgroup driver:
docker info | grep "Cgroup Driver"
Cgroup Driver: systemd   # now reports systemd
Restart the kubelet:
systemctl restart kubelet
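A stray comma or curly quote in daemon.json will stop Docker from starting at all, so it is worth validating the file before restarting (a sketch assuming python3 is available; `validate_daemon_json` is our helper name):

```shell
#!/bin/sh
# validate_daemon_json FILE: fail if FILE is not valid JSON. A syntax error
# here is a common reason `systemctl restart docker` fails after this edit.
validate_daemon_json() {
  python3 -m json.tool "$1" > /dev/null
}
# Usage before restarting Docker:
#   validate_daemon_json /etc/docker/daemon.json && sudo systemctl restart docker
```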
4. Failed to create cgroup (unverified)
Symptom:
"Failed to create cgroup" err="Cannot set property TasksAccountin
Solution:
yum update systemd
5. kubeadm reset run on a worker node by mistake
Symptom:
kubeadm reset was accidentally executed on a worker node
Solution:
- Delete all files under /etc/kubernetes/
- Run kubeadm reset
- Re-run kubeadm join
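The recovery can be scripted; the helper below only assembles the join command, since the real values should come from `kubeadm token list` (or a fresh `kubeadm token create`) on the master (a sketch; `build_join_cmd` is our name, and the address, token, and hash are placeholders):

```shell
#!/bin/sh
# build_join_cmd APISERVER TOKEN HASH: assemble the rejoin command so the
# cleanup-and-rejoin sequence can be scripted on the worker node.
build_join_cmd() {
  echo "kubeadm join $1 --token $2 --discovery-token-ca-cert-hash sha256:$3"
}
# Typical recovery sequence on the worker (placeholders, do not copy verbatim):
#   sudo rm -rf /etc/kubernetes/*
#   sudo kubeadm reset -f
#   sudo $(build_join_cmd 192.168.0.10:6443 <token> <hash>)
```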