Analyzing a kubernetes kubelet outage
Environment
How the kubernetes components run:
kubelet: runs under systemd
everything else runs as containers started by docker
Problem
1. A pod is stuck in the Unknown state
[root@master-64 ~]# kubectl get pods adminapi-www-idc-1846448753-k5gbm -n yuntu-www-idc -owide
NAME                                READY   STATUS    RESTARTS   AGE   IP               NODE
adminapi-www-idc-1846448753-k5gbm   1/1     Unknown   0          14d   192.168.217.25   slave-203
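A pod goes Unknown when the node controller can no longer get status from the kubelet on that node, so the first thing to check is the node itself (slave-203, taken from the output above):
[root@master-64 ~]# kubectl get node slave-203
[root@master-64 ~]# kubectl describe node slave-203 | grep -A5 Conditions
If the kubelet really is down, the node will show NotReady, and the pod status cannot change until the node recovers.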
2. The docker daemon on the node is dead
[root@slave-203 ~]# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2018-01-30 18:27:44 CST; 2h 54min ago
Docs: https://docs.docker.com
Process: 754 ExecStart=/usr/bin/dockerd (code=exited, status=1/FAILURE)
Main PID: 754 (code=exited, status=1/FAILURE)
Jan 30 18:27:41 slave-203 systemd[1]: Starting Docker Application Container Engine...
Jan 30 18:27:43 slave-203 dockerd[754]: time="2018-01-30T18:27:43.633366759+08:00" level=info msg="libcontainerd: new containerd process, pid: 1895"
Jan 30 18:27:44 slave-203 dockerd[754]: time="2018-01-30T18:27:44.651806419+08:00" level=error msg="[graphdriver] prior storage driver \"devicemap...thinpool"
Jan 30 18:27:44 slave-203 dockerd[754]: time="2018-01-30T18:27:44.652140761+08:00" level=fatal msg="Error starting daemon: error initializing grap...thinpool"
Jan 30 18:27:44 slave-203 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Jan 30 18:27:44 slave-203 systemd[1]: Failed to start Docker Application Container Engine.
Jan 30 18:27:44 slave-203 systemd[1]: Unit docker.service entered failed state.
Jan 30 18:27:44 slave-203 systemd[1]: docker.service failed.
[root@slave-203 ~]# journalctl -f -u kubelet
-- Logs begin at Tue 2018-01-30 18:27:28 CST. --
Jan 30 18:27:44 slave-203 systemd[1]: Dependency failed for kubernetes Kubelet.
Jan 30 18:27:44 slave-203 systemd[1]: Job kubelet.service/start failed with result 'dependency'.
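The "Dependency failed" / result 'dependency' messages mean kubelet.service declares a hard dependency on docker. The exact unit file on this node was not captured, but a typical setup (an assumption here; verify with systemctl cat kubelet) contains something like:
[Unit]
Requires=docker.service
After=docker.service
With Requires=, systemd refuses to even start kubelet once docker.service has failed, so the dead kubelet is a symptom: the real failure is docker.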
System log (docker)
....
Jan 30 08:06:15 slave-203 kubelet: E0130 08:06:15.196289 5516 docker_sandbox.go:492] Failed to retrieve checkpoint for sandbox "b85076ca2595020a4caa26993548d02ec68300f396a4c0096a0bf4650b1d3d74": checkpoint is not found.
....
Jan 30 18:27:43 slave-203 dockerd: time="2018-01-30T18:27:43.633366759+08:00" level=info msg="libcontainerd: new containerd process, pid: 1895"
Jan 30 18:27:44 slave-203 dockerd: time="2018-01-30T18:27:44.651806419+08:00" level=error msg="[graphdriver] prior storage driver \"devicemapper\" failed: devicemapper: Non existing device docker-thinpool"
Jan 30 18:27:44 slave-203 dockerd: time="2018-01-30T18:27:44.652140761+08:00" level=fatal msg="Error starting daemon: error initializing graphdriver: devicemapper: Non existing device docker-thinpool"
Jan 30 18:27:44 slave-203 systemd: docker.service: main process exited, code=exited, status=1/FAILURE
Jan 30 18:27:44 slave-203 systemd: Unit docker.service entered failed state.
Jan 30 18:27:44 slave-203 systemd: docker.service failed.
Jan 30 18:27:51 slave-203 lvm: 1 logical volume(s) in volume group "docker" now active
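The timeline in these lines is the key evidence: dockerd tried to initialize its devicemapper graphdriver at 18:27:44 and found no docker-thinpool device, but LVM only activated the logical volume in the "docker" volume group at 18:27:51, seven seconds later. In devicemapper naming, the device docker-thinpool corresponds to LV thinpool in VG docker (the LV name is inferred from the device name in the error and may differ on other hosts); its state can be checked with:
[root@slave-203 ~]# lvs docker
[root@slave-203 ~]# lvdisplay docker/thinpool
This boot-time race also explains why simply restarting docker later succeeds: by then the thinpool is already active.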
System log (kubelet)
...
Jan 30 20:08:56 slave-203 kubelet: E0130 20:08:56.282727 5516 kubelet_network.go:412] Failed to ensure marking rule for KUBE-MARK-MASQ: error checking rule: exit status 4: iptables: Resource temporarily unavailable.
...
Jan 30 20:17:57 slave-203 kubelet: I0130 20:17:57.328764 5516 qos_container_manager_linux.go:286] [ContainerManager]: Updated QoS cgroup configuration
Jan 30 20:18:06 slave-203 kubelet: I0130 20:18:06.569694 5516 server.go:794] GET /metrics: (4.914929ms) 200 [[Prometheus/2.0.0] 10.39.1.62:58616]
Jan 30 18:27:44 slave-203 systemd: Job kubelet.service/start failed with result 'dependency'.
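A side note on the KUBE-MARK-MASQ error above: iptables exit status 4 with "Resource temporarily unavailable" usually means the xtables lock was held by another iptables invocation (kube-proxy and kubelet both rewrite rules concurrently). It can be checked manually with the -w flag, which waits for the lock instead of failing:
[root@slave-203 ~]# iptables -w -t nat -L KUBE-MARK-MASQ -n
This is unrelated to the docker crash but shows up in the same log window.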
The root cause of this problem has not been fully confirmed yet; the emergency fix is to run the following command:
systemctl restart docker
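Since the failure looks like a boot-time race between docker and LVM activation, a more durable fix is to make docker.service wait for LVM. The drop-in below is a sketch and has not been verified on this cluster; lvm2-monitor.service is the unit name CentOS 7 ships, check your distribution:
[root@slave-203 ~]# mkdir -p /etc/systemd/system/docker.service.d
[root@slave-203 ~]# cat > /etc/systemd/system/docker.service.d/wait-for-lvm.conf <<'EOF'
[Unit]
After=lvm2-monitor.service local-fs.target
Wants=lvm2-monitor.service
EOF
[root@slave-203 ~]# systemctl daemon-reload
[root@slave-203 ~]# systemctl restart docker
Note that because the kubelet start job already failed with result 'dependency', it may also need an explicit systemctl start kubelet once docker is back.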
END