Environment

How the Kubernetes components run

kubelet: run by systemd
All other components: run as Docker containers

Problem description

1. A pod is stuck in the Unknown state

[root@master-64 ~]# kubectl get pods adminapi-www-idc-1846448753-k5gbm -n yuntu-www-idc -owide
NAME                                READY     STATUS    RESTARTS   AGE       IP               NODE
adminapi-www-idc-1846448753-k5gbm   1/1       Unknown   0          14d       192.168.217.25   slave-203
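
To see whether other pods are affected as well, the same listing can be filtered on the STATUS column. The following is a minimal sketch, not part of the original troubleshooting session; the helper name `filter_unknown` is hypothetical, and it assumes the usual `kubectl get pods` table layout with a header row that contains a `STATUS` column.

```shell
# Hypothetical helper: keep the header row plus every row whose STATUS
# column equals "Unknown". The column position is read from the header,
# so it works with or without a leading NAMESPACE column.
filter_unknown() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "STATUS") col = i
            print
            next
        }
        col && $col == "Unknown"
    '
}

# Example usage (run on a machine with cluster access):
#   kubectl get pods --all-namespaces -o wide | filter_unknown
```

Reading the column index from the header avoids matching the word "Unknown" if it happens to appear in a pod name.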

2. The docker daemon is dead

[root@slave-203 ~]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2018-01-30 18:27:44 CST; 2h 54min ago
     Docs: https://docs.docker.com
  Process: 754 ExecStart=/usr/bin/dockerd (code=exited, status=1/FAILURE)
 Main PID: 754 (code=exited, status=1/FAILURE)

Jan 30 18:27:41 slave-203 systemd[1]: Starting Docker Application Container Engine...
Jan 30 18:27:43 slave-203 dockerd[754]: time="2018-01-30T18:27:43.633366759+08:00" level=info msg="libcontainerd: new containerd process, pid: 1895"
Jan 30 18:27:44 slave-203 dockerd[754]: time="2018-01-30T18:27:44.651806419+08:00" level=error msg="[graphdriver] prior storage driver \"devicemap...thinpool"
Jan 30 18:27:44 slave-203 dockerd[754]: time="2018-01-30T18:27:44.652140761+08:00" level=fatal msg="Error starting daemon: error initializing grap...thinpool"
Jan 30 18:27:44 slave-203 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Jan 30 18:27:44 slave-203 systemd[1]: Failed to start Docker Application Container Engine.
Jan 30 18:27:44 slave-203 systemd[1]: Unit docker.service entered failed state.
Jan 30 18:27:44 slave-203 systemd[1]: docker.service failed.
[root@slave-203 ~]# journalctl -f -u kubelet
-- Logs begin at Tue 2018-01-30 18:27:28 CST. --
Jan 30 18:27:44 slave-203 systemd[1]: Dependency failed for kubernetes Kubelet.
Jan 30 18:27:44 slave-203 systemd[1]: Job kubelet.service/start failed with result 'dependency'.

System log: docker

....
Jan 30 08:06:15 slave-203 kubelet: ERROR:0130 08:06:15.196289    5516 docker_sandbox.go:492] Failed to retrieve checkpoint for sandbox "b85076ca2595020a4caa26993548d02ec68300f396a4c0096a0bf4650b1d3d74": checkpoint is not found.
....
Jan 30 18:27:43 slave-203 dockerd: time="2018-01-30T18:27:43.633366759+08:00" level=info msg="libcontainerd: new containerd process, pid: 1895"
Jan 30 18:27:44 slave-203 dockerd: time="2018-01-30T18:27:44.651806419+08:00" level=error msg="[graphdriver] prior storage driver \"devicemapper\" failed: devicemapper: Non existing device docker-thinpool"
Jan 30 18:27:44 slave-203 dockerd: time="2018-01-30T18:27:44.652140761+08:00" level=fatal msg="Error starting daemon: error initializing graphdriver: devicemapper: Non existing device docker-thinpool"
Jan 30 18:27:44 slave-203 systemd: docker.service: main process exited, code=exited, status=1/FAILURE
Jan 30 18:27:44 slave-203 systemd: Unit docker.service entered failed state.
Jan 30 18:27:44 slave-203 systemd: docker.service failed.
Jan 30 18:27:51 slave-203 lvm: 1 logical volume(s) in volume group "docker" now active

System log: kubelet

...
Jan 30 20:08:56 slave-203 kubelet: ERROR:0130 20:08:56.282727    5516 kubelet_network.go:412] Failed to ensure marking rule for KUBE-MARK-MASQ: error checking rule: exit status 4: iptables: Resource temporarily unavailable.
...
Jan 30 20:17:57 slave-203 kubelet: INFO:0130 20:17:57.328764    5516 qos_container_manager_linux.go:286] [ContainerManager]: Updated QoS cgroup configuration
Jan 30 20:18:06 slave-203 kubelet: INFO:0130 20:18:06.569694    5516 server.go:794] GET /metrics: (4.914929ms) 200 [[Prometheus/2.0.0] 10.39.1.62:58616]
Jan 30 18:27:44 slave-203 systemd: Job kubelet.service/start failed with result 'dependency'.

The root cause of this problem is still unknown. As an emergency workaround,
run the following command on the affected node:

 systemctl restart docker
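
The logs above show dockerd failing at 18:27:44 with "Non existing device docker-thinpool", while lvm only reports the logical volume in volume group "docker" as active at 18:27:51. One possible reading, which the source does not confirm, is that docker started before the thin pool was activated. Under that assumption, a slightly more defensive version of the workaround could check the pool first and also restart kubelet, whose start job failed with result 'dependency'. This is only a sketch: the logical volume name `thinpool` is an assumption derived from the device name `docker-thinpool`, and the function names are hypothetical.

```shell
# Assumed names: volume group "docker" (from the lvm log line) and
# logical volume "thinpool" (guessed from the device name docker-thinpool).
VG="${VG:-docker}"
LV="${LV:-thinpool}"

# Return 0 if an lv_attr string marks the LV as active.
# The 5th character of lv_attr is the state field; "a" means active.
lv_attr_is_active() {
    case "$1" in
        ????a*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical recovery routine; run as root on the affected node.
recover_node() {
    attr=$(lvs --noheadings -o lv_attr "$VG/$LV" 2>/dev/null | tr -d ' ')
    if [ -z "$attr" ]; then
        echo "logical volume $VG/$LV not found" >&2
        return 1
    fi
    if ! lv_attr_is_active "$attr"; then
        # Activate the thin pool before starting docker.
        lvchange -ay "$VG/$LV" || return 1
    fi
    systemctl restart docker || return 1
    # kubelet did not start because its docker dependency failed.
    systemctl restart kubelet
}
```

Calling `recover_node` performs the restart; sourcing the file only defines the functions and changes nothing on the node. If the pod still shows Unknown afterwards, check it again with the `kubectl get pods` command from the problem description.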

END
