Container fails to start
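Symptoms:
On April 5, after getting to work, I found that a load balancer for the container service was reporting one unhealthy backend node. The container cluster itself showed nothing abnormal, so I logged in to the affected node and found that only a few containers were running. Starting a container manually failed with: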
Error response from daemon: subnet sandbox join failed for "172.19.0.0/16": could not find an appropriate master "ov-000100-5873e" for "vx-000100-5873e"
Error: failed to start containers: acsrouting_routing_4
The logs showed:
"Peer add failed in the driver: subnet sandbox join failed for \"172.19.0.0/16\": could not find an appropriate master \"ov-000100-5873e\" for \"vx-000100-5873e\"\n"
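I asked Alibaba Cloud technical support, who said a kernel bug was the cause. Their reply:
This is an intermittent problem caused by a kernel bug, and it appears after the Docker daemon restarts. See issue: https://github.com/docker/docker/issues/25039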
The fix is: service docker stop; ip link del ov-000100-5873e; ip link del vx-000100-5873e; rm -rf /var/run/docker/netns /var/lib/docker/network/files/local-kv.db; service docker start
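The interface names in the fix come straight from the error message, so they will differ from node to node. As a rough sketch only, the following Python helper pulls the ov-/vx- names out of a daemon error and prints the matching cleanup sequence for review. The function name cleanup_commands and the regular expression are my own assumptions, not part of the support reply, and the script deliberately prints the commands rather than running them, because the rm -rf step is destructive.

```python
import re

# Assumption: stale interfaces are named "ov-<id>-<suffix>" and "vx-<id>-<suffix>",
# as in the error message quoted in this post.
IFACE_RE = re.compile(r"\b((?:ov|vx)-[0-9a-zA-Z]+-[0-9a-zA-Z]+)")


def cleanup_commands(error_message):
    """Build the cleanup command list for the interfaces named in the error.

    Hypothetical helper: it only returns strings and never executes anything.
    """
    names = []
    for name in IFACE_RE.findall(error_message):
        if name not in names:  # keep first-seen order, drop repeats
            names.append(name)
    if not names:
        raise ValueError("no ov-/vx- interface name found in the message")

    commands = ["service docker stop"]
    commands += ["ip link del " + name for name in names]
    commands.append(
        "rm -rf /var/run/docker/netns /var/lib/docker/network/files/local-kv.db"
    )
    commands.append("service docker start")
    return commands


if __name__ == "__main__":
    msg = (
        'Error response from daemon: subnet sandbox join failed for '
        '"172.19.0.0/16": could not find an appropriate master '
        '"ov-000100-5873e" for "vx-000100-5873e"'
    )
    for cmd in cleanup_commands(msg):
        print(cmd)
```

Review the printed commands before running them by hand as root on the affected node.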
After that, all the containers started normally. I then went through the Docker logs and found:
become an orphan, killing it
panic: sync: WaitGroup is reused before previous Wait has returned
goroutine 323102179 [running]:
panic(0x16f4be0, 0xc82272c1a0)
/usr/local/go/src/runtime/panic.go:481 +0x3e6
sync.(*WaitGroup).Wait(0xc8229a05c0)
/usr/local/go/src/sync/waitgroup.go:129 +0x114
github.com/docker/docker/daemon.(*Daemon).StateChanged(0xc820351ba0, 0xc82316d0c0, 0x40, 0x1d9dc60, 0xc, 0x0, 0xc82316d140, 0x40, 0x0, 0x0, ...)
/usr/src/docker/.gopath/src/github.com/docker/docker/daemon/monitor.go:68 +0xecc
github.com/docker/docker/libcontainerd.(*container).handleEvent.func1()
/usr/src/docker/.gopath/src/github.com/docker/docker/libcontainerd/container_linux.go:185 +0xab
github.com/docker/docker/libcontainerd.(*queue).append.func1(0xc820837c01, 0xc8202891a0, 0xc822ef12f0, 0xc8232d8cc0)
/usr/src/docker/.gopath/src/github.com/docker/docker/libcontainerd/queue_linux.go:26 +0x47
created by github.com/docker/docker/libcontainerd.(*queue).append
/usr/src/docker/.gopath/src/github.com/docker/docker/libcontainerd/queue_linux.go:28 +0x1da
SIOCDELRT: No such process
/var/run/docker.sock is up
time="2017-04-02T21:04:19.814311352+08:00" level=info msg="libcontainerd: new containerd process, pid: 21452"
ERRO[0000] containerd: notify OOM events error=open /proc/7779/cgroup: no such file or directory
time="2017-04-02T21:04:20.937754879+08:00" level=warning msg="libcontainerd: unknown container 3c54af257d92770298120925e5fc73499d3ece44b73cac03b85afdf04e7eee7e"
time="2017-04-02T21:04:22.364844895+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2017-04-02T21:04:22.364949983+08:00" level=info msg="Initializing discovery with TLS"
time="2017-04-02T21:04:22.366145231+08:00" level=warning msg="Your kernel does not support swap memory limit."
time="2017-04-02T21:04:22.366468724+08:00" level=warning msg="mountpoint for pids not found"
time="2017-04-02T21:04:22.381722127+08:00" level=info msg="Loading containers: start."
After some googling, this turned out to be a Docker bug. I upgraded Docker and will keep watching for a while to see whether it recurs.