I ran into quite a few pitfalls while deploying a docker cluster.

They are recorded here for future reference.


1.  Error initializing network controller: could not delete the default bridge network: network bridge has active endpoints

The log is as follows:

systemctl status -l docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─flannel.conf
   Active: failed (Result: exit-code) since Mon 2017-01-09 21:52:02 EST; 2 days ago
     Docs: http://docs.docker.com
  Process: 1787 ExecStart=/usr/bin/docker-current daemon --ip-masq=false --exec-opt native.cgroupdriver=systemd $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY (code=exited, status=1/FAILURE)
 Main PID: 1787 (code=exited, status=1/FAILURE)


Jan 09 21:52:02 node-155 docker-current[1787]: time="2017-01-09T21:52:02.301268677-05:00" level=warning msg="Failed deleting endpoint 86a59aa2e0a3ce3d141f53280e65c7957d7bfd4ef5e0fd8d34fbaff7dde67251: failed to get endpoint from store during Delete: could not find endpoint 86a59aa2e0a3ce3d141f53280e65c7957d7bfd4ef5e0fd8d34fbaff7dde67251: []\n"
Jan 09 21:52:02 node-155 docker-current[1787]: time="2017-01-09T21:52:02.434144658-05:00" level=error msg="getEndpointFromStore for eid c1023ed6621bfe9d0c11699644aab2bd76c32c69b631e0157591e316f646e4fc failed while trying to build sandbox for cleanup: could not find endpoint c1023ed6621bfe9d0c11699644aab2bd76c32c69b631e0157591e316f646e4fc: []"
Jan 09 21:52:02 node-155 docker-current[1787]: time="2017-01-09T21:52:02.434266374-05:00" level=warning msg="Failed deleting endpoint c1023ed6621bfe9d0c11699644aab2bd76c32c69b631e0157591e316f646e4fc: failed to get endpoint from store during Delete: could not find endpoint c1023ed6621bfe9d0c11699644aab2bd76c32c69b631e0157591e316f646e4fc: []\n"
Jan 09 21:52:02 node-155 docker-current[1787]: time="2017-01-09T21:52:02.779705110-05:00" level=error msg="getEndpointFromStore for eid 2eca91d53e28075254091ced086e7b7a5282c21c0522558dba4e5198cc278ff9 failed while trying to build sandbox for cleanup: could not find endpoint 2eca91d53e28075254091ced086e7b7a5282c21c0522558dba4e5198cc278ff9: []"
Jan 09 21:52:02 node-155 docker-current[1787]: time="2017-01-09T21:52:02.779827201-05:00" level=warning msg="Failed deleting endpoint 2eca91d53e28075254091ced086e7b7a5282c21c0522558dba4e5198cc278ff9: failed to get endpoint from store during Delete: could not find endpoint 2eca91d53e28075254091ced086e7b7a5282c21c0522558dba4e5198cc278ff9: []\n"
Jan 09 21:52:02 node-155 docker-current[1787]: time="2017-01-09T21:52:02.829759654-05:00" level=fatal msg="Error starting daemon: Error initializing network controller: could not delete the default bridge network: network bridge has active endpoints


workaround:

rm -rf  /var/lib/docker/network/*

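systemctl restart docker

After the restart the daemon came up successfully. Note that this clears the network state docker keeps on local disk, so any user-defined networks on the host have to be created again.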
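A slightly more cautious variant of the workaround above can be sketched as a shell function: it stops the daemon first and moves the network state aside instead of deleting it. The function names `run` and `reset_docker_network_state`, the backup path and the `DRY_RUN` switch are assumptions made for illustration; they are not part of the original fix.

```shell
#!/bin/sh
# Sketch: reset docker's local network state while keeping a backup.
# Assumptions (not from the original notes): function names, backup path, DRY_RUN switch.

run() {
    # Print the command; execute it unless DRY_RUN=1 is set.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

reset_docker_network_state() {
    net_dir="${1:-/var/lib/docker/network}"
    backup="${net_dir}.bak.$(date +%Y%m%d%H%M%S)"

    # Stop the daemon, move the old state aside, recreate an empty
    # directory, then start the daemon again. Each step runs only if
    # the previous one succeeded.
    run systemctl stop docker &&
    run mv "$net_dir" "$backup" &&
    run mkdir -p "$net_dir" &&
    run systemctl start docker
}
```

Calling `DRY_RUN=1 reset_docker_network_state` first prints the commands without running them, which makes it easy to review the plan before touching a production node.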


2. docker daemon fails to start: Unable to take ownership of thin-pool

Mar  8 22:59:02 node-152 docker-current: time="2017-03-08T22:59:02.229070332-05:00" level=fatal msg="Error starting daemon: error initializing graphdriver: devmapper: Unable to take ownership of thin-pool (docker-thinpool) that already has used data blocks"
Mar  8 22:59:02 node-152 systemd: docker.service: main process exited, code=exited, status=1/FAILURE
Mar  8 22:59:02 node-152 systemd: Failed to start Docker Application Container Engine.


Cause: the metadata under /var/lib/docker/devicemapper/metadata/ has been lost

workaround:

https://bugzilla.redhat.com/show_bug.cgi?id=1321640#c5

Eric Paris  2016-04-27 08:20:10 EDT
I feel like the kcs kinda misses telling users the actual problem. Nor does it really make it clear the solution.

IF you are using device mapper (instead of loopback) /var/lib/docker contains metadata informing docker about the contents of the device mapper storage area. If you delete /var/lib/docker that metadata is lost. Docker is then able to detect that the thin pool has data but docker is unable to make use of that information. The only solution is to delete the thin pool and recreate it so that both the thin pool and the metadata in /var/lib/docker will be empty.

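The key point: the only option is to rebuild docker's storage. This destroys all images and containers on the host, and the thin pool has to be recreated before docker is started again.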

rm -rf /var/lib/docker/*
lvremove /dev/docker/thinpool
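Because these two commands are destructive, one option is to wrap them so that they are only printed unless execution is requested explicitly. The following is a minimal sketch under that assumption; the function names `run` and `wipe_docker_storage` and the `DRY_RUN` switch are hypothetical, and recreating the thin pool is deliberately left out because it depends on how the pool was originally set up on the host.

```shell
#!/bin/sh
# Sketch: wipe docker's devicemapper storage so that the thin pool and
# the metadata in /var/lib/docker are both empty.
# Assumptions (not from the original notes): function names, DRY_RUN switch.
# DESTRUCTIVE: all images and containers on this host are lost.

run() {
    # Print each command; execute it only when DRY_RUN=0 is set explicitly.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

wipe_docker_storage() {
    docker_root="${1:-/var/lib/docker}"
    thinpool_lv="${2:-/dev/docker/thinpool}"

    # Stop the daemon before removing anything it may still be using.
    run systemctl stop docker &&
    run rm -rf "$docker_root"/* &&
    run lvremove "$thinpool_lv"
    # Recreating the thin pool and starting docker again is left to the
    # operator; it depends on the host's original storage setup.
}
```

By default `wipe_docker_storage` only prints the plan; `DRY_RUN=0 wipe_docker_storage` runs it.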


3.  docker run starts slowly

