Environment

  1. ZooKeeper image version: wurstmeister/zookeeper latest 3f43f72cb283 3 years ago 510MB

Problem

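Once, while the ZooKeeper container was starting up, I accidentally forced the machine to power off. This may have corrupted some of ZooKeeper's data, and the container failed to start afterwards.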

ZooKeeper startup error log

ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.13/bin/../conf/zoo.cfg
mkdir: cannot create directory '': No such file or directory
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.server.quorum.QuorumPeerConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Invalid config, exiting abnormally
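The `mkdir: cannot create directory ''` line is a useful hint: it suggests the start script ended up with an empty data directory path, which is what one would expect if zoo.cfg contained no dataDir entry. Below is a minimal sketch for checking this by hand. The helper name zk_datadir is made up for illustration, and the parsing is a simplification of my own, not the start script's actual logic.

```shell
# Print the dataDir value from a zoo.cfg-style file (prints nothing if absent).
# Assumption: the setting is written as "dataDir=<path>" on an uncommented line.
zk_datadir() {
    sed -n 's/^[[:space:]]*dataDir[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Usage (config path taken from the startup log above):
#   zk_datadir /opt/zookeeper-3.4.13/conf/zoo.cfg
```

If this prints nothing, the configuration file is either empty or missing its dataDir line.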

Solution

The ZooKeeper container is named kafka-docker_zookeeper_1.

1. Inspect the container's mounted directories

docker inspect kafka-docker_zookeeper_1
[
    {
        "Mounts": [
                    {
                        "Type": "volume",
                        "Name": "ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02",
                        "Source": "/var/lib/docker/volumes/ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02/_data",
                        "Destination": "/opt/zookeeper-3.4.13/conf",
                        "Driver": "local",
                        "Mode": "rw",
                        "RW": true,
                        "Propagation": ""
                    },
                    {
                        "Type": "volume",
                        "Name": "9690de948c110633badd453f11e9bca3fb3692e88f54ae40f9e726ad0c64332a",
                        "Source": "/var/lib/docker/volumes/9690de948c110633badd453f11e9bca3fb3692e88f54ae40f9e726ad0c64332a/_data",
                        "Destination": "/opt/zookeeper-3.4.13/data",
                        "Driver": "local",
                        "Mode": "rw",
                        "RW": true,
                        "Propagation": ""
                    }
                ],
    }
]
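The full inspect output is long, and only the Mounts section matters here. As a sketch, docker inspect's --format option (which takes a Go template) can print just the host path and container path of each mount. The helper name list_mounts is hypothetical.

```shell
# Print "host path -> container path" for every mount of a container.
# list_mounts is a made-up helper name; --format takes a Go template.
list_mounts() {
    docker inspect \
        --format '{{ range .Mounts }}{{ .Source }} -> {{ .Destination }}{{ println }}{{ end }}' \
        "$1"
}

# Usage:
#   list_mounts kafka-docker_zookeeper_1
```

The line whose container path ends in /conf is the volume that holds zoo.cfg.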

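2. Check the configuration files in the mounted directory

Change into the container's mounted directory:

# cd into the volume that backs /opt/zookeeper-3.4.13/conf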
cd /var/lib/docker/volumes/ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02/_data

The directory contains the following files:

[root@centos _data]# ll
total 4
-rw-r--r--. 1 501 ftp 535 Jun 30  2018 configuration.xsl
-rw-r--r--. 1 501 ftp   0 Dec 12 15:07 log4j.properties
-rw-r--r--. 1 501 ftp   0 Dec 12 15:07 zoo.cfg

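Of the three files, log4j.properties and zoo.cfg both have a size of 0, so both configuration files are empty. Viewing zoo.cfg confirms that it really has no content. My guess is that ZooKeeper fails to start precisely because of these empty configuration files, and that they need to be restored.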
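Rather than reading sizes off the listing, the zero-byte files can be picked out directly. A small sketch, where empty_files is a hypothetical helper name:

```shell
# List zero-byte regular files directly inside a directory.
# empty_files is a made-up helper name.
empty_files() {
    find "$1" -maxdepth 1 -type f -size 0
}

# Usage (run from the mounted conf directory):
#   empty_files .
```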

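3. Try to find an older, intact copy of the configuration files

Each time ZooKeeper started successfully in the past, a volume directory was created dynamically under the path below. So search that path for zoo.cfg files and use the timestamps to locate a zoo.cfg that was generated earlier.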

find /var/lib/docker/volumes/ -name zoo.cfg
[root@jiewli _data]# find /var/lib/docker/volumes/ -name zoo.cfg
/var/lib/docker/volumes/ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02/_data/zoo.cfg
/var/lib/docker/volumes/97b1950a9806f29eb4e4e81eb76ca2bd5d7700a72bc74bbb4324b13c446db701/_data/zoo.cfg
/var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data/zoo.cfg
/var/lib/docker/volumes/64162e6eb0c30bc890fe07ac3acd60f8331131a5d8b5e55467d965e5573ee626/_data/zoo.cfg

This finds 4 files. Pick one with an earlier date:

[root@jiewli _data]# cd /var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data
[root@jiewli _data]# ll
total 12
-rw-r--r--. 1 501 ftp  535 Jun 30  2018 configuration.xsl
-rw-r--r--. 1 501 ftp 2161 Jun 30  2018 log4j.properties
-rw-r--r--. 1 501 ftp  934 Jan 20  2019 zoo.cfg
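Checking each candidate directory by hand gets tedious when there are many volumes. As a sketch, the search can skip zero-byte files and print the remaining candidates with their sizes and dates, oldest first. The helper name nonempty_candidates is hypothetical.

```shell
# List non-empty files with a given name under a root directory, oldest first.
# nonempty_candidates is a made-up helper name.
nonempty_candidates() {
    # -size +0 skips zero-byte files; ls -ltr sorts by modification time, oldest first
    find "$1" -type f -name "$2" -size +0 -exec ls -ltr {} +
}

# Usage:
#   nonempty_candidates /var/lib/docker/volumes zoo.cfg
```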

View the contents of zoo.cfg:

cat /var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data/zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/opt/zookeeper-3.4.13/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

View the contents of log4j.properties:

cat /var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data/log4j.properties

# Define some default values that can be overridden by system properties
zookeeper.root.logger=INFO, CONSOLE
zookeeper.console.threshold=INFO
zookeeper.log.dir=.
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=.
zookeeper.tracelog.file=zookeeper_trace.log

#
# ZooKeeper Logging Configuration
#

# Format is "<default threshold> (, <appender>)+

# DEFAULT: console appender only
log4j.rootLogger=${zookeeper.root.logger}

# Example with rolling log file
#log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE

# Example with rolling log file and tracing
#log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE

#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold}
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n

#
# Add ROLLINGFILE to rootLogger to get log file output
#    Log DEBUG level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold}
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}

# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# uncomment the next line to limit number of backup files
#log4j.appender.ROLLINGFILE.MaxBackupIndex=10

log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n


#
# Add TRACEFILE to rootLogger to get log file output
#    Log DEBUG level and above messages to a log file
log4j.appender.TRACEFILE=org.apache.log4j.FileAppender
log4j.appender.TRACEFILE.Threshold=TRACE
log4j.appender.TRACEFILE.File=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file}

log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout
### Notice we are including log4j's NDC here (%x)
log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L][%x] - %m%n

View the contents of configuration.xsl:

cat /var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data/configuration.xsl

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html"/>
<xsl:template match="configuration">
<html>
<body>
<table border="1">
<tr>
 <td>name</td>
 <td>value</td>
 <td>description</td>
</tr>
<xsl:for-each select="property">
<tr>
  <td><a name="{name}"><xsl:value-of select="name"/></a></td>
  <td><xsl:value-of select="value"/></td>
  <td><xsl:value-of select="description"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

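4. Overwrite the suspect configuration files with the older, intact ones

Overwrite whichever of the three files is broken. I simply deleted all three and copied the old files into the current mounted directory. If you have no old copies, the full contents are reproduced above, so you can write them in by hand.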

# Remove the damaged files from the current volume
rm -f /var/lib/docker/volumes/ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02/_data/*

# Copy the intact files from the volume inspected in step 3
cp /var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data/* /var/lib/docker/volumes/ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02/_data/
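Deleting the files outright leaves nothing to go back to if the copy goes wrong. A more cautious variant, sketched below, first backs up the current directory and then copies the intact files over it. The helper name restore_conf and the backup naming scheme are assumptions of mine, not something the original setup provides.

```shell
# Back up the current conf directory, then copy the intact files over it.
# restore_conf is a made-up helper: $1 = intact source dir, $2 = target dir.
restore_conf() {
    rc_src=$1
    rc_dst=$2
    rc_backup="${rc_dst%/}.bak.$(date +%Y%m%d%H%M%S)"
    cp -pR "$rc_dst" "$rc_backup" || return 1   # keep the broken files for later inspection
    cp -p "$rc_src"/* "$rc_dst"/ || return 1    # -p preserves mode and timestamps
    echo "$rc_backup"                           # print where the backup went
}

# Usage (source = the intact volume from step 3, target = the current volume):
#   restore_conf /var/lib/docker/volumes/7e9b3a353466085395fd3a2a4bede7b974a0ae8101ee3d87f1e15c1c17d36a38/_data \
#                /var/lib/docker/volumes/ee425b5732918a15e1f3da9c9fb617d93b47cc0c67086c34005fa18e94f60e02/_data
```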

5. Restart the container

docker restart kafka-docker_zookeeper_1

The container now starts normally.
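To confirm this from the command line rather than by eye, one option is to ask Docker for the container's state and then look at the tail of its log. This is only a sketch: container_running is a hypothetical helper name, and a container that keeps crashing and restarting can briefly report itself as running, so the log check is still worth doing.

```shell
# Succeeds only if Docker reports the container as running.
# container_running is a made-up helper name.
container_running() {
    [ "$(docker inspect --format '{{ .State.Running }}' "$1" 2>/dev/null)" = "true" ]
}

# Usage:
#   container_running kafka-docker_zookeeper_1 && echo "running"
#   docker logs --tail 20 kafka-docker_zookeeper_1   # should no longer end in "Invalid config"
```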

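Going further: troubleshooting similar problems

Any startup failure that reports Invalid config, exiting abnormally can be investigated with the steps above. This time the culprit was zoo.cfg, but another configuration file could just as well be at fault. Either way, following these steps is a fairly efficient way to track the problem down.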
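The checks from the steps above can be rolled into one quick test to run before digging any deeper. The sketch below reports whether a zoo.cfg is missing, empty, or lacking expected keys. The helper name check_zoo_cfg is hypothetical, and the default key list (dataDir and clientPort, both set in the recovered file) is an assumption about what a minimal standalone config needs; pass other keys as extra arguments if your setup differs.

```shell
# Report problems with a zoo.cfg: missing file, empty file, or missing keys.
# check_zoo_cfg and its default key list are assumptions for illustration.
check_zoo_cfg() {
    cfg=$1
    shift
    [ "$#" -gt 0 ] || set -- dataDir clientPort
    [ -f "$cfg" ] || { echo "missing: $cfg"; return 1; }
    [ -s "$cfg" ] || { echo "empty: $cfg"; return 1; }
    rc=0
    for key in "$@"; do
        # Only uncommented "key=value" lines count
        grep -q "^[[:space:]]*$key[[:space:]]*=" "$cfg" || { echo "no $key in $cfg"; rc=1; }
    done
    return "$rc"
}

# Usage:
#   check_zoo_cfg /var/lib/docker/volumes/<volume id>/_data/zoo.cfg
#   check_zoo_cfg ./zoo.cfg tickTime dataDir clientPort
```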
