Setting up etcd on CentOS 7: troubleshooting a startup error (k8s study notes)

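While preparing the environment for a k8s cluster today, I set up the etcd service, but after starting it the unit stayed stuck in the activating (start) state. Running systemctl status etcd.service -l showed the following details: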

1. Error message

[root@hdss7-21 cfg]# systemctl status etcd.service -l
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; disabled; vendor preset: disabled)
   Active: activating (start) since 五 2020-12-25 15:53:26 CST; 49s ago
 Main PID: 4150 (etcd)
   Memory: 17.6M
   CGroup: /system.slice/etcd.service
           └─4150 /opt/etcd/bin/etcd --name=etcd-2 --data-dir=/var/lib/etcd/default.etcd --listen-peer-urls=https://192.168.6.21:2380 --listen-client-urls=https://192.168.6.21:2379,http://127.0.0.1:2379 --advertise-client-urls=https://192.168.6.21:2379,http://127.0.0.1:2379 --initial-advertise-peer-urls=https://192.168.6.21:2380 --initial-cluster=etcd-1=https://192.168.6.12:2380,etcd-2=https://192.168.6.21:2380,etcd-3=https://192.168.6.22:2380 --initial-cluster-token=etcd-cluster --initial-cluster-state=new --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --peer-cert-file=/opt/etcd/ssl/server.pem --peer-key-file=/opt/etcd/ssl/server-key.pem --trusted-ca-file=/opt/etcd/ssl/ca.pem --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem

12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.22:49034" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55488" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55492" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.22:49040" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.22:49038" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55500" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55498" (error "remote error: tls: bad certificate", ServerName "")
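
To see at a glance which peers are failing the TLS handshake, the journal lines above can be reduced to a count per remote address. This is a minimal sketch rather than a step from the original troubleshooting; the function name summarize_rejects is hypothetical, and it assumes the log format shown above. The "remote error" prefix typically means the other side sent the TLS alert, i.e. the connecting peer did not accept the certificate this node presented.

```shell
# Count rejected TLS connections per remote IP from etcd log lines on stdin.
# Assumes lines of the form: rejected connection from "IP:PORT" (error ...)
summarize_rejects() {
    sed -n 's/.*rejected connection from "\([0-9.]*\):[0-9]*".*/\1/p' |
        sort | uniq -c | awk '{print $2, $1}'
}
```

A possible invocation is journalctl -u etcd --no-pager | summarize_rejects. For the seven log lines shown above it would report 192.168.6.12 with 4 rejections and 192.168.6.22 with 3, so both of the other members are affected.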

2. My etcd service configuration

Directory layout

[root@hdss7-21 opt]# tree etcd/
etcd/
├── bin
│   ├── etcd
│   └── etcdctl
├── cfg
│   └── etcd.conf
├── etcd.service
├── ssl
│   ├── ca.pem
│   ├── etcd-peer-key.pem
│   └── etcd-peer.pem
└── ssl-bak
    ├── ca.pem
    ├── server-key.pem
    └── server.pem

The main configuration files are etcd.conf and etcd.service.

The etcd.conf file mainly holds the startup parameters for the etcd service:
[root@hdss7-21 cfg]# cat etcd.conf 
#[Member]
ETCD_NAME="etcd-2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.6.21:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.6.21:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.6.21:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.6.21:2379,http://127.0.0.1:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.6.12:2380,etcd-2=https://192.168.6.21:2380,etcd-3=https://192.168.6.22:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

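Parameter notes (see also: etcd configuration file explained)
ETCD_NAME: the name of this member (here "etcd-2"); it must match this node's key in ETCD_INITIAL_CLUSTER
ETCD_DATA_DIR: the directory where data is stored
ETCD_LISTEN_PEER_URLS: the URLs this member listens on for traffic from other etcd members
ETCD_LISTEN_CLIENT_URLS: the URLs this member listens on for client requests

ETCD_INITIAL_ADVERTISE_PEER_URLS: the peer URLs this member advertises to the rest of the cluster
ETCD_ADVERTISE_CLIENT_URLS: the client URLs this member advertises to clients
ETCD_INITIAL_CLUSTER: the name and peer URL of every member in the cluster
ETCD_INITIAL_CLUSTER_TOKEN: the token used when creating the cluster; it should be unique per cluster
ETCD_INITIAL_CLUSTER_STATE: the initial cluster state ("new" when bootstrapping a new cluster, "existing" when joining one)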
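
Because ETCD_NAME, ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_INITIAL_CLUSTER have to agree with each other, a quick consistency check can catch copy-and-paste mistakes when the same file is reused on several nodes. The sketch below is an illustration only; check_member_entry is a hypothetical helper, and it assumes the file contains plain KEY="value" lines like the etcd.conf above, with a single advertised peer URL.

```shell
# Check that this node's name and advertised peer URL appear as a
# NAME=URL pair inside ETCD_INITIAL_CLUSTER.
check_member_entry() {
    conf=$1
    (
        # Source the file in a subshell so the variables do not leak out.
        # Use a path containing a slash so "." does not search PATH.
        . "$conf"
        case ",${ETCD_INITIAL_CLUSTER}," in
            *",${ETCD_NAME}=${ETCD_INITIAL_ADVERTISE_PEER_URLS},"*)
                echo "ok: ${ETCD_NAME} is listed with ${ETCD_INITIAL_ADVERTISE_PEER_URLS}"
                ;;
            *)
                echo "mismatch: ${ETCD_NAME}=${ETCD_INITIAL_ADVERTISE_PEER_URLS} not found in ETCD_INITIAL_CLUSTER"
                exit 1
                ;;
        esac
    )
}
```

On this node it could be run as check_member_entry /opt/etcd/cfg/etcd.conf; a non-zero exit status means the entry for this node is missing or does not match.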

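The etcd.service file

Grant execute permission and copy this file to /etc/systemd/system/ (my understanding: this registers the service with systemd).
The file contents are as follows: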

[root@hdss7-12 etcd]# cat etcd.service 
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
        --name=${ETCD_NAME} \
        --data-dir=${ETCD_DATA_DIR} \
        --listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
        --listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
        --advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
        --initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
        --initial-cluster=${ETCD_INITIAL_CLUSTER} \
        --initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
        --initial-cluster-state=new \
        --cert-file=/opt/etcd/ssl/server.pem \
        --key-file=/opt/etcd/ssl/server-key.pem \
        --peer-cert-file=/opt/etcd/ssl/server.pem \
        --peer-key-file=/opt/etcd/ssl/server-key.pem \
        --trusted-ca-file=/opt/etcd/ssl/ca.pem \
        --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

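3. Fixing the problem

The etcd package had been copied from another machine, and its certificates had not been updated. After generating new certificates, I switched the service over to them (the main change is in /etc/systemd/system/etcd.service), as shown below: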

[root@hdss7-21 system]# cat etcd.service 
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
        --name=${ETCD_NAME} \
        --data-dir=${ETCD_DATA_DIR} \
        --listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
        --listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
        --advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
        --initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
        --initial-cluster=${ETCD_INITIAL_CLUSTER} \
        --initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
        --initial-cluster-state=new \
        --cert-file=/opt/etcd/ssl/etcd-peer.pem \
        --key-file=/opt/etcd/ssl/etcd-peer-key.pem \
        --peer-cert-file=/opt/etcd/ssl/etcd-peer.pem \
        --peer-key-file=/opt/etcd/ssl/etcd-peer-key.pem \
        --trusted-ca-file=/opt/etcd/ssl/ca.pem \
        --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
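
Before restarting, it can help to confirm that every certificate and key path named in the unit file actually exists, and to look at what a certificate covers. The following is a hedged sketch: the three function names are hypothetical, the path parsing assumes one --...-file= flag per line as in the unit above, and checking the Subject Alternative Name list is only one plausible thing to inspect, since the notes above do not say exactly what was wrong with the old certificates.

```shell
# List the unique certificate/key/CA paths referenced by --*-file= flags.
unit_cert_files() {
    sed -n 's/.*-file=\([^ \\]*\).*/\1/p' "$1" | sort -u
}

# Report which referenced files are readable; returns 1 if any is missing.
# Assumes the paths contain no whitespace.
check_unit_cert_files() {
    rc=0
    for f in $(unit_cert_files "$1"); do
        if [ -r "$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
            rc=1
        fi
    done
    return $rc
}

# Print the Subject Alternative Name section of a certificate (needs openssl).
show_cert_sans() {
    openssl x509 -in "$1" -noout -text | grep -A1 'Subject Alternative Name'
}
```

For example, check_unit_cert_files /etc/systemd/system/etcd.service lists each referenced file as found or missing, and show_cert_sans /opt/etcd/ssl/etcd-peer.pem shows whether the node addresses such as 192.168.6.21 are covered by the certificate.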

4. Restart the service

systemctl daemon-reload
systemctl start etcd
The service is now OK.
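
Since the unit uses Type=notify, systemd keeps it in the activating state until etcd signals that it is ready, which is why a failing peer handshake shows up as a start that never completes. A small polling helper makes that wait explicit. This is only a sketch under assumptions: unit_is_active and wait_for_active are hypothetical names, and the one-second interval and default of 30 attempts are arbitrary.

```shell
# Thin wrapper so the check can be swapped out; "systemctl is-active --quiet"
# exits 0 only when the unit is in the active state.
unit_is_active() {
    systemctl is-active --quiet "$1"
}

# Poll once per second until the unit is active or the attempts run out.
wait_for_active() {
    unit=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if unit_is_active "$unit"; then
            echo "$unit is active"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$unit did not become active" >&2
    return 1
}
```

One way to use it is to run systemctl daemon-reload, then systemctl restart --no-block etcd, then wait_for_active etcd 60, so the shell is not blocked while the other members are being started.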

The status output is as follows:

[root@hdss7-21 system]# systemctl status etcd.service 
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; disabled; vendor preset: disabled)
   Active: active (running) since 五 2020-12-25 16:12:11 CST; 39s ago
 Main PID: 4306 (etcd)
   Memory: 11.0M
   CGroup: /system.slice/etcd.service
           └─4306 /opt/etcd/bin/etcd --name=etcd-2 --data-dir=/var/lib/etcd/default.etcd --listen-peer-urls...

12月 25 16:12:20 hdss7-21.host.com etcd[4306]: health check for peer 44cf137c1267f893 could not connec...GE")
12月 25 16:12:25 hdss7-21.host.com etcd[4306]: health check for peer 44cf137c1267f893 could not connec...OT")
12月 25 16:12:25 hdss7-21.host.com etcd[4306]: health check for peer 44cf137c1267f893 could not connec...GE")
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: peer 44cf137c1267f893 became active
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...ter)
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...ter)
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...der)
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...der)
12月 25 16:12:27 hdss7-21.host.com etcd[4306]: updated the cluster version from 3.0 to 3.3
12月 25 16:12:27 hdss7-21.host.com etcd[4306]: enabled capabilities for version 3.3
Hint: Some lines were ellipsized, use -l to show in full.

5. Check the health of the etcd cluster

[root@hdss7-21 bin]# ./etcdctl cluster-health
member 44cf137c1267f893 is healthy: got healthy result from http://127.0.0.1:2379
member 47856ed020c3771a is healthy: got healthy result from http://127.0.0.1:2379
member 4f089e69d0c31399 is healthy: got healthy result from http://127.0.0.1:2379
cluster is healthy
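
For scripted checks, the cluster-health output above can be condensed into a single line and an exit status. This is a minimal sketch with a hypothetical function name, cluster_health_summary, and it assumes the v2-style output format shown above (cluster-health is an etcdctl v2 API command; with the v3 API the corresponding command is etcdctl endpoint health, whose output looks different).

```shell
# Summarize "etcdctl cluster-health" output read from stdin.
# Exits 0 only if at least one member is listed and all are healthy.
cluster_health_summary() {
    awk '
        /^member / {
            total++
            if ($0 ~ / is healthy/) healthy++
        }
        END {
            printf "%d/%d members healthy\n", healthy, total
            rc = (total > 0 && healthy == total) ? 0 : 1
            exit rc
        }
    '
}
```

For example, ./etcdctl cluster-health | cluster_health_summary prints 3/3 members healthy for the output above.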