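This post walks through a case where a keepalived VIP is reachable at some times and unreachable at others.

Background:

The cluster has two VIPs: one for the apiserver and one for ingress.

Symptom:

Right after deployment, both the apiserver VIP and the ingress VIP worked, i.e. both answered ping. A few days later, one of them (10.130.14.154) had stopped responding.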

[root@k8s-node-1 appuser]# ping 10.130.14.154
PING 10.130.14.154 (10.130.14.154) 56(84) bytes of data.
From 10.130.14.154 icmp_seq=1 Destination Host Unreachable
From 10.130.14.154 icmp_seq=2 Destination Host Unreachable
^C
--- 10.130.14.154 ping statistics ---
3 packets transmitted, 0 received, +2 errors, 100% packet loss, time 2000ms
pipe 2

[root@k8s-node-1 appuser]# ping 10.130.14.155
PING 10.130.14.155 (10.130.14.155) 56(84) bytes of data.
64 bytes from 10.130.14.155: icmp_seq=1 ttl=64 time=0.343 ms
64 bytes from 10.130.14.155: icmp_seq=2 ttl=64 time=0.199 ms
64 bytes from 10.130.14.155: icmp_seq=3 ttl=64 time=0.284 ms
^C
--- 10.130.14.155 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.199/0.275/0.343/0.060 ms

Running docker logs keepalived shows the following errors:

ip address associated with VRID not present in received packet : 10.130.14.154
one or more VIP associated with VRID mismatch actual MASTER advert
bogus VRRP packet received on eth0 !!!
VRRP_Instance(vip) ignoring received advertisment...

The root cause is a VRID conflict: two different keepalived instances on the same network are using the same VRID.
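
A conflict like this can also be checked on the wire, since each VRRP master multicasts advertisements that carry its VRID. The sketch below is a minimal example and not part of the original setup: `vrid_senders` is a hypothetical helper name, and it assumes tcpdump's usual one-line VRRP output (`IP <src> > 224.0.0.18: VRRPv2, Advertisement, vrid <n>, ...`). It lists every VRID that is advertised by more than one source address.

```shell
#!/bin/sh
# Hypothetical helper: reads `tcpdump -nn vrrp` style lines on stdin and
# prints each VRID that was advertised by more than one source address.
# Assumed line format (not taken from the original post):
#   <time> IP <src> > 224.0.0.18: VRRPv2, Advertisement, vrid <n>, prio ...
vrid_senders() {
    awk '
        {
            src = ""; vrid = ""
            for (i = 1; i < NF; i++) {
                if ($i == "IP")   src = $(i + 1)
                if ($i == "vrid") { vrid = $(i + 1); sub(/,$/, "", vrid) }
            }
            # Record each (vrid, source) pair only once.
            if (src != "" && vrid != "" && !((vrid, src) in seen)) {
                seen[vrid, src] = 1
                count[vrid]++
                list[vrid] = list[vrid] " " src
            }
        }
        END {
            for (v in count)
                if (count[v] > 1) print "vrid " v ":" list[v]
        }
    '
}
```

Fed with a short capture, for example `tcpdump -l -nn -i eth0 -c 20 vrrp | vrid_senders`, a line such as `vrid 52: <addr1> <addr2>` would suggest two masters sharing VRID 52. Normally only the current master of a group advertises, so two senders for one VRID point to either a conflict or a split brain.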

Check the keepalived for the apiserver: it uses vrid=52.

[root@k8s-master-1 appuser]# cat /usr/bin/docker-run-keepalived
#!/bin/bash

cd `dirname $0`

docker rm -f keepalived | true > /dev/null 2>&1

# delete vip dev to avoid old vip existed
ip a | grep 10.130.14.155 > /dev/null 2>&1
[[ $? == 0 ]] && sudo ip a del 10.130.14.155 dev eth0 > /dev/null 2>&1

docker run --name=keepalived \
    --restart=always \
    --net=host \
    --privileged=true \
    --volume=/etc/keepalived/:/ka-data/scripts/ \
    -d solnet-cloud/docker-keepalived:1.2.7 \
    --master \
    --override-check check-haproxy-status.sh 2>&1 \
    --enable-check \
    --auth-pass pass \
    --vrid 52 eth0 101 10.130.14.155/24/eth0 > /dev/null 2>&1

Check the keepalived for ingress: it also uses vrid=52.

[root@k8s-node-1 appuser]# cat /usr/bin/docker-run-keepalived
#!/bin/bash

cd `dirname $0`

docker rm -f keepalived | true > /dev/null 2>&1

# delete vip dev to avoid old vip existed
ip a | grep 10.130.14.154 > /dev/null 2>&1
[[ $? == 0 ]] && sudo ip a del 10.130.14.154 dev eth0 > /dev/null 2>&1

docker run --name=keepalived \
    --restart=always \
    --net=host \
    --privileged=true \
    --volume=/etc/keepalived/:/ka-data/scripts/ \
    -d solnet-cloud/docker-keepalived:1.2.7 \
    --master \
    --override-check check-haproxy-status.sh 2>&1 \
    --enable-check \
    --auth-pass pass \
    --vrid 52 eth0 101 10.130.14.154/24/eth0 > /dev/null 2>&1
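
Both launcher scripts pass the same `--vrid` value. When there are more than two nodes to compare, reading the scripts by eye gets tedious, so a small extractor can help. This is only a sketch: `script_vrid` is a hypothetical helper name, and it assumes the VRID appears as a number right after a `--vrid` flag, as in the scripts above.

```shell
#!/bin/sh
# Hypothetical helper: print the number that follows the first --vrid flag
# found in the file given as $1 (prints nothing if there is no such flag).
script_vrid() {
    grep -oE -e '--vrid[ =]+[0-9]+' "$1" | head -n 1 | grep -oE '[0-9]+$'
}
```

Running `script_vrid /usr/bin/docker-run-keepalived` on k8s-master-1 and on k8s-node-1 should print 52 on both with the scripts shown above, which makes the clash visible.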

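Solution:

Change virtual_router_id (the VRID) in the keepalived configuration. Here keepalived runs in a docker container that exposes this setting as the --vrid argument, so it is enough to give the two instances different --vrid values.
The following conditions must hold:
The master and backup nodes of the same keepalived group must use the same VRID, in the range 1~255, and no other keepalived group on the same internal network may use that VRID.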
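
The two conditions above (a valid range, and one VRID per group on the network) can be checked before deploying. The following is a minimal sketch under stated assumptions: `check_vrids` is a hypothetical helper, the `name=vrid` argument format is made up for this example, and 53 is just an arbitrary replacement value.

```shell
#!/bin/sh
# Hypothetical helper: takes name=vrid pairs, one per keepalived group on
# the same network, and reports VRIDs that are out of range or reused.
# Returns 0 when every VRID is valid and unique, 1 otherwise.
# Note: plain POSIX sh, so the working variables below are global.
check_vrids() {
    seen=" "
    status=0
    for entry in "$@"; do
        name=${entry%%=*}
        vrid=${entry#*=}
        case $vrid in
            ''|*[!0-9]*)
                echo "$name: VRID '$vrid' is not a number"
                status=1
                continue
                ;;
        esac
        if [ "$vrid" -lt 1 ] || [ "$vrid" -gt 255 ]; then
            echo "$name: VRID $vrid is outside 1-255"
            status=1
        else
            case $seen in
                *" $vrid "*)
                    echo "$name: VRID $vrid is already used by another group"
                    status=1
                    ;;
            esac
        fi
        seen="$seen$vrid "
    done
    return "$status"
}
```

For the setup in this post, `check_vrids apiserver=52 ingress=52` would flag the ingress group, while `check_vrids apiserver=52 ingress=53` passes silently.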

After redeploying with different VRIDs, a new keepalived process starts. Its log looks like this:

[root@k8s-node-1 appuser]# docker logs keepalived
sudo: unable to resolve host k8s-node-1.novalocal
Starting Healthcheck child process, pid=18
Starting VRRP child process, pid=19
Initializing ipvs 2.6
Interface queue is empty
No such interface, ovs-system
No such interface, tun0
No such interface, vxlan_sys_4789
No such interface, br0
No such interface, docker0
No such interface, br-c9e9615e72aa
No such interface, veth424549a
No such interface, vethaf9ec35
No such interface, cali1a64f1a3e1c
No such interface, cali21b17b4a48a
No such interface, cali4a14bd8b4e5
No such interface, cali7250247dc2f
No such interface, cali78bf6b62595
No such interface, cali3e9a255e12a
No such interface, calic0dac34c935
No such interface, cali15a5dd04656
No such interface, calib91e1e2d01a
No such interface, cali2b2c5996064
No such interface, cali1ffb56f5c0c
No such interface, califfd10184d40
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Initializing ipvs 2.6
IPVS: Can't initialize ipvs: Protocol not available
IPVS: Can't initialize ipvs: Protocol not available
Interface queue is empty
No such interface, ovs-system
No such interface, tun0
No such interface, vxlan_sys_4789
No such interface, br0
No such interface, docker0
No such interface, br-c9e9615e72aa
No such interface, veth424549a
No such interface, vethaf9ec35
No such interface, cali1a64f1a3e1c
No such interface, cali21b17b4a48a
No such interface, cali4a14bd8b4e5
No such interface, cali7250247dc2f
No such interface, cali78bf6b62595
No such interface, cali3e9a255e12a
No such interface, calic0dac34c935
No such interface, cali15a5dd04656
No such interface, calib91e1e2d01a
No such interface, cali2b2c5996064
No such interface, cali1ffb56f5c0c
No such interface, califfd10184d40
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Opening file '/etc/keepalived/keepalived.conf'.
Configuration is using : 68483 Bytes
Using LinkWatch kernel netlink reflector...
Opening file '/etc/keepalived/keepalived.conf'.
Configuration is using : 8138 Bytes
Using LinkWatch kernel netlink reflector...
VRRP_Script(check_script) succeeded
VRRP_Instance(vip) Transition to MASTER STATE
VRRP_Instance(vip) Entering MASTER STATE

Both VIPs are now reachable.

[root@k8s-node-1 appuser]# ping 10.130.14.154
PING 10.130.14.154 (10.130.14.154) 56(84) bytes of data.
64 bytes from 10.130.14.154: icmp_seq=1 ttl=64 time=0.079 ms
64 bytes from 10.130.14.154: icmp_seq=2 ttl=64 time=0.042 ms
^C
--- 10.130.14.154 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.042/0.060/0.079/0.020 ms
[root@k8s-node-1 appuser]# ping 10.130.14.155
PING 10.130.14.155 (10.130.14.155) 56(84) bytes of data.
64 bytes from 10.130.14.155: icmp_seq=1 ttl=64 time=0.343 ms
64 bytes from 10.130.14.155: icmp_seq=2 ttl=64 time=0.199 ms
64 bytes from 10.130.14.155: icmp_seq=3 ttl=64 time=0.284 ms
^C
--- 10.130.14.155 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.199/0.275/0.343/0.060 ms