k8s Canal (by quqi99)
Author: Zhang Hua  Published: 2021-01-25
Copyright: feel free to repost, but please include a hyperlink to the original article together with the author information and this copyright notice.
Canal is a combination of Flannel and Calico, so its strengths lie in the intersection of the two technologies. The network layer is the simple overlay provided by Flannel, which runs in many different deployment environments without extra configuration. On the network-policy side, Calico's powerful policy evaluation supplements the base network and thus provides more security.
In a traditional tiered application architecture, something like OpenStack's Security Groups can be used to implement network access rules between tiers. In a containerised environment, however, the granularity is finer (one application per container) and both the number of host nodes and their IP addresses change quickly, so the tiered-firewall approach no longer works. Calico is a tool for exactly this situation: it implements routing and access control for container traffic by programming iptables and routes on every node, and coordinates per-node configuration through etcd. Kubernetes NetworkPolicy can use Calico as the CNI to provide isolation: only traffic matching the rules may enter a pod, and likewise only traffic matching the rules may leave it.
Canal combines Flannel and Calico: the access-control part is implemented by Calico (calico-node -felix), while the network part is still provided by Flannel (Calico offers BGP routing and IPIP tunnels, Flannel offers VXLAN tunnels).
The usage of Calico's network rules is described at https://www.open-open.com/news/view/1a7c496. With Canal integrated through CNI, k8s can use the policy part implemented by Calico, and the flow works roughly as follows (see also https://www.jianshu.com/p/331235d8bcbb):
- A NetworkPolicy resource is created through the kubectl client (a sketch of such a resource is shown after this list).
- Calico's policy controller (calico-kube-controllers, which embeds calicoctl for writing policy) watches NetworkPolicy resources and writes them into Calico's etcd database.
- On each node, calico-felix fetches the policy from etcd and programs the corresponding iptables rules.
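For the first step, a minimal NetworkPolicy sketch could look like the following (the labels app: ubuntu-debug and role: client are invented here purely for illustration and are not part of the test environment below; the policy only allows pods labelled role: client to reach the protected pods on TCP 80):
cat << EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-client-to-debug
spec:
  podSelector:
    matchLabels:
      app: ubuntu-debug        # pods this policy protects (hypothetical label)
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: client         # only pods carrying this label may connect (hypothetical label)
    ports:
    - protocol: TCP
      port: 80
EOF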
Calico policy architecture
The iptables rules that end up being generated on each node are shown in the output later in this post.
Test environment
The IP addresses of kubernetes-master/0, kubernetes-worker/0 and kubernetes-worker/1 are as follows:
kubernetes-master/0* active idle 3 10.5.2.205 6443/tcp Kubernetes master running.
canal/2* active idle 10.5.2.205 Flannel subnet 10.1.49.1/24
containerd/2* active idle 10.5.2.205 Container runtime available
kubernetes-worker/0* active idle 4 10.5.3.197 80/tcp,443/tcp Kubernetes worker running.
canal/0 active idle 10.5.3.197 Flannel subnet 10.1.94.1/24
containerd/0 active idle 10.5.3.197 Container runtime available
kubernetes-worker/1 active idle 5 10.5.4.4 80/tcp,443/tcp Kubernetes worker running.
canal/1 active idle 10.5.4.4 Flannel subnet 10.1.3.1/24
containerd/1 active idle 10.5.4.4 Container runtime available
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ubuntu-debug-794979f648-4dj8w 1/1 Running 1 118m 10.1.3.14 juju-11ff05-k8s-5 <none> <none>
ubuntu-debug-794979f648-fwdmv 1/1 Running 1 118m 10.1.94.7 juju-11ff05-k8s-4 <none> <none>
Flannel in canal provides pod-to-pod communication
The flannel tunnel network is 10.1.0.0/16, of type vxlan:
# etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem --endpoints=https://10.5.1.44:2379 get /coreos.com/network/config
{"Network": "10.1.0.0/16", "Backend": {"Type": "vxlan"}}
The subnets that flannel assigned to kubernetes-master/0, kubernetes-worker/0 and kubernetes-worker/1 are:
root@juju-11ff05-k8s-4:~# etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem --endpoints=https://10.5.1.44:2379 ls /coreos.com/network/subnets/
/coreos.com/network/subnets/10.1.94.0-24 (kubernetes-worker/0)
/coreos.com/network/subnets/10.1.3.0-24 (kubernetes-worker/1)
/coreos.com/network/subnets/10.1.49.0-24 (kubernetes-master/0)
flannel's allocation is recorded in subnet.env; kubernetes-master/0, kubernetes-worker/0 and kubernetes-worker/1 should all have this configuration, and only the example from kubernetes-worker/0 is pasted below:
root@juju-11ff05-k8s-4:~# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.94.1/24
FLANNEL_MTU=8908
FLANNEL_IPMASQ=true
root@juju-11ff05-k8s-4:~# ip addr show flannel.1 |grep global
inet 10.1.94.0/32 scope global flannel.1
The firewall should allow traffic from this subnet in:
#on kubernetes-worker/0, it should ALLOW 10.1.94.0/24
root@juju-11ff05-k8s-4:~# iptables-save |grep 10.1.94 |grep POST
-A POSTROUTING ! -s 10.1.0.0/16 -d 10.1.94.0/24 -j RETURN
# iptables -t nat -nvL |grep 'Chain POSTROUTING' -A9
Chain POSTROUTING (policy ACCEPT 1447 packets, 89820 bytes)
pkts bytes target prot opt in out source destination
1885 119K KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
8 460 fan-egress all -- * * 252.0.0.0/8 0.0.0.0/0
183 11249 RETURN all -- * * 10.1.0.0/16 10.1.0.0/16
6 360 MASQUERADE all -- * * 10.1.0.0/16 !224.0.0.0/4
0 0 RETURN all -- * * !10.1.0.0/16 10.1.94.0/24
0 0 MASQUERADE all -- * * !10.1.0.0/16 10.1.0.0/16
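With these POSTROUTING rules in place, cross-node pod-to-pod traffic should work. A quick check using the two ubuntu-debug pods listed earlier (assuming ping is available in that image):
$ kubectl exec ubuntu-debug-794979f648-fwdmv -- ping -c 3 10.1.3.14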
In addition, the underlying docker or containerd also needs to pick up the subnet configured for each node in etcd:
- If docker is used, traffic from a container first reaches docker0 and is then routed to flannel.1 via the flannel.1 routes shown in the next section.
- If containerd is used, the subnet is passed to flannel through the /etc/cni/net.d/10-canal.conflist file shown later in this post.
Calico/Felix in canal provides network access rules
Normally Calico's felix (run as calico-node -felix) runs on every node and is responsible for configuring both routes and network access rules; in canal the routing part is still done by Flannel, so mainly felix's network-access-rule function is used. Felix implements these access rules with the iptables shown below (see the Calico network model - https://www.cnblogs.com/menkeyi/p/11364977.html), although it is not clear whether canal makes any further changes.
root@juju-11ff05-k8s-4:~# route -n |grep flannel
10.1.3.0 10.1.3.0 255.255.255.0 UG 0 0 0 flannel.1
10.1.49.0 10.1.49.0 255.255.255.0 UG 0 0 0 flannel.1
The traffic is then encapsulated in the VXLAN tunnel and sent to the remote endpoint.
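The VXLAN forwarding entries that flannel programs for the remote VTEPs can be inspected as follows (just a verification sketch; the MAC addresses differ per environment):
root@juju-11ff05-k8s-4:~# ip -d link show flannel.1     #vxlan id, UDP port and local VTEP
root@juju-11ff05-k8s-4:~# bridge fdb show dev flannel.1 #remote VTEP MAC -> node IP mappings
root@juju-11ff05-k8s-4:~# ip neigh show dev flannel.1   #remote flannel.1 IP -> MAC entries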
How exactly Calico manages the network policies has not been fully investigated yet (my guess is that the allow firewall rules in the previous section are set up by this part, but that is only a guess).
root@juju-11ff05-k8s-4:~# ip addr show |grep cali -A1
6: cali8215eb6fd4a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-262d6f97-ad60-b48b-85e1-ee180b620c30
--
7: cali105686d7bac@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-09304b68-97b3-c72b-2bd6-b47d9f626890
root@juju-11ff05-k8s-4:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.5.0.1 0.0.0.0 UG 100 0 0 ens3
10.1.3.0 10.1.3.0 255.255.255.0 UG 0 0 0 flannel.1
10.1.49.0 10.1.49.0 255.255.255.0 UG 0 0 0 flannel.1
10.1.94.6 0.0.0.0 255.255.255.255 UH 0 0 0 cali105686d7bac
10.1.94.7 0.0.0.0 255.255.255.255 UH 0 0 0 cali8215eb6fd4a
10.5.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ens3
169.254.169.254 10.5.0.1 255.255.255.255 UGH 100 0 0 ens3
252.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 fan-252
root@juju-11ff05-k8s-4:~# ip netns exec cni-09304b68-97b3-c72b-2bd6-b47d9f626890 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
169.254.1.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
root@juju-11ff05-k8s-4:~# ip netns exec cni-262d6f97-ad60-b48b-85e1-ee180b620c30 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
169.254.1.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
root@juju-11ff05-k8s-4:~# ctr c ls
CONTAINER IMAGE RUNTIME
calico-node rocks.canonical.com:443/cdk/calico/node:v3.10.1 io.containerd.runc.v2
root@juju-11ff05-k8s-4:~# ip netns
cni-09304b68-97b3-c72b-2bd6-b47d9f626890 (id: 1)
cni-262d6f97-ad60-b48b-85e1-ee180b620c30 (id: 0)
root@juju-11ff05-k8s-4:~# ip netns exec cni-09304b68-97b3-c72b-2bd6-b47d9f626890 ip addr show |grep eth0
3: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 10.1.94.6/32 scope global eth0
root@juju-11ff05-k8s-4:~# ip netns exec cni-262d6f97-ad60-b48b-85e1-ee180b620c30 ip addr show |grep eth0
3: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 10.1.94.7/32 scope global eth0
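Inside each pod the default gateway is the link-local address 169.254.1.1; this works because Calico enables proxy ARP on the host-side caliXXX veth, so the host answers the pod's ARP request and all pod traffic is handed to the host routing table. This can be confirmed on the interface names seen above (the value should be 1):
root@juju-11ff05-k8s-4:~# cat /proc/sys/net/ipv4/conf/cali105686d7bac/proxy_arp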
root@juju-11ff05-k8s-4:~# cat /etc/cni/net.d/10-canal.conflist
{
  "name": "cdk-canal",
  "cniVersion": "0.3.0",
  "plugins": [
    {
      "type": "calico",
      "etcd_endpoints": "https://10.5.1.44:2379",
      "etcd_key_file": "/opt/calicoctl/etcd-key",
      "etcd_cert_file": "/opt/calicoctl/etcd-cert",
      "etcd_ca_cert_file": "/opt/calicoctl/etcd-ca",
      "log_level": "info",
      "ipam": {
        "type": "host-local",
        "subnet": "10.1.94.1/24"
      },
      "policy": {
        "type": "k8s"
      },
      "kubernetes": {
        "kubeconfig": "/root/cdk/kubeconfig"
      }
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true},
      "snat": true
    }
  ]
}
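Because the ipam type is host-local, the per-pod IP allocations are recorded as local files; assuming the plugin's default data directory, they should be visible under /var/lib/cni/networks/<network name>:
root@juju-11ff05-k8s-4:~# ls /var/lib/cni/networks/cdk-canal/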
iptables provides Service-to-Pod communication
Take the dashboard as an example:
$ kubectl get services -A |grep kubernetes-dashboard
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.152.183.92 <none> 8000/TCP 3d22h
kubernetes-dashboard kubernetes-dashboard ClusterIP 10.152.183.89 <none> 443/TCP 3d22h
$ kubectl describe --namespace kubernetes-dashboard service kubernetes-dashboard |grep -E 'IPs|Endpoints|Port'
IPs: 10.152.183.89
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: 10.1.3.16:8443
$ kubectl get pods -A -o wide |grep dashboard
kubernetes-dashboard dashboard-metrics-scraper-74757fb5b7-9rqxd 1/1 Running 1 3d22h 10.1.3.17 juju-11ff05-k8s-5 <none> <none>
kubernetes-dashboard kubernetes-dashboard-64f87676d4-s26bs 1/1 Running 1 3d22h 10.1.3.16 juju-11ff05-k8s-5 <none> <none>
# If there are multiple endpoints there will be multiple KUBE-SEP-xxx chains to implement load balancing; KUBE-MARK-MASQ is used to set the mark
root@juju-11ff05-k8s-4:~# iptables-save |grep kubernetes-dashboard
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.152.183.89/32 -p tcp -m comment --comment "kubernetes-dashboard/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ #egress
-A KUBE-SERVICES -d 10.152.183.89/32 -p tcp -m comment --comment "kubernetes-dashboard/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-SVC-CEZPIJSAUFW5MYPQ #ingress
-A KUBE-SVC-CEZPIJSAUFW5MYPQ -m comment --comment "kubernetes-dashboard/kubernetes-dashboard" -j KUBE-SEP-RMJCCMAWKQ3DCGZP
-A KUBE-SEP-RMJCCMAWKQ3DCGZP -s 10.1.3.16/32 -m comment --comment "kubernetes-dashboard/kubernetes-dashboard" -j KUBE-MARK-MASQ #egress
-A KUBE-SEP-RMJCCMAWKQ3DCGZP -p tcp -m comment --comment "kubernetes-dashboard/kubernetes-dashboard" -m tcp -j DNAT --to-destination 10.1.3.16:8443 #ingress
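These DNAT rules can be exercised directly from a node by hitting the ClusterIP (a quick sanity check; -k because the dashboard serves a self-signed certificate):
root@juju-11ff05-k8s-4:~# curl -k -o /dev/null -w '%{http_code}\n' https://10.152.183.89:443/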
Related services
systemctl status flannel
systemctl status calico-node.service
systemctl status snap.kubelet.daemon.service
Appendix - a customer issue
A customer with a k8s over vSphere environment managed by juju reported that goldpinger showed pod-to-pod networking was broken.
First we hit https://bugs.launchpad.net/juju/+bug/1831244, a username/password problem between the juju controllers and vSphere, which left the controller in a suspended state in "juju status --format yaml" (note: this is only visible with --format yaml).
After 1831244 was resolved, new nodes could be created normally. The failing keycloak pod was migrated to a new node, but the problem remained; 'kubectl logs' showed keycloak failing with: 'Caused by: java.net.UnknownHostException: postgres'
Obviously keycloak depends on the postgres service, so this is a DNS problem.
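A quick way to confirm such a DNS problem is to resolve the service name from a throwaway pod (a debugging sketch, not what was actually run at the customer site):
kubectl run -it --rm dns-test --image=busybox --restart=Never -- nslookup postgres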
coredns had not been moved from the old node to the new one, and the old node had a problem: the subnet used by canal's flannel (/run/flannel/subnet.env) did not match the subnet used by canal's calico (/etc/cni/net.d/10-canal.conflist). As a result the second-to-last line of the iptables rules below was wrong on the old node and did not allow flannel's packets through.
grep -A9 "Chain POSTROUTING" sos_commands/networking/iptables_-t_nat_-nvL
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
1411K 88M cali-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:O3lYWMrLQYEMJtB5 */
1378K 86M CNI-HOSTPORT-MASQ all -- * * 0.0.0.0/0 0.0.0.0/0 /* CNI portfwd requiring masquerade */
1376K 86M KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
800K 50M RETURN all -- * * 172.17.240.0/20 172.17.240.0/20
0 0 MASQUERADE all -- * * 172.17.240.0/20 !224.0.0.0/4
0 0 RETURN all -- * * !172.17.240.0/20 172.17.247.0/24
521K 31M MASQUERADE all -- * * !172.17.240.0/20 172.17.240.0/20
The following command queries each node's subnet configuration in etcd:
juju run -u kubernetes-worker/<N> "etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem --endpoints=https://10.1.234.195:2379,https://10.1.234.196:2379,https://10.1.234.203:2379 ls /coreos.com/network/subnets/"
The wrong entry first needs to be deleted from etcd - rm /coreos.com/network/subnets/172.17.247.0-24
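Spelled out with the same flags as the query above, the delete looks roughly like this (run on, or via juju run against, the affected worker):
etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem --endpoints=https://10.1.234.195:2379 rm /coreos.com/network/subnets/172.17.247.0-24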
Then restart flannel with systemctl restart flannel.service, update the subnet in /run/flannel/subnet.env to the correct one, and confirm with "ip a s flannel.1". Finally:
systemctl restart calico-node.service
systemctl restart snap.kubelet.daemon.service
Testing with calicoctl
juju ssh kubernetes-worker/0 -- sudo -s
#https://docs.projectcalico.org/getting-started/clis/calicoctl/install
curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.17.1/calicoctl
cp calicoctl /usr/bin/ && chmod +x /usr/bin/calicoctl
#copy env variable like ETCD_ENDPOINTS from the output of 'ps -ef |grep calicoctl'
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl get networkPolicy
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl get globalNetworkPolicy
k8s networkpolicy - https://docs.projectcalico.org/security/kubernetes-network-policy
calico networkpolicy - https://docs.projectcalico.org/security/calico-network-policy
cat << EOF | sudo tee calicoPolicyTest.yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-circle-blue
spec:
  selector: k8s-app == 'kubernetes-dashboard'
  ingress:
  - action: Deny
    protocol: TCP
    destination:
      ports:
      - 22
EOF
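Apply the policy so that the get below can show it (and so that felix programs the corresponding iptables rules):
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl apply -f calicoPolicyTest.yaml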
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl get globalNetworkPolicy
#go to the worker which the dashboard pod is on (kubectl get pod -A -o wide |grep dash)
iptables-save |grep 2222