Ubuntu 20.04 + kubeadm 1.28.0 + flannel (VXLAN)
This article describes how to set up a Kubernetes lab environment with kubeadm and the flannel CNI, with detailed steps and basic day-to-day operations.
Environment

| Item | Detail | Notes |
| --- | --- | --- |
| Host | PowerEdge-R440 | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, 4x8 cores, 128G RAM, 2T disk |
| VMs (VirtualBox, 3) | OS: Ubuntu 20.04 | 4 cores, 16G RAM, 64G disk |
| Network | bridge bond0.11 | 10.250.11.186-188 |
| Container runtime | containerd | 1.7.2 |
| CNI plugins | 1.3.0 | |
| IPv6 | Disabled | See the deployment steps. |
Preparation
Note: some of the commands below use wget to download installation files. If access to github.com or other external sites is unreliable, download the files in advance.
Set hostnames
Make sure every node has a unique hostname.
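A sketch for choosing the names (the naming scheme here is hypothetical; any scheme works as long as each node's name is unique):

```shell
# Derive a unique hostname from the VM's IP suffix; substitute each VM's own address.
ip=10.250.11.186
echo "k8s-node-${ip##*.}"   # -> k8s-node-186
# Then apply it on that VM:
#   hostnamectl set-hostname "k8s-node-${ip##*.}"
```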
Install base dependencies
$ apt-get update
$ apt-get install -y apt-transport-https ca-certificates curl
Install containerd & the CNI plugins
Download containerd
$ wget https://github.com/containerd/containerd/releases/download/v1.7.2/containerd-1.7.2-linux-amd64.tar.gz
$ tar Cxzvf /usr/local containerd-1.7.2-linux-amd64.tar.gz
Configure the containerd service
Kubernetes deprecated the dockershim Docker integration in 1.20 and removed it in 1.24; to keep using Docker as the runtime you need the external cri-dockerd adapter.
The better option is containerd. Its day-to-day operation resembles the docker CLI in many ways; see the ctr commands in the deployment sections below.
$ wget https://raw.githubusercontent.com/containerd/containerd/main/containerd.service -O /lib/systemd/system/containerd.service
$ systemctl daemon-reload
$ systemctl enable --now containerd
Install runc
$ wget https://github.com/opencontainers/runc/releases/download/v1.1.7/runc.amd64
$ install -m 755 runc.amd64 /usr/local/sbin/runc
Install the CNI plugins
$ wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz
$ mkdir -p /opt/cni/bin
$ tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz
Start containerd
$ rm -rf /etc/containerd/config.toml
$ systemctl restart containerd
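Removing config.toml makes containerd fall back to its built-in defaults (the packaged file ships with the CRI plugin disabled, and kubelet needs CRI enabled). If you instead generate an explicit config with `containerd config default > /etc/containerd/config.toml`, a commonly needed tweak (a sketch for containerd 1.7; verify the key path against your generated file) is enabling the systemd cgroup driver to match kubelet:

```toml
# /etc/containerd/config.toml fragment; restart containerd afterwards.
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```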
Configure kernel modules
$ cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
$ modprobe overlay
$ modprobe br_netfilter
# sysctl params required by setup, params persist across reboots
$ cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl params without reboot
$ sysctl --system
$ lsmod | grep br_netfilter
$ lsmod | grep overlay
$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
Disable IPv6
Temporarily:
$ sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
$ sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
$ sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
Permanently:
Edit /etc/sysctl.conf and add:
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
IPv6 is disabled for now so that the common IPv4-only scenario can be verified first; dual-stack configuration is left for a later study.
Disable swap
$ swapoff -a # temporary
$ vim /etc/fstab # permanent
Delete or comment out the swap mount entry in /etc/fstab:
#/dev/mapper/centos-swap swap swap defaults 0 0
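The /etc/fstab edit can also be scripted. The sed below is a sketch shown against a sample line; on a real node run it with -i.bak against /etc/fstab to keep a backup:

```shell
# Comment out any fstab line whose filesystem-type field is "swap".
printf '/dev/mapper/centos-swap swap swap defaults 0 0\n' \
  | sed '/[[:space:]]swap[[:space:]]/ s/^/#/'
# On the node itself:
#   sed -i.bak '/[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab && swapoff -a
```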
Reboot
Reboot the machine so that all of the above takes effect.
Deploy the Kubernetes environment
Install kubelet, kubeadm, and kubectl
$ echo deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main > /etc/apt/sources.list.d/kubernetes.list
$ curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | \
apt-key add -
$ apt-get update
$ apt-get install -y kubelet kubeadm kubectl
$ apt-mark hold kubelet kubeadm kubectl
Generate the init config file
Run:
$ kubeadm config print init-defaults > kubeadm-init-defaults.yaml
The generated file is shown below; lines marked with ## have been modified:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.250.11.186 ##
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: ubuntu-VirtualBox-master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers ##
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16 ##
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Compared with the defaults, the file has three modifications:
- localAPIEndpoint: set advertiseAddress to the master's actual address.
- imageRepository: switch to a domestic mirror, otherwise the image pulls fail.
- networking: add podSubnet to define the pod address space.
Pull the Kubernetes images
Run:
$ kubeadm config images list --config kubeadm-init-defaults.yaml # list the required images
$ kubeadm config images pull --config kubeadm-init-defaults.yaml # pull the images
The pulled images can be inspected with:
$ ctr -n k8s.io images list
Note the -n k8s.io here: the images are pulled into the k8s.io namespace, and the containers started later also live in that namespace:
$ ctr -n k8s.io containers list
Pull the pause:3.8 image separately
During bootstrap, the sandbox ("pause") container image is chosen by containerd's CRI configuration, whose default in containerd 1.7 is registry.k8s.io/pause:3.8. The cluster therefore still tries to use pause:3.8, even though the image list above shows pause:3.9 and pause:3.9 already exists locally.
The workaround is to pull pause:3.8 separately. registry.k8s.io is not reachable from within China, so pull registry.aliyuncs.com/google_containers/pause:3.8 and tag it as registry.k8s.io/pause:3.8:
$ ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/pause:3.8
$ ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.8 registry.k8s.io/pause:3.8
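An alternative to re-tagging (a sketch; since this guide deleted config.toml you would first regenerate one with `containerd config default`, and the key path should be verified against your containerd version) is to point containerd's sandbox image at a reachable mirror and restart containerd:

```toml
# /etc/containerd/config.toml fragment
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.8"
```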
Run init to create the master node
Run:
$ kubeadm init --config kubeadm-init-defaults.yaml
This step takes a while; wait until it prints:
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.250.11.186:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:36d0e4f3266fa67f6318d35ea7e04e6d17726ea826e6b54d751da1e5b53ef042
While waiting, the deployment status of the key components can be watched with the following command, to catch failures early:
$ journalctl -f -xeu kubelet --no-pager
Many deployment errors can be diagnosed from this log.
kubelet itself runs as a systemd service (the control-plane components run as static pods managed by it), so its state can be checked with systemctl status, for example:
$ systemctl status kubelet
If init fails, identify the problem and then reset the failed cluster with:
$ kubeadm reset
and run init again:
$ kubeadm init --config kubeadm-init-defaults.yaml
Deploy the worker nodes
Preparation on each worker node is the same as for the master node above.
Once prepared, run the join command printed by kubeadm, on the worker node:
$ kubeadm join 10.250.11.186:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:e5a3a61b520a4b36de5a3b4d1d2bd5d912f7dc4410ce44524a9311ab892ebfc9
This joins the worker node to the cluster.
If you did not save the join command from the kubeadm output, you can generate a join config file instead:
$ kubeadm config print join-defaults > kubeadm-join-defaults.yaml
The content of kubeadm-join-defaults.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
caCertPath: /etc/kubernetes/pki/ca.crt
discovery:
  bootstrapToken:
    apiServerEndpoint: 10.250.11.186:6443 ##
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: abcdef.0123456789abcdef
kind: JoinConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: ubuntu-virtualbox-master ##
  taints: null
Lines marked with ## have been modified.
Copy this file to the worker node and run join there:
$ kubeadm join --config kubeadm-join-defaults.yaml
Besides generating kubeadm-join-defaults.yaml, you can also look up the token and compute the CA cert hash, then reuse the --token / --discovery-token-ca-cert-hash form.
List the tokens:
$ kubeadm token list
Compute the CA cert hash:
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
af9e070ea723dd2281c2ae2414c932832a012d40bc55dc9c747bb00e68602388
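This pipeline is just a SHA-256 digest of the CA certificate's DER-encoded public key (SubjectPublicKeyInfo). A runnable sketch using a throwaway self-signed certificate; on the control plane, point it at /etc/kubernetes/pki/ca.crt instead:

```shell
# Generate a throwaway "CA" cert, then compute the discovery hash from it.
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
  -out /tmp/demo-ca.crt -subj "/CN=kubernetes" -days 1 2>/dev/null
openssl x509 -pubkey -in /tmp/demo-ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'
```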
Deploy flannel
Run:
$ kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
You can also download kube-flannel.yml locally first and then run kubectl apply on the local copy, which makes it easier to adjust some of its parameters.
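The main thing to check in kube-flannel.yml is the net-conf.json entry of its ConfigMap: the Network value must match the podSubnet passed to kubeadm init, and Type selects the backend (VXLAN here). The relevant fragment:

```yaml
# kube-flannel.yml fragment (flannel ConfigMap data)
net-conf.json: |
  {
    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
```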
While it deploys, watch the status of the flannel and coredns pods:
$ kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-8m44t 1/1 Running 0 16h
kube-flannel kube-flannel-ds-ndzfk 1/1 Running 1 (16h ago) 17h
kube-flannel kube-flannel-ds-tt5mm 1/1 Running 0 16h
kube-system coredns-66f779496c-64vb2 1/1 Running 1 (16h ago) 17h
kube-system coredns-66f779496c-n8p4t 1/1 Running 1 (16h ago) 17h
kube-system etcd-ubuntu-virtualbox-master 1/1 Running 8 (16h ago) 17h
kube-system kube-apiserver-ubuntu-virtualbox-master 1/1 Running 8 (16h ago) 17h
kube-system kube-controller-manager-ubuntu-virtualbox-master 1/1 Running 1 (16h ago) 17h
kube-system kube-proxy-4gcww 1/1 Running 1 (16h ago) 17h
kube-system kube-proxy-9jpkm 1/1 Running 0 16h
kube-system kube-proxy-9rlwq 1/1 Running 0 16h
kube-system kube-scheduler-ubuntu-virtualbox-master 1/1 Running 8 (16h ago) 17h
kubectl logs can also be used to check the state of the flannel process inside each pod.
Test the deployment
$ kubectl create deployment nwtest --image busybox --replicas 3 -- sleep infinity
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nwtest-86c8c67bfb-2twmj 1/1 Running 0 16h 10.244.2.2 ubuntu-virtualbox-node2 <none> <none>
nwtest-86c8c67bfb-g4jft 1/1 Running 0 16h 10.244.1.2 ubuntu-virtualbox-node1 <none> <none>
nwtest-86c8c67bfb-lm8kx 1/1 Running 0 16h 10.244.2.3 ubuntu-virtualbox-node2 <none> <none>
$ kubectl exec -it nwtest-86c8c67bfb-2twmj -- /bin/sh
/ #
/ # ping 10.244.2.2
PING 10.244.2.2 (10.244.2.2): 56 data bytes
64 bytes from 10.244.2.2: seq=0 ttl=64 time=0.059 ms
^C
--- 10.244.2.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.059/0.059/0.059 ms
/ # ping 10.244.1.2
PING 10.244.1.2 (10.244.1.2): 56 data bytes
64 bytes from 10.244.1.2: seq=0 ttl=62 time=0.733 ms
^C
--- 10.244.1.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.733/0.733/0.733 ms
/ # ping 10.244.2.3
PING 10.244.2.3 (10.244.2.3): 56 data bytes
64 bytes from 10.244.2.3: seq=0 ttl=64 time=0.154 ms
^C
--- 10.244.2.3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.154/0.154/0.154 ms
/ #
/ # exit