Kubernetes High-Availability Cluster
Deployment plan
First prepare a few servers. The plan is three master nodes, made highly available with keepalived and HAProxy; to save machines, keepalived and HAProxy run on the masters themselves. The server plan is as follows:
Cluster version: 1.18.15-0

IP | Hostname
---|---
172.16.0.10 | k8svip (the VIP, not a physical server)
172.16.0.100 | k8s-master
172.16.0.200 | k8s-master02
172.16.0.201 | k8s-master03
Architecture diagram (image not preserved)
I. Environment preparation
1. Set the hostname (one command per node)
hostnamectl set-hostname k8s-master
hostnamectl set-hostname k8s-master02
hostnamectl set-hostname k8s-master03
2. Edit the hosts file
vi /etc/hosts
## add the following entries
172.16.0.10 k8svip
172.16.0.100 k8s-master
172.16.0.200 k8s-master02
172.16.0.201 k8s-master03
3. Disable the firewall and related settings
## Kernel parameters: pass bridged IPv4 traffic to the iptables chains
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
## Apply the sysctl configuration
sysctl --system
## Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
## Set SELinux to permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
## Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
## Enable IP forwarding (transient; to persist across reboots, add net.ipv4.ip_forward = 1 to /etc/sysctl.d/k8s.conf)
echo '1' > /proc/sys/net/ipv4/ip_forward
# Time synchronization (run on all three nodes)
yum install ntpdate -y && timedatectl set-timezone Asia/Shanghai && ntpdate time.windows.com
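After running the commands above, it is worth confirming on every node that the settings actually took effect before moving on. A small sketch — `preflight` is a helper of my own, not part of kubeadm's built-in preflight checks:

```shell
#!/bin/sh
# Verify the kernel and swap settings applied in the steps above.
preflight() {
    rc=0
    for key in net.ipv4.ip_forward \
               net.bridge.bridge-nf-call-iptables \
               net.bridge.bridge-nf-call-ip6tables; do
        [ "$(sysctl -n "$key" 2>/dev/null)" = "1" ] || { echo "NOT SET: $key"; rc=1; }
    done
    # swapon prints nothing when no swap devices are active
    [ -z "$(swapon --summary 2>/dev/null)" ] || { echo "swap is still enabled"; rc=1; }
    return $rc
}
# usage on each node: preflight && echo "node ready"
```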
II. Install keepalived and HAProxy on all master nodes
1. Install keepalived and haproxy
yum install -y keepalived haproxy
2. Prepare the keepalived configuration file
cat <<'EOF' > /etc/keepalived/keepalived.conf   # quote EOF so the ${...} placeholders are written literally
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state ${STATE}
    interface ${INTERFACE}
    virtual_router_id ${ROUTER_ID}
    priority ${PRIORITY}
    authentication {
        auth_type PASS
        auth_pass ${AUTH_PASS}
    }
    virtual_ipaddress {
        ${APISERVER_VIP}
    }
    track_script {
        check_apiserver
    }
}
EOF
Replace the placeholders in the file above with your own values:
- /etc/keepalived/check_apiserver.sh — the path of the health-check script (created below);
- ${STATE} — MASTER on the primary node, BACKUP on the others. Here k8s-master is MASTER; k8s-master02 and k8s-master03 are BACKUP;
- ${INTERFACE} — the name of the server's network interface; eth0 on all of my servers;
- ${ROUTER_ID} — any value, as long as it is identical across the keepalived cluster; I keep the default, 51;
- ${PRIORITY} — must be higher on the primary than on the backups; 100 on the master, 50 on the backups;
- ${AUTH_PASS} — any value, as long as it is identical across the keepalived cluster;
- ${APISERVER_VIP} — the VIP address; 172.16.0.10 here.
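Rather than editing the file by hand on each node, the placeholders can be filled in mechanically. A minimal sketch using sed — the `render` helper, the template filename, and the AUTH_PASS value are my own examples; any templating approach works:

```shell
#!/bin/sh
# render <STATE> <PRIORITY> <outfile> [template]
# Fill the ${...} placeholders with this guide's values.
render() {
    sed -e "s/\${STATE}/$1/" \
        -e "s/\${INTERFACE}/eth0/" \
        -e "s/\${ROUTER_ID}/51/" \
        -e "s/\${PRIORITY}/$2/" \
        -e "s/\${AUTH_PASS}/k8s-vip-pass/" \
        -e "s/\${APISERVER_VIP}/172.16.0.10/" \
        "${4:-/etc/keepalived/keepalived.conf.template}" > "$3"
}
# on k8s-master:       render MASTER 100 /etc/keepalived/keepalived.conf
# on k8s-master02/03:  render BACKUP 50  /etc/keepalived/keepalived.conf
```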
3. Configure the keepalived health check
The configuration above already wires in the health check, including the check interval, weight, and so on. Create the script:
vi /etc/keepalived/check_apiserver.sh
### script contents
#!/bin/sh
errorExit() {
    echo "*** $*" 1>&2
    exit 1
}
curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi
- ${APISERVER_VIP} — the VIP address, 172.16.0.10;
- ${APISERVER_DEST_PORT} — the load-balancer port for API server traffic, i.e. the port HAProxy binds on its front end. Because HAProxy runs on the same hosts as the API server (which already occupies 6443), a different port is needed; I use 8443, as described below.
Also make the script executable: chmod +x /etc/keepalived/check_apiserver.sh
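To sanity-check the logic before handing the script to keepalived, it can be restructured as a function and dry-run. This is my own testable rewrite of the script above, with this guide's values (VIP 172.16.0.10, port 8443) filled in — not a replacement for it:

```shell
#!/bin/sh
# check_apiserver from above, with placeholders substituted and wrapped in a
# function so it can be exercised without keepalived invoking it.
APISERVER_VIP=172.16.0.10
APISERVER_DEST_PORT=8443

check_apiserver() {
    # 1) the apiserver (via the local HAProxy front end) must answer
    curl --silent --max-time 2 --insecure \
        "https://localhost:${APISERVER_DEST_PORT}/" -o /dev/null || {
        echo "*** Error GET https://localhost:${APISERVER_DEST_PORT}/" 1>&2
        return 1
    }
    # 2) if this host currently holds the VIP, the VIP must answer too
    if ip addr | grep -q "${APISERVER_VIP}"; then
        curl --silent --max-time 2 --insecure \
            "https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/" -o /dev/null || {
            echo "*** Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/" 1>&2
            return 1
        }
    fi
}
# in the real script, finish with: check_apiserver || exit 1
```

One caveat worth knowing: a plain `grep -q 172.16.0.10` also matches the substring in 172.16.0.100 (k8s-master's own address), so the VIP leg of the check can run even when the VIP is not held; anchoring the pattern, e.g. `grep -q "inet ${APISERVER_VIP}/"`, avoids this.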
4. Configure HAProxy
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option                  forwardfor except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s
#---------------------------------------------------------------------
# apiserver frontend which proxies to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:${APISERVER_DEST_PORT}
    mode tcp
    option tcplog
    default_backend apiserver
#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance roundrobin
    server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
Only the following needs to be changed in the configuration above:
- ${APISERVER_DEST_PORT} — the front-end load-balancer port, the same value used in the health-check script; 8443 here;
- server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check — one line per backend node, as name IP:Port; for example, my configuration is:
### server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
server k8s-master 172.16.0.100:6443 check
server k8s-master02 172.16.0.200:6443 check
server k8s-master03 172.16.0.201:6443 check
5. Start the services and enable them at boot
systemctl enable haproxy --now
systemctl enable keepalived --now
The backend API servers are not running yet, so the health checks cannot reach them; ignore messages like the following:
[root@k8s-master ~]#
Broadcast message from systemd-journald@k8s-master (Wed 2021-04-21 14:39:55 CST):
haproxy[4758]: backend apiserver has no server available!
Message from syslogd@k8s-master at Apr 21 14:39:55 ...
haproxy[4758]:backend apiserver has no server available!
III. Install Docker
# 1. Install required system tools
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# 2. Add the Docker yum repository
sudo yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 3. Update the cache and install Docker CE
sudo yum makecache fast
sudo yum -y install docker-ce
# 4. Start and enable the Docker service
sudo systemctl start docker && systemctl enable docker
# Note:
# The repo file enables only the latest stable packages by default. Other
# channels (e.g. the test channel) can be enabled by editing the repo file:
# vim /etc/yum.repos.d/docker-ce.repo
# and changing enabled=0 to enabled=1 under [docker-ce-test]. The same applies
# to the other test/edge channels.
#
# To install a specific Docker CE version:
# Step 1: list the available versions:
# yum list docker-ce.x86_64 --showduplicates | sort -r
#   Loading mirror speeds from cached hostfile
#   Loaded plugins: branch, fastestmirror, langpacks
#   docker-ce.x86_64  17.03.1.ce-1.el7.centos  docker-ce-stable
#   docker-ce.x86_64  17.03.1.ce-1.el7.centos  @docker-ce-stable
#   docker-ce.x86_64  17.03.0.ce-1.el7.centos  docker-ce-stable
#   Available Packages
# Step 2: install the chosen version (VERSION as listed above, e.g. 17.03.0.ce-1.el7.centos):
# sudo yum -y install docker-ce-[VERSION]
# Registry mirror: replace "https://s2q9fn53.mirror.aliyuncs.com" with your own
# accelerator address from the Aliyun Container Registry console.
# The mirror is configured in /etc/docker/daemon.json:
sudo mkdir -p /etc/docker
## use tee so the write runs with root privileges (a plain `sudo cat >>` redirects as the invoking user and would also append to an existing file)
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": ["https://s2q9fn53.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload && sudo systemctl restart docker
IV. Install kubelet, kubeadm and kubectl
- kubeadm: the command that bootstraps the cluster.
- kubelet: runs on every node in the cluster and starts pods and containers.
- kubectl: the command-line tool for talking to the cluster.
kubeadm does not install or manage kubelet or kubectl for you, so you need to make sure their versions match the control plane that kubeadm installs; otherwise you risk version skew and hard-to-diagnose bugs. A one-minor-version difference between the kubelet and the control plane is supported, but the kubelet version may never exceed the API server's. For example, a 1.7.0 kubelet is fully compatible with a 1.8.0 API server, but not the other way around.
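The skew rule can be checked mechanically. A sketch — `skew_ok` is a hypothetical helper of my own, not a kubeadm command, and it assumes both versions share the same major version:

```shell
#!/bin/sh
# skew_ok <kubelet-version> <apiserver-version>
# Encodes the rule above: the kubelet may be at most one minor version
# behind the API server, and never ahead of it.
skew_ok() {
    kl_minor=$(echo "$1" | cut -d. -f2)
    api_minor=$(echo "$2" | cut -d. -f2)
    [ "$kl_minor" -le "$api_minor" ] && [ $((api_minor - kl_minor)) -le 1 ]
}
# skew_ok 1.17.3 1.18.15  -> OK: one minor version behind
# skew_ok 1.19.0 1.18.15  -> not OK: kubelet newer than the API server
```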
# Add the Kubernetes Aliyun yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.18.15-0 kubeadm-1.18.15-0 kubectl-1.18.15-0 && systemctl enable kubelet && systemctl start kubelet
V. Initialize the master cluster
# Note: before running kubeadm init, prepare the container images Kubernetes needs.
# List the required images:
[root@k8s-master ~]# kubeadm config images list
I0421 14:58:20.854664 5292 version.go:252] remote version is much newer: v1.21.0; falling back to: stable-1.18
W0421 14:58:23.294529 5292 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
# The output lists the images this version needs:
k8s.gcr.io/kube-apiserver:v1.18.18
k8s.gcr.io/kube-controller-manager:v1.18.18
k8s.gcr.io/kube-scheduler:v1.18.18
k8s.gcr.io/kube-proxy:v1.18.18
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7
# A short script to pull the images from the Aliyun mirror and retag them as k8s.gcr.io:
cat >> alik8simages.sh << EOF
#!/bin/bash
list='kube-apiserver:v1.18.18
kube-controller-manager:v1.18.18
kube-scheduler:v1.18.18
kube-proxy:v1.18.18
pause:3.2
etcd:3.4.13-0
coredns:1.6.7'
for item in \$list
do
docker pull registry.aliyuncs.com/google_containers/\$item && docker tag registry.aliyuncs.com/google_containers/\$item k8s.gcr.io/\$item && docker rmi registry.aliyuncs.com/google_containers/\$item
done
EOF
# Run the script. Downloads may fail due to network problems; just run it again until every pull succeeds.
bash alik8simages.sh
# Verify the images
[root@k8s-master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.18.18 8bd0db6f4d0a 6 days ago 117MB
k8s.gcr.io/kube-apiserver v1.18.18 5745154baa89 6 days ago 173MB
k8s.gcr.io/kube-controller-manager v1.18.18 9fb627f53264 6 days ago 162MB
k8s.gcr.io/kube-scheduler v1.18.18 fe100f0c6984 6 days ago 96.1MB
k8s.gcr.io/etcd 3.4.13-0 0369cf4303ff 7 months ago 253MB
k8s.gcr.io/pause 3.2 80d28bedfe5d 14 months ago 683kB
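Before moving on to kubeadm init, one can confirm programmatically that every required image is cached. A sketch — `check_images` is a helper of my own, not a kubeadm subcommand:

```shell
#!/bin/sh
# check_images: confirm each named image exists in the local Docker cache,
# so kubeadm init does not stall trying to pull from k8s.gcr.io.
check_images() {
    missing=0
    for img in "$@"; do
        if docker image inspect "$img" >/dev/null 2>&1; then
            echo "present: $img"
        else
            echo "missing: $img"
            missing=1
        fi
    done
    return $missing
}
# usage: check_images $(kubeadm config images list 2>/dev/null)
```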
# Initialize the control plane. Note: the following steps run only on k8s-master (master01).
1. Generate a default init configuration:
[root@k8s-master ~]# kubeadm config print init-defaults > kubeadm-config.yml
2. Edit the configuration:
[root@k8s-master ~]# cat kubeadm-config.yml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.0.100   # change to this node's local listen address
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "172.16.0.10:8443"   # the load-balancer VIP and port
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.18.0   # note: this pins the image versions kubeadm pulls
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
# Start the initialization
[root@k8s-master ~]# kubeadm init --config kubeadm-config.yml
W0421 15:13:55.397335 6965 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.0
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.18.0: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
...
Note: `kubeadm config images list` can report a version other than the one init actually uses — here it listed v1.18.18, while the config pins kubernetesVersion: v1.18.0, so init tried to pull v1.18.0 images that were never downloaded, producing the error above. When that happens, pull the image versions named in the error message (v1.18.0 here) and re-run the init.
Once the init succeeds, save the following output.
# the command the other masters run to join the control plane
kubeadm join 172.16.0.10:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:bfdc983afbdbe560a0ebf2c1bf1007c22b04a76a441fb49ad2e955ad2d588977 \
--control-plane
# the command worker nodes run to join
kubeadm join 172.16.0.10:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:bfdc983afbdbe560a0ebf2c1bf1007c22b04a76a441fb49ad2e955ad2d588977
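If this output is lost, it does not have to be reconstructed by hand: on master01, `kubeadm token create --print-join-command` prints a fresh join command, and the --discovery-token-ca-cert-hash value can be recomputed from the CA certificate. A sketch — the function wrapper is mine, while the openssl pipeline is the standard one from the kubeadm documentation and assumes an RSA CA key (kubeadm's default):

```shell
#!/bin/sh
# ca_cert_hash <path-to-ca.crt>: print the hex digest kubeadm expects
# after the "sha256:" prefix in --discovery-token-ca-cert-hash.
ca_cert_hash() {
    openssl x509 -pubkey -in "$1" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}
# on master01: ca_cert_hash /etc/kubernetes/pki/ca.crt
```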
3. Set up the kubeconfig for API access
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
4. Install the cluster network add-on (flannel)
[root@k8s-master ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Output:
podsecuritypolicy.extensions/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created
Or install Canal instead (the Calico version segment of this URL appears garbled; it should have the form v3.<minor>/manifests/canal.yaml):
[root@k8s-master ~]# kubectl apply -f https://docs.projectcalico.org/v3.manifests/canal.yaml
configmap/canal-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-calico created
daemonset.apps/canal created
serviceaccount/canal created
VI. Join master02 and master03 to the cluster
1. Create the pki/etcd directory:
[root@k8s-master03 kubernetes]# mkdir -p /etc/kubernetes/pki/etcd
2. Copy the certificates from master01 to both nodes. The script below requires passwordless SSH to the other masters to be set up in advance, and the target directory must already exist on both of them:
mkdir -p /etc/kubernetes/pki/etcd/
[root@k8s-master ~]# cat cpkey.sh
#!/bin/bash
USER=root                                        # SSH account
CONTROL_PLANE_IPS="172.16.0.200 172.16.0.201"    # the joining nodes
dir=/etc/kubernetes/pki/
for host in ${CONTROL_PLANE_IPS}; do
    scp /etc/kubernetes/pki/ca.crt "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/ca.key "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/sa.key "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/sa.pub "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/front-proxy-ca.crt "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/front-proxy-ca.key "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/etcd/ca.crt "${USER}"@$host:${dir}etcd
    # comment out the next line if you are using an external etcd
    scp /etc/kubernetes/pki/etcd/ca.key "${USER}"@$host:${dir}etcd
done
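Before running kubeadm join on master02/03, it is worth confirming the copy actually landed — the error in the troubleshooting section further below is exactly what you get when it did not. A small sketch; `check_certs` is a hypothetical helper of my own:

```shell
#!/bin/sh
# check_certs <pki-dir>: verify every certificate kubeadm needs for a
# control-plane join is present under the given directory.
check_certs() {
    for f in ca.crt ca.key sa.key sa.pub \
             front-proxy-ca.crt front-proxy-ca.key \
             etcd/ca.crt etcd/ca.key; do
        [ -f "$1/$f" ] || { echo "missing: $1/$f"; return 1; }
    done
    echo "all control-plane certs present"
}
# on master02/03: check_certs /etc/kubernetes/pki
```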
3. Join master02 and master03 to the control plane. Note that the join command must include the --control-plane flag here; without it, the node joins as a worker:
[root@k8s-master03 kubernetes]# kubeadm join 172.16.0.10:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:bfdc983afbdbe560a0ebf2c1bf1007c22b04a76a441fb49ad2e955ad2d588977 \
    --control-plane
4. Set up the kubeconfig for API access on the new masters
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Finally, check the cluster state:
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 58m v1.18.15
k8s-master02 Ready master 15m v1.18.15
k8s-master03 Ready master 5m39s v1.18.15
Problems encountered
If the certificates were not copied over from the primary master, the control-plane join fails because the new node cannot bring up its own API server:
[root@k8s-master02 ~]# kubeadm join 172.16.0.10:8443 --token abcdef.0123456789abcdef \
> --discovery-token-ca-cert-hash sha256:bfdc983afbdbe560a0ebf2c1bf1007c22b04a76a441fb49ad2e955ad2d588977 \
> --control-plane
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 19.03
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight:
One or more conditions for hosting a new control plane instance is not satisfied.
failure loading certificate for CA: couldn't load the certificate file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.
To see the stack trace of this error execute with --v=5 or higher