I. Environment preparation

1. Set the hostname (run each command on its own node)

hostnamectl set-hostname k8s-master01
hostnamectl set-hostname k8s-master02
hostnamectl set-hostname k8s-master03

2. Edit the hosts file
vi /etc/hosts
## Add the following entries
172.16.0.10          k8svip
172.16.0.100         k8s-master01
172.16.0.200         k8s-master02
172.16.0.201         k8s-master03

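3. Disable the firewall, SELinux and swap
## Kernel parameters: pass bridged IPv4/IPv6 traffic to the iptables chains
## (these keys only exist once the br_netfilter module is loaded, e.g. with modprobe br_netfilter)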
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
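## Reload the sysctl configuration
sysctl --system
## Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
## Set SELinux to permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
## Turn off swap now and comment out the swap entries in /etc/fstab
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
## Enable IP forwarding (this write is lost on reboot; add net.ipv4.ip_forward = 1 to /etc/sysctl.d/k8s.conf to make it permanent)
echo '1' > /proc/sys/net/ipv4/ip_forward
## Time synchronization (run on all three machines)
yum install ntpdate -y && timedatectl set-timezone Asia/Shanghai && ntpdate time.windows.com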
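The sed command above comments out every line that contains the word "swap", including lines that are already comments. A slightly more careful variant is sketched below; the function name `comment_out_swap` is made up for this example, and GNU sed (as shipped with CentOS) is assumed.

```shell
#!/bin/sh
# Comment out active swap entries in an fstab-style file.
# Only lines whose third field (the filesystem type) is "swap" are touched,
# and lines that already start with "#" are left alone.
# Usage: comment_out_swap /etc/fstab   (run as root for the real file)
comment_out_swap() {
    fstab_file=$1
    sed -i -E '/^[[:space:]]*#/!s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab_file"
}
```

It is worth trying the function on a copy of /etc/fstab first and checking the result with `swapon --show` after a reboot.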

II. Install keepalived and haproxy on all master nodes


1. Install keepalived and haproxy

yum install -y keepalived haproxy

2. Configure keepalived
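## The quoted 'EOF' keeps the ${...} placeholders literal so that they can be replaced afterwards
cat <<'EOF' > /etc/keepalived/keepalived.conf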
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"   
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state  ${STATE} 
    interface ${INTERFACE}
    virtual_router_id  ${ROUTER_ID}
    priority ${PRIORITY}
    authentication {
        auth_type PASS
        auth_pass ${AUTH_PASS}
    }
    virtual_ipaddress {
        ${APISERVER_VIP}
    }
    track_script {
        check_apiserver
    }
}

EOF
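Replace the placeholders in the file above with your own values:
/etc/keepalived/check_apiserver.sh   path of the health check script
${STATE}                             MASTER on the primary node, BACKUP on the others. Here k8s-master01 is MASTER; k8s-master02 and k8s-master03 are BACKUP;
${INTERFACE}                         the network interface that carries the VIP; on my servers this is eth0;
${ROUTER_ID}                         must be identical on every node of the keepalived cluster; I use the default value 51;
${PRIORITY}                          must be higher on the master than on the backups. I use 100 on the master and 50 on the backups. Note that with weight -2 a failed check only lowers 100 to 98, which still beats 50, so the VIP moves on a health check failure only if the priorities are less than 2 apart;
${AUTH_PASS}                         must be identical on every node of the keepalived cluster;
${APISERVER_VIP}                     the virtual IP address, here 172.16.0.10.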
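As an illustration of how the placeholders map to per-node values, the sketch below prints the vrrp_instance block for one node. The function name `render_vrrp_instance` and the auth_pass value `42` are invented for this example; the other values are the ones listed above.

```shell
#!/bin/sh
# Print a vrrp_instance block for one node.
# Arguments: STATE INTERFACE ROUTER_ID PRIORITY AUTH_PASS VIP
render_vrrp_instance() {
    cat <<EOF
vrrp_instance VI_1 {
    state $1
    interface $2
    virtual_router_id $3
    priority $4
    authentication {
        auth_type PASS
        auth_pass $5
    }
    virtual_ipaddress {
        $6
    }
    track_script {
        check_apiserver
    }
}
EOF
}

# k8s-master01 in this guide; "42" is a placeholder password.
render_vrrp_instance MASTER eth0 51 100 42 172.16.0.10
```

On k8s-master02 and k8s-master03 only the first and fourth arguments change (BACKUP and the lower priority).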

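3. Configure the keepalived health check
The configuration above already sets the health check parameters, such as the check interval and the weight.
Create the script and make it executable (chmod +x /etc/keepalived/check_apiserver.sh):

vi /etc/keepalived/check_apiserver.sh
### Add the following content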
#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi


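${APISERVER_VIP} is the VIP address, 172.16.0.10;
${APISERVER_DEST_PORT} is the load-balanced port used to reach the API server, that is, the port the HAProxy frontend binds to. Because HAProxy runs on the same hosts as the API server, it must differ from the API server's own port 6443; I use 8443, as described below.
Replace both placeholders in the script with real values (or assign them as variables at the top of the script).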
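One caveat with `grep -q ${APISERVER_VIP}` in this particular address plan: grep matches substrings, so the pattern 172.16.0.10 also matches 172.16.0.100, the address of k8s-master01, and the VIP branch runs on that node even when it does not hold the VIP. A stricter match could look like the sketch below; the helper name `has_ipv4_address` is an assumption of this example, not part of the original script.

```shell
#!/bin/sh
# Succeed only if the given address appears as a complete IPv4 address
# in `ip addr` style output read from stdin.
has_ipv4_address() {
    # -F treats the address literally (dots are not wildcards); the
    # leading "inet " and trailing "/" prevent 172.16.0.10 from
    # matching 172.16.0.100.
    grep -qF "inet $1/"
}
```

In the health check script it would be used as `if ip -o -4 addr show | has_ipv4_address "${APISERVER_VIP}"; then ... fi`.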

 4. Configure haproxy

# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxys to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:${APISERVER_DEST_PORT}   
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
        
        
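Only the following values in the configuration above need to be changed:
${APISERVER_DEST_PORT} is the frontend load-balancing port, the same value as in the health check script above; I use 8443;
${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} defines a backend node as name IP:port, with one server line per master. My configuration is: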
### server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
    server k8s-master01 172.16.0.100:6443 check
    server k8s-master02 172.16.0.200:6443 check
    server k8s-master03 172.16.0.201:6443 check
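With more than a handful of masters, the server lines can be generated instead of typed. The following is a minimal sketch under that assumption; `haproxy_server_lines` is a hypothetical helper name, and the node list is the one used in this guide.

```shell
#!/bin/sh
# Read "name address" pairs from stdin and print one HAProxy server
# line per pair, all pointing at the same API server port.
haproxy_server_lines() {
    port=$1
    while read -r name address; do
        # Skip blank lines.
        [ -n "$name" ] || continue
        printf '    server %s %s:%s check\n' "$name" "$address" "$port"
    done
}

haproxy_server_lines 6443 <<'EOF'
k8s-master01 172.16.0.100
k8s-master02 172.16.0.200
k8s-master03 172.16.0.201
EOF
```

After editing the file, `haproxy -c -f /etc/haproxy/haproxy.cfg` checks the configuration syntax without starting the service.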

 5. Start the services and enable them at boot

systemctl enable haproxy --now
systemctl enable keepalived --now


The backend API servers are not running yet, so HAProxy cannot reach them; the following messages can be ignored for now:
[root@k8s-master01 ~]# 
Broadcast message from systemd-journald@k8s-master (Wed 2021-04-21 14:39:55 CST):

haproxy[4758]: backend apiserver has no server available!


Broadcast message from systemd-journald@k8s-master (Wed 2021-04-21 14:39:55 CST):

haproxy[4758]: backend apiserver has no server available!


Message from syslogd@k8s-master at Apr 21 14:39:55 ...
 haproxy[4758]:backend apiserver has no server available!

Message from syslogd@k8s-master at Apr 21 14:39:55 ...
 haproxy[4758]:backend apiserver has no server available!

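 IV. Install docker

On a machine with network access, download the docker version you need; it must be compatible with the kubectl, kubelet and kubeadm versions you install.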

# 1: Install the required system tools
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# 2: Add the repository
sudo yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 3: Refresh the cache and install Docker-CE
sudo yum makecache fast
sudo yum -y install docker-ce
# 4: Start the Docker service and enable it at boot
sudo systemctl start docker && sudo systemctl enable docker

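# Notes:
# The official repository enables only the latest stable packages by default. Other versions can be obtained by editing the repository file. For example, the test repository is disabled by default and can be enabled as follows; other channels work the same way.
# vim /etc/yum.repos.d/docker-ce.repo
#   Under [docker-ce-test], change enabled=0 to enabled=1
#
# Installing a specific Docker-CE version:
# Step 1: List the available Docker-CE versions: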
# yum list docker-ce.x86_64 --showduplicates | sort -r
#   Loading mirror speeds from cached hostfile
#   Loaded plugins: branch, fastestmirror, langpacks
#   docker-ce.x86_64            17.03.1.ce-1.el7.centos            docker-ce-stable
#   docker-ce.x86_64            17.03.1.ce-1.el7.centos            @docker-ce-stable
#   docker-ce.x86_64            17.03.0.ce-1.el7.centos            docker-ce-stable
#   Available Packages
# Step 2: Install the chosen Docker-CE version (VERSION is, for example, 17.03.0.ce-1.el7.centos from the list above)
# sudo yum -y install docker-ce-[VERSION]

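# Docker registry mirror: "https://s2q9fn53.mirror.aliyuncs.com" is my own accelerator address; log in to Alibaba Cloud and look up yours under Container Registry.
# The mirror is configured in the daemon configuration file /etc/docker/daemon.json
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'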
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://s2q9fn53.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload && sudo systemctl restart docker

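V. Install kubelet, kubeadm and kubectl

kubeadm: the command that bootstraps the cluster.
kubelet: runs on every node of the cluster and starts pods and containers.
kubectl: the command-line tool for talking to the cluster.
kubeadm does not install or manage kubelet or kubectl for you, so you must make sure their versions match the control plane that kubeadm installs. Otherwise there is a risk of version skew, which can cause unexpected errors. A difference of one minor version between the control plane and the kubelet is supported, but the kubelet version may never exceed the API server version. For example, a 1.7.0 kubelet is fully compatible with a 1.8.0 API server, but not the other way around.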
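The rule in the paragraph above can be written down as a small check. This sketch encodes exactly the rule as stated there (same major version, kubelet not newer than the API server, at most one minor version behind); the function name `kubelet_skew_ok` is made up, and the allowed gap differs between Kubernetes releases, so the official version skew policy for your release takes precedence.

```shell
#!/bin/sh
# Return success if a kubelet version is compatible with an API server
# version under the rule described in the text above.
# Usage: kubelet_skew_ok KUBELET_VERSION APISERVER_VERSION
kubelet_skew_ok() {
    kubelet=${1#v}; apiserver=${2#v}
    k_major=${kubelet%%.*};   k_rest=${kubelet#*.};   k_minor=${k_rest%%.*}
    a_major=${apiserver%%.*}; a_rest=${apiserver#*.}; a_minor=${a_rest%%.*}
    [ "$k_major" -eq "$a_major" ] || return 1
    # Positive gap: the kubelet is older than the API server.
    gap=$((a_minor - k_minor))
    [ "$gap" -ge 0 ] && [ "$gap" -le 1 ]
}
```

For example, `kubelet_skew_ok 1.7.0 1.8.0` succeeds while `kubelet_skew_ok 1.8.0 1.7.0` fails, matching the example in the text.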

# Add the Aliyun Kubernetes YUM repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

yum install -y kubelet-1.18.15-0 kubeadm-1.18.15-0 kubectl-1.18.15-0 && systemctl enable kubelet && systemctl start kubelet

 VI. Initialize the master cluster

# Before running kubeadm init, prepare the container images Kubernetes needs
# List the images required by this kubeadm version
[root@k8s-master01 ~]# kubeadm config images list
I0421 14:58:20.854664    5292 version.go:252] remote version is much newer: v1.21.0; falling back to: stable-1.18
W0421 14:58:23.294529    5292 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
# The images this version needs are listed below
k8s.gcr.io/kube-apiserver:v1.18.18
k8s.gcr.io/kube-controller-manager:v1.18.18
k8s.gcr.io/kube-scheduler:v1.18.18
k8s.gcr.io/kube-proxy:v1.18.18
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7


# A small shell script that pulls the required images from the Aliyun mirror and retags them as k8s.gcr.io
# (the image tags must match the output of kubeadm config images list above)
cat > alik8simages.sh << EOF
#!/bin/bash
list='kube-apiserver:v1.18.18
kube-controller-manager:v1.18.18
kube-scheduler:v1.18.18
kube-proxy:v1.18.18
pause:3.2
etcd:3.4.3-0
coredns:1.6.7'
for item in \$list
  do

    docker pull registry.aliyuncs.com/google_containers/\$item && docker tag registry.aliyuncs.com/google_containers/\$item k8s.gcr.io/\$item && docker rmi registry.aliyuncs.com/google_containers/\$item

  done
EOF
# Run the script; a failed pull is usually a network problem, so just run it again a few times
bash alik8simages.sh
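Rather than rerunning the script by hand, flaky pulls can be wrapped in a retry loop. The helper below is a generic sketch; the name `retry` and its argument order are choices of this example, not something the original script provides.

```shell
#!/bin/sh
# Run a command until it succeeds, at most MAX_ATTEMPTS times,
# sleeping DELAY seconds between attempts.
# Usage: retry MAX_ATTEMPTS DELAY command [args...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

It is best applied to the individual pull, for example `retry 5 10 docker pull registry.aliyuncs.com/google_containers/pause:3.2`, because the exit status of the whole script only reflects the last image in the loop.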

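# Check the images. The listing below is the author's recorded output; make sure every image and tag printed by kubeadm config images list is present locally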
[root@k8s-master ~]# docker images
REPOSITORY                           TAG        IMAGE ID       CREATED         SIZE
k8s.gcr.io/kube-proxy                v1.18.18   8bd0db6f4d0a   6 days ago      117MB
k8s.gcr.io/kube-apiserver            v1.18.18   5745154baa89   6 days ago      173MB
k8s.gcr.io/kube-controller-manager   v1.18.18   9fb627f53264   6 days ago      162MB
k8s.gcr.io/kube-scheduler            v1.18.18   fe100f0c6984   6 days ago      96.1MB
k8s.gcr.io/etcd                      3.4.13-0   0369cf4303ff   7 months ago    253MB
k8s.gcr.io/pause                     3.2        80d28bedfe5d   14 months ago   683kB

# Initialize the first master. Note: the following steps are run on master01 only
1. Generate the default init configuration file
[root@k8s-master01~]# kubeadm  config print init-defaults >kubeadm-config.yml
    

2. Edit the configuration file:
[root@k8s-master01~]# cat kubeadm-config.yml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.0.100  # change to this node's own listen address
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "172.16.0.10:8443"  # change to the load balancer (VIP) address and port
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.18.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16  # needed when flannel is used as the network add-on
  serviceSubnet: 10.96.0.0/12
scheduler: {}


# Start the initialization

[root@k8s-master01~]# kubeadm  init --config kubeadm-config.yml 
W0421 15:13:55.397335    6965 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.0
[preflight] Running pre-flight checks
        [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'

error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.18.0: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    ...

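Note: the versions printed by kubeadm config images list are sometimes not the ones kubeadm init actually needs, which produces the error above. Here kubernetesVersion in the configuration file is v1.18.0 while the prepared images are v1.18.18. In that case download the images for the version named in the error (v1.18.0) and retry the initialization, or pass the options on the command line instead. The addresses and version in the following command are sample values; substitute your own (in this guide the node IP is 172.16.0.100 and the load balancer endpoint is 172.16.0.10:8443):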

kubeadm init \
  --apiserver-advertise-address=172.16.5.5 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.1 \
  --control-plane-endpoint=172.16.5.123 \
  --service-cidr=10.1.0.0/16 \
  --pod-network-cidr=10.244.0.0/16 \
  --v=5
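# --image-repository string:     where images are pulled from (available since 1.13). The default is k8s.gcr.io; here it is set to the domestic mirror registry.aliyuncs.com/google_containers
# --kubernetes-version string:   the Kubernetes version. The default is stable-1, which makes kubeadm fetch the latest version number from https://dl.k8s.io/release/stable-1.txt; pinning a fixed version (v1.23.1 in the command above) skips that network request.
# --apiserver-advertise-address: which interface of the master is used to communicate with the other nodes of the cluster. If the master has several interfaces, specify one explicitly; otherwise kubeadm picks the interface that has the default gateway. The IP here is the master node's IP; remember to change it.
# --pod-network-cidr:            the Pod network range. Kubernetes supports many network add-ons, and each has its own requirements for --pod-network-cidr. It is set to 10.244.0.0/16 here because flannel is used, and the value must match the network in the flannel manifest (10.244.0.0/16 by default).
# --control-plane-endpoint:      the shared endpoint for all control plane nodes. It can be a custom DNS name such as cluster-endpoint that maps to an IP through a hosts entry (here: 127.0.0.1   cluster-endpoint). You can then pass --control-plane-endpoint=cluster-endpoint to kubeadm init and the same DNS name to kubeadm join, and later repoint cluster-endpoint to the load balancer address of a high-availability setup.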

 Install the cluster network add-on flannel

[root@k8s-master01~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Output:
podsecuritypolicy.extensions/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created


Or install Canal:
# kubectl apply -f https://docs.projectcalico.org/v3.manifests/canal.yaml
configmap/canal-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-calico created
daemonset.apps/canal created
serviceaccount/canal created
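
Join master02 and master03 to the cluster
1. Create the pki and etcd directories (on master02 and master03)
[root@k8s-master03 kubernetes]# mkdir  -p /etc/kubernetes/pki/etcd

2. Copy the certificates from master01 to the two nodes. The script requires passwordless SSH login to be set up beforehand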
[root@k8s-master01~]# cat cpkey.sh 
USER=root # login account
CONTROL_PLANE_IPS="172.16.0.200 172.16.0.201" # IPs of the other master nodes
dir=/etc/kubernetes/pki/
for host in ${CONTROL_PLANE_IPS}; do
    scp /etc/kubernetes/pki/ca.crt "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/ca.key "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/sa.key "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/sa.pub "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/front-proxy-ca.crt "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/front-proxy-ca.key "${USER}"@$host:${dir}
    scp /etc/kubernetes/pki/etcd/ca.crt "${USER}"@$host:${dir}etcd
    # Comment out the next line if you are using external etcd
    scp /etc/kubernetes/pki/etcd/ca.key "${USER}"@$host:${dir}etcd
done

3. Join master02 and master03 to the cluster as control plane nodes
[root@k8s-master03 kubernetes]# kubeadm join 172.16.0.10:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:bfdc983afbdbe560a0ebf2c1bf1007c22b04a76a441fb49ad2e955ad2d588977 \
    --control-plane
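The value after sha256: is a hash of the cluster CA's public key. If the output of kubeadm init is no longer at hand, it can be recomputed on master01 with the openssl pipeline from the kubeadm documentation, wrapped here in a function whose name `ca_cert_hash` is an assumption of this sketch. It assumes an RSA CA key, which is what kubeadm generates by default.

```shell
#!/bin/sh
# Print the sha256 hash of the public key in a CA certificate, in the
# form expected after "sha256:" in --discovery-token-ca-cert-hash.
# Usage: ca_cert_hash /etc/kubernetes/pki/ca.crt
ca_cert_hash() {
    openssl x509 -pubkey -noout -in "$1" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}
```

The bootstrap token in the configuration file has a ttl of 24h; once it has expired, `kubeadm token create --print-join-command` on master01 prints a fresh join command, to which --control-plane is appended for master nodes.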
  
4. Set up the kubeconfig file that kubectl needs

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

 Finally, check the cluster status

 kubectl get nodes
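All three masters should eventually report Ready once the network add-on is running. To spot stragglers quickly, the node list can be filtered as in this sketch; the function name `not_ready_nodes` is invented here, and the input is expected in the column layout of `kubectl get nodes --no-headers` (name first, status second).

```shell
#!/bin/sh
# Read `kubectl get nodes --no-headers` output from stdin and print the
# names of nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}
```

Typical use is `kubectl get nodes --no-headers | not_ready_nodes`; empty output means every node is Ready.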

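 Problems encountered


This problem occurs when the certificates of the primary master have not been copied to the joining node, so kubeadm cannot load the CA certificate and the join fails: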

[root@k8s-master02 ~]#  kubeadm join 172.16.0.10:8443 --token abcdef.0123456789abcdef \
>     --discovery-token-ca-cert-hash sha256:bfdc983afbdbe560a0ebf2c1bf1007c22b04a76a441fb49ad2e955ad2d588977 \
>     --control-plane
[preflight] Running pre-flight checks
        [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 19.03
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: 
One or more conditions for hosting a new control plane instance is not satisfied.

failure loading certificate for CA: couldn't load the certificate file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory

Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.


To see the stack trace of this error execute with --v=5 or higher
