Deploying a Kubernetes Cluster with Kubespray
1. Introduction to Kubespray
Kubespray is an open-source project for deploying production-grade Kubernetes clusters. It uses Ansible as its deployment tool.
- Can deploy to AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (experimental), or bare metal
- Highly available clusters
- Composable components (for example, choice of network plugin)
- Supports the most popular Linux distributions
- Continuous-integration tested
Official site: https://kubespray.io
Project repository: https://github.com/kubernetes-sigs/kubespray
2. Online Deployment
China's network environment makes using kubespray particularly difficult: some images must be pulled from gcr.io and some binaries downloaded from GitHub, so it helps to download them in advance and import the images.
Note: a highly available deployment requires 3 etcd nodes, so an HA cluster needs at least 3 nodes.
kubespray needs a deployment node, which can be any node of the cluster. Here we install kubespray on the first master node (192.168.54.211) and run all subsequent steps there.
2.1 Environment Preparation
1. Server planning
| ip | hostname |
|---|---|
| 192.168.54.211 | master |
| 192.168.54.212 | slave1 |
| 192.168.54.213 | slave2 |
2. Set the hostnames
# Run the corresponding command on each of the three hosts
$ hostnamectl set-hostname master
$ hostnamectl set-hostname slave1
$ hostnamectl set-hostname slave2
# Check the current hostname
$ hostname
3. Map IPs to hostnames
# Set on all three hosts
$ cat >> /etc/hosts << EOF
192.168.54.211 master
192.168.54.212 slave1
192.168.54.213 slave2
EOF
2.2 Download kubespray
# Run on the master node
# Download an official release
$ wget https://github.com/kubernetes-sigs/kubespray/archive/v2.16.0.tar.gz
$ tar -zxvf v2.16.0.tar.gz
# Or clone the repository directly
$ git clone https://github.com/kubernetes-sigs/kubespray.git -b v2.16.0 --depth=1
2.3 Install Dependencies
# Run on the master node
$ cd kubespray-2.16.0/
$ yum install -y epel-release python3-pip
$ pip3 install -r requirements.txt
If an error occurs:
# Error 1
Complete output from command python setup.py egg_info:
=============================DEBUG ASSISTANCE==========================
If you are seeing an error here please try the following to
successfully install cryptography:
Upgrade to the latest pip and try again. This will fix errors for most
users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
=============================DEBUG ASSISTANCE==========================
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-3w9d_1bk/cryptography/setup.py", line 17, in <module>
from setuptools_rust import RustExtension
ModuleNotFoundError: No module named 'setuptools_rust'
----------------------------------------
# Fix
$ pip3 install --upgrade cryptography==3.2
# Error 2
Exception: command 'gcc' failed with exit status 1
# Fix
# python2
$ yum install gcc libffi-devel python-devel openssl-devel -y
# python3
$ yum install gcc libffi-devel python3-devel openssl-devel -y
2.4 Update the Ansible Inventory
Check the Ansible version:
[root@master kubespray-2.16.0]# ansible --version
ansible 2.9.20
config file = /root/kubespray-2.16.0/ansible.cfg
configured module search path = ['/root/kubespray-2.16.0/library']
ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.6.8 (default, Nov 16 2020, 16:55:22) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
Update the Ansible inventory file; IPS should contain the internal IPs of the 3 instances:
# Run on the master node
[root@master kubespray-2.16.0]# cp -rfp inventory/sample inventory/mycluster
[root@master kubespray-2.16.0]# declare -a IPS=( 192.168.54.211 192.168.54.212 192.168.54.213)
[root@master kubespray-2.16.0]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube_control_plane
DEBUG: Adding group kube_node
DEBUG: Adding group etcd
DEBUG: Adding group k8s_cluster
DEBUG: Adding group calico_rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube_control_plane
DEBUG: adding host node2 to group kube_control_plane
DEBUG: adding host node1 to group kube_node
DEBUG: adding host node2 to group kube_node
DEBUG: adding host node3 to group kube_node
2.5 Modify the Node Information
Review the auto-generated hosts.yaml; kubespray plans node roles automatically based on the number of nodes provided. Here 2 nodes serve as masters, all 3 nodes serve as worker nodes, and all 3 nodes also run etcd.
Edit the inventory/mycluster/hosts.yaml file:
# Run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    slave1:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    slave2:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
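As an optional sanity check (not part of the original procedure), Ansible can render the inventory so you can confirm the group membership before running any playbook:
# Run on the master node (optional check)
$ ansible-inventory -i inventory/mycluster/hosts.yaml --graph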
2.6 Global Variables (defaults are fine)
[root@master kubespray-2.16.0]# cat inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
2.7 Modify the Cluster Configuration
The default Kubernetes version is relatively old; specify the version explicitly:
# Run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.20.7
For any other customization, edit the inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml file.
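If you are unsure which versions a given kubespray release supports, a rough way to check is to grep the role defaults; the paths below are an assumption based on the usual repository layout and may differ between releases:
# Run on the master node (optional; paths may vary by release)
$ grep -rn "^kube_version:" roles/kubespray-defaults/defaults/
$ grep -n "v1.20." roles/download/defaults/main.yml | head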
2.8 Cluster Add-ons
Add-ons such as the Kubernetes Dashboard and ingress controllers are configured in the following file:
$ vim inventory/mycluster/group_vars/k8s_cluster/addons.yml
This file is left unchanged here.
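For reference, enabling common add-ons in that file usually amounts to flipping a few booleans; the snippet below is illustrative only and was not applied in this walkthrough (confirm the exact variable names in your copy of addons.yml):
# Illustrative only - not applied in this walkthrough
dashboard_enabled: true
metrics_server_enabled: true
ingress_nginx_enabled: true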
2.9 Passwordless SSH
Configure passwordless SSH so that the kubespray (Ansible) node can reach all nodes without a password.
# Run on the master node
ssh-keygen
ssh-copy-id 192.168.54.211
ssh-copy-id 192.168.54.212
ssh-copy-id 192.168.54.213
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
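Once the keys are distributed, a quick optional check confirms that Ansible can reach every node in the inventory:
# Run on the master node (optional check)
$ ansible -i inventory/mycluster/hosts.yaml all -m ping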
2.10 Change the Image and Download Sources
# Run on the master node
[root@master kubespray-2.16.0]# cat > inventory/mycluster/group_vars/k8s_cluster/vars.yml << EOF
gcr_image_repo: "registry.aliyuncs.com/google_containers"
kube_image_repo: "registry.aliyuncs.com/google_containers"
etcd_download_url: "https://ghproxy.com/https://github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
cni_download_url: "https://ghproxy.com/https://github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
calicoctl_download_url: "https://ghproxy.com/https://github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
calico_crds_download_url: "https://ghproxy.com/https://github.com/projectcalico/calico/archive/{{ calico_version }}.tar.gz"
crictl_download_url: "https://ghproxy.com/https://github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
nodelocaldns_image_repo: "cncamp/k8s-dns-node-cache"
dnsautoscaler_image_repo: "cncamp/cluster-proportional-autoscaler-amd64"
EOF
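Before launching the playbook, it can be worth confirming that the mirror registry and the GitHub proxy are reachable from the deployment node. A minimal sketch; the CNI plugin version below is taken from the v2.16.0 cache listing shown later in this article and may differ for other releases:
# Run on the master node (optional reachability check)
$ curl -sI https://registry.aliyuncs.com/v2/ | head -n 1
$ curl -sIL https://ghproxy.com/https://github.com/containernetworking/plugins/releases/download/v0.9.1/cni-plugins-linux-amd64-v0.9.1.tgz | head -n 1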
2.11 Install the Cluster
Run the kubespray playbook to install the cluster:
# Run on the master node
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Many binaries and images are downloaded during installation.
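Because the run can take a long time, it can help to increase verbosity and keep a log. This is an optional variant of the same command, not part of the original steps:
# Run on the master node (optional)
$ ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml -v 2>&1 | tee cluster-install.log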
Output like the following indicates a successful run:
PLAY RECAP *************************************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=584 changed=109 unreachable=0 failed=0 skipped=1160 rescued=0 ignored=1
slave1 : ok=520 changed=97 unreachable=0 failed=0 skipped=1008 rescued=0 ignored=0
slave2 : ok=438 changed=76 unreachable=0 failed=0 skipped=678 rescued=0 ignored=0
Saturday 31 December 2022 20:07:57 +0800 (0:00:00.060) 0:59:12.196 *****
===============================================================================
container-engine/docker : ensure docker packages are installed ----------------------------------------------- 2180.79s
kubernetes/preinstall : Install packages requirements --------------------------------------------------------- 487.24s
download_file | Download item ---------------------------------------------------------------------------------- 58.95s
download_file | Download item ---------------------------------------------------------------------------------- 50.40s
download_container | Download image if required ---------------------------------------------------------------- 44.25s
download_file | Download item ---------------------------------------------------------------------------------- 42.65s
download_container | Download image if required ---------------------------------------------------------------- 38.06s
download_container | Download image if required ---------------------------------------------------------------- 32.38s
kubernetes/kubeadm : Join to cluster --------------------------------------------------------------------------- 32.29s
download_container | Download image if required ---------------------------------------------------------------- 30.67s
download_file | Download item ---------------------------------------------------------------------------------- 25.82s
kubernetes/control-plane : Joining control plane node to the cluster. ------------------------------------------ 25.60s
download_container | Download image if required ---------------------------------------------------------------- 25.34s
download_container | Download image if required ---------------------------------------------------------------- 22.49s
kubernetes/control-plane : kubeadm | Initialize first master --------------------------------------------------- 20.90s
download_container | Download image if required ---------------------------------------------------------------- 20.14s
download_file | Download item ---------------------------------------------------------------------------------- 19.50s
download_container | Download image if required ---------------------------------------------------------------- 17.84s
download_container | Download image if required ---------------------------------------------------------------- 13.96s
download_container | Download image if required ---------------------------------------------------------------- 13.31s
2.12 Inspect the Cluster
# Run on the master node
[root@master ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane,master 10m v1.20.7 192.168.54.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave1 Ready control-plane,master 9m38s v1.20.7 192.168.54.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave2 Ready <none> 8m40s v1.20.7 192.168.54.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master ~]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7c5b64bf96-wtmxn 1/1 Running 0 8m41s
calico-node-c6rr6 1/1 Running 0 9m6s
calico-node-l59fj 1/1 Running 0 9m6s
calico-node-n9tg6 1/1 Running 0 9m6s
coredns-f944c7f7c-n2wzp 1/1 Running 0 8m26s
coredns-f944c7f7c-x2tfl 1/1 Running 0 8m22s
dns-autoscaler-557bfb974d-6cbtk 1/1 Running 0 8m24s
kube-apiserver-master 1/1 Running 0 10m
kube-apiserver-slave1 1/1 Running 0 10m
kube-controller-manager-master 1/1 Running 0 10m
kube-controller-manager-slave1 1/1 Running 0 10m
kube-proxy-czk9s 1/1 Running 0 9m17s
kube-proxy-gwfc8 1/1 Running 0 9m17s
kube-proxy-tkxlf 1/1 Running 0 9m17s
kube-scheduler-master 1/1 Running 0 10m
kube-scheduler-slave1 1/1 Running 0 10m
nginx-proxy-slave2 1/1 Running 0 9m18s
nodelocaldns-4vd75 1/1 Running 0 8m23s
nodelocaldns-cr5gg 1/1 Running 0 8m23s
nodelocaldns-pmgqx 1/1 Running 0 8m23s
2.13 Inspect the Installed Images
# Run on the master node
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# Run on the slave1 node
[root@slave1 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# Run on the slave2 node
[root@slave2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
Export the images for offline use:
# Run on the master node
docker save -o kube-proxy.tar registry.aliyuncs.com/google_containers/kube-proxy:v1.20.7
docker save -o kube-controller-manager.tar registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.7
docker save -o kube-apiserver.tar registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.7
docker save -o kube-scheduler.tar registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.7
docker save -o nginx.tar nginx:1.19
docker save -o node.tar quay.io/calico/node:v3.17.4
docker save -o cni.tar quay.io/calico/cni:v3.17.4
docker save -o kube-controllers.tar quay.io/calico/kube-controllers:v3.17.4
docker save -o k8s-dns-node-cache.tar cncamp/k8s-dns-node-cache:1.17.1
docker save -o etcd.tar quay.io/coreos/etcd:v3.4.13
docker save -o cluster-proportional-autoscaler-amd64.tar cncamp/cluster-proportional-autoscaler-amd64:1.8.3
docker save -o coredns.tar registry.aliyuncs.com/google_containers/coredns:1.7.0
docker save -o pause_3.3.tar registry.aliyuncs.com/google_containers/pause:3.3
docker save -o pause_3.2.tar registry.aliyuncs.com/google_containers/pause:3.2
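On an offline target, the exported archives can be loaded back into Docker. A minimal sketch, assuming the tar files were copied into a directory named images/ (the path is illustrative):
# Run on each offline node after copying the tar files over
$ for f in images/*.tar; do docker load -i "$f"; done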
View the generated files:
# Run on the master node
[root@master ~]# tree Kubespray-2.16.0/
Kubespray-2.16.0/
├── calicoctl
├── cni-plugins-linux-amd64-v0.9.1.tgz
├── images
│ ├── cluster-proportional-autoscaler-amd64.tar
│ ├── cni.tar
│ ├── coredns.tar
│ ├── etcd.tar
│ ├── k8s-dns-node-cache.tar
│ ├── kube-apiserver.tar
│ ├── kube-controller-manager.tar
│ ├── kube-controllers.tar
│ ├── kube-proxy.tar
│ ├── kube-scheduler.tar
│ ├── nginx.tar
│ ├── node.tar
│ ├── pause_3.2.tar
│ └── pause_3.3.tar
├── kubeadm-v1.20.7-amd64
├── kubectl-v1.20.7-amd64
├── kubelet-v1.20.7-amd64
└── rpm
├── docker
│ ├── audit-libs-python-2.8.5-4.el7.x86_64.rpm
│ ├── b001-libsemanage-python-2.5-14.el7.x86_64.rpm
│ ├── b002-setools-libs-3.3.8-4.el7.x86_64.rpm
│ ├── b003-libcgroup-0.41-21.el7.x86_64.rpm
│ ├── b0041-checkpolicy-2.5-8.el7.x86_64.rpm
│ ├── b004-python-IPy-0.75-6.el7.noarch.rpm
│ ├── b005-policycoreutils-python-2.5-34.el7.x86_64.rpm
│ ├── b006-container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm
│ ├── b007-containerd.io-1.3.9-3.1.el7.x86_64.rpm
│ ├── d001-docker-ce-cli-19.03.14-3.el7.x86_64.rpm
│ ├── d002-docker-ce-19.03.14-3.el7.x86_64.rpm
│ └── d003-libseccomp-2.3.1-4.el7.x86_64.rpm
└── preinstall
├── a001-libseccomp-2.3.1-4.el7.x86_64.rpm
├── bash-completion-2.1-8.el7.noarch.rpm
├── chrony-3.4-1.el7.x86_64.rpm
├── e2fsprogs-1.42.9-19.el7.x86_64.rpm
├── ebtables-2.0.10-16.el7.x86_64.rpm
├── ipset-7.1-1.el7.x86_64.rpm
├── ipvsadm-1.27-8.el7.x86_64.rpm
├── rsync-3.1.2-10.el7.x86_64.rpm
├── socat-1.7.3.2-2.el7.x86_64.rpm
├── unzip-6.0-22.el7_9.x86_64.rpm
├── wget-1.14-18.el7_6.1.x86_64.rpm
└── xfsprogs-4.5.0-22.el7.x86_64.rpm
4 directories, 43 files
2.14 Tear Down the Cluster
To remove the cluster:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
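reset.yml normally prompts for confirmation before wiping the nodes; for a non-interactive run, the confirmation can usually be passed as an extra variable (an assumption; verify that your release honors reset_confirmation before relying on it):
# Optional non-interactive variant (assumes reset_confirmation is supported by your release)
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml -e reset_confirmation=yes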
2.15 Add Nodes
1. Add the new node information to inventory/mycluster/hosts.yaml (an illustrative snippet follows the command below)
2. Run the following command:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root scale.yml -v -b --private-key=~/.ssh/id_rsa
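For step 1, the new node entry in hosts.yaml might look like the following; the host name slave3 and the IP 192.168.54.214 are purely illustrative, and the group names should match those already used in your hosts.yaml:
all:
  hosts:
    slave3:                        # hypothetical new worker node
      ansible_host: 192.168.54.214
      ip: 192.168.54.214
      access_ip: 192.168.54.214
  children:
    kube-node:
      hosts:
        slave3: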
2.16 Remove Nodes
There is no need to modify hosts.yaml; run the following command directly:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root remove-node.yml -v -b --extra-vars "node=slave1"
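According to the kubespray documentation, several nodes can typically be removed in one run by passing a comma-separated list to the node variable; nodename1 and nodename2 below are placeholders, so confirm the behavior for your release:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root remove-node.yml -e "node=nodename1,nodename2"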
2.17 Upgrade the Cluster
[root@master kubespray-2.16.0]# ansible-playbook upgrade-cluster.yml -b -i inventory/mycluster/hosts.yaml -e kube_version=v1.22.0
3. Offline Deployment
Online deployment can fail because of network issues, so the cluster can also be deployed offline.
Below is an offline deployment example found online.
The kubespray GitHub repository is: https://github.com/kubernetes-sigs/kubespray
The release-2.15 branch is used here; the main component and OS versions are as follows:
- kubernetes v1.19.10
- docker v19.03
- calico v3.16.9
- centos 7.9.2009
The kubespray offline package can be downloaded from:
https://www.mediafire.com/file/nyifoimng9i6zp5/kubespray_offline.tar.gz/file
After downloading, extract the offline package into the /opt directory:
# Run on the master node
[root@master opt]# tar -zxvf /opt/kubespray_offline.tar.gz -C /opt/
List the files:
# Run on the master node
[root@master opt]# ll /opt/kubespray_offline
total 4
drwxr-xr-x. 4 root root 28 Jul 11 2021 ansible_install
drwxr-xr-x. 15 root root 4096 Jul 8 2021 kubespray
drwxr-xr-x. 4 root root 240 Jul 9 2021 kubespray_cache
The IP addresses of the three machines are 192.168.43.211, 192.168.43.212, and 192.168.43.213.
Set up the Ansible server:
# Run on the master node
[root@master opt]# yum install /opt/kubespray_offline/ansible_install/rpm/*
[root@master opt]# pip3 install /opt/kubespray_offline/ansible_install/pip/*
Configure passwordless SSH login between the hosts:
# Run on the master node
[root@master ~]# ssh-keygen
[root@master ~]# ssh-copy-id 192.168.43.211
[root@master ~]# ssh-copy-id 192.168.43.212
[root@master ~]# ssh-copy-id 192.168.43.213
[root@master ~]# ssh-copy-id master
[root@master ~]# ssh-copy-id slave1
[root@master ~]# ssh-copy-id slave2
Configure the Ansible host groups:
# Run on the master node
[root@master opt]# cd /opt/kubespray_offline/kubespray
[root@master kubespray]# declare -a IPS=(192.168.43.211 192.168.43.212 192.168.43.213)
[root@master kubespray]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3.6 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube-master
DEBUG: Adding group kube-node
DEBUG: Adding group etcd
DEBUG: Adding group k8s-cluster
DEBUG: Adding group calico-rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube-master
DEBUG: adding host node2 to group kube-master
DEBUG: adding host node1 to group kube-node
DEBUG: adding host node2 to group kube-node
DEBUG: adding host node3 to group kube-node
The inventory/mycluster/hosts.yaml file is generated automatically.
Edit the inventory/mycluster/hosts.yaml file:
# Run on the master node
[root@master kubespray]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.43.211
      ip: 192.168.43.211
      access_ip: 192.168.43.211
    slave1:
      ansible_host: 192.168.43.212
      ip: 192.168.43.212
      access_ip: 192.168.43.212
    slave2:
      ansible_host: 192.168.43.213
      ip: 192.168.43.213
      access_ip: 192.168.43.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Modify the configuration file to use the offline packages and images:
# Run on the master node
[root@master kubespray]# vim inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
#
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
## note: vault is removed
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
kube_apiserver_node_port_range: "1-65535"
kube_apiserver_node_port_range_sysctl: false
download_run_once: true
download_localhost: true
download_force_cache: true
download_cache_dir: /opt/kubespray_offline/kubespray_cache # modified
preinstall_cache_rpm: true
docker_cache_rpm: true
download_rpm_localhost: "{{ download_cache_dir }}/rpm" # modified
tmp_cache_dir: /tmp/k8s_cache # modified
tmp_preinstall_rpm: "{{ tmp_cache_dir }}/rpm/preinstall" # modified
tmp_docker_rpm: "{{ tmp_cache_dir }}/rpm/docker" # modified
image_is_cached: true
nodelocaldns_dire_coredns: true
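Before starting the playbook, a quick optional sanity check verifies that the directory referenced by download_cache_dir exists and contains the cached content (the expected layout comes from the offline package and may vary):
# Run on the master node (optional check)
$ ls /opt/kubespray_offline/kubespray_cache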
Start the Kubernetes deployment:
# Run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
......
PLAY RECAP ****************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=762 changed=178 unreachable=0 failed=0 skipped=1060 rescued=0 ignored=1
slave1 : ok=648 changed=159 unreachable=0 failed=0 skipped=918 rescued=0 ignored=0
slave2 : ok=462 changed=104 unreachable=0 failed=0 skipped=584 rescued=0 ignored=0
Sunday 18 June 2023 12:36:42 +0800 (0:00:00.059) 0:13:16.912 ***********
===============================================================================
kubernetes/master : kubeadm | Initialize first master ------------------------------------ 136.25s
kubernetes/master : Joining control plane node to the cluster. --------------------------- 110.63s
kubernetes/kubeadm : Join to cluster ------------------------------------------------------ 37.66s
container-engine/docker : Install packages docker with local rpm|Install RPM -------------- 29.70s
download_container | Load image into docker ----------------------------------------------- 11.72s
reload etcd ------------------------------------------------------------------------------- 10.62s
Gen_certs | Write etcd master certs -------------------------------------------------------- 9.13s
Gen_certs | Write etcd master certs -------------------------------------------------------- 8.84s
kubernetes/master : Master | wait for kube-scheduler --------------------------------------- 8.03s
download_container | Load image into docker ------------------------------------------------ 7.07s
download_container | Upload image to node if it is cached ---------------------------------- 6.72s
download_container | Load image into docker ------------------------------------------------ 6.69s
download_container | Load image into docker ------------------------------------------------ 6.36s
kubernetes/preinstall : Install packages requirements with local rpm|Install RPM ----------- 6.00s
wait for etcd up --------------------------------------------------------------------------- 5.76s
download_file | Copy file from cache to nodes, if it is available -------------------------- 5.64s
download_container | Load image into docker ------------------------------------------------ 5.57s
network_plugin/calico : Wait for calico kubeconfig to be created --------------------------- 5.37s
Configure | Check if etcd cluster is healthy ----------------------------------------------- 5.25s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources -------------------------------- 5.16s
The deployment takes roughly half an hour and requires no manual intervention. Once it completes, check the cluster and pod status:
# Run on the master node
[root@master kubespray]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready master 5m46s v1.19.10 192.168.43.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.14
slave1 Ready master 3m50s v1.19.10 192.168.43.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.14
slave2 Ready <none> 2m49s v1.19.10 192.168.43.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master kubespray]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7fbf9b4bbb-nw7j5 1/1 Running 0 4m24s
calico-node-8bhct 1/1 Running 0 4m45s
calico-node-rbkls 1/1 Running 0 4m45s
calico-node-svphr 1/1 Running 0 4m45s
coredns-7677f9bb54-j8xx5 1/1 Running 0 3m58s
coredns-7677f9bb54-tzzpp 1/1 Running 0 4m2s
dns-autoscaler-5b7b5c9b6f-mx9dv 1/1 Running 0 4m
k8dash-77959656b-vsqfq 1/1 Running 0 3m55s
kube-apiserver-master 1/1 Running 0 7m47s
kube-apiserver-slave1 1/1 Running 0 5m58s
kube-controller-manager-master 1/1 Running 0 7m47s
kube-controller-manager-slave1 1/1 Running 0 5m58s
kube-proxy-ktvmd 1/1 Running 0 4m56s
kube-proxy-rcnhc 1/1 Running 0 4m56s
kube-proxy-slc7z 1/1 Running 0 4m56s
kube-scheduler-master 1/1 Running 0 7m47s
kube-scheduler-slave1 1/1 Running 0 5m58s
kubernetes-dashboard-758979f44b-xfw8x 1/1 Running 0 3m57s
kubernetes-metrics-scraper-678c97765c-k7z5c 1/1 Running 0 3m56s
metrics-server-8676bf5f99-nkrjr 1/1 Running 0 3m39s
nginx-proxy-slave2 1/1 Running 0 4m57s
nodelocaldns-bxww2 1/1 Running 0 3m58s
nodelocaldns-j2hvc 1/1 Running 0 3m58s
nodelocaldns-p2nx8 1/1 Running 0 3m58s
Verify the cluster:
# Run on the master node
# Install nginx
# Create an nginx deployment
[root@master ~]# kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
# Run on the master node
# Expose the deployment via a NodePort
[root@master ~]# kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
# Run on the master node
[root@master ~]# kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-6799fc88d8-4t4mv 1/1 Running 0 72s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 11m
service/nginx NodePort 10.233.31.130 <none> 80:22013/TCP 53s
# Run on the master node
# Send curl requests
[root@master ~]# curl http://192.168.43.211:22013/
[root@master ~]# curl http://192.168.43.212:22013/
[root@master ~]# curl http://192.168.43.213:22013/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
This confirms that the cluster is working properly.
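The test resources can then be removed; a small optional cleanup:
# Run on the master node (optional cleanup)
[root@master ~]# kubectl delete service nginx
[root@master ~]# kubectl delete deployment nginx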
Tear down the cluster:
# Run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
......
PLAY RECAP ****************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=31 changed=17 unreachable=0 failed=0 skipped=24 rescued=0 ignored=0
slave1 : ok=30 changed=17 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
slave2 : ok=30 changed=17 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
Sunday 18 June 2023 12:46:14 +0800 (0:00:01.135) 0:00:49.596 ***********
===============================================================================
Gather necessary facts (hardware) --------------------------------------------------------- 21.52s
reset | delete some files and directories ------------------------------------------------- 10.41s
reset | unmount kubelet dirs --------------------------------------------------------------- 1.73s
reset | remove all containers -------------------------------------------------------------- 1.63s
reset | remove services -------------------------------------------------------------------- 1.63s
reset | Restart network -------------------------------------------------------------------- 1.14s
download | Download files / images --------------------------------------------------------- 1.02s
reset : flush iptables --------------------------------------------------------------------- 0.89s
reset | stop services ---------------------------------------------------------------------- 0.80s
reset | restart docker if needed ----------------------------------------------------------- 0.77s
reset | remove docker dropins -------------------------------------------------------------- 0.76s
reset | remove remaining routes set by bird ------------------------------------------------ 0.57s
reset | stop etcd services ----------------------------------------------------------------- 0.53s
Gather minimal facts ----------------------------------------------------------------------- 0.48s
Gather necessary facts (network) ----------------------------------------------------------- 0.46s
reset | remove dns settings from dhclient.conf --------------------------------------------- 0.44s
reset | remove etcd services --------------------------------------------------------------- 0.41s
reset | systemctl daemon-reload ------------------------------------------------------------ 0.41s
reset | check if crictl is present --------------------------------------------------------- 0.30s
reset | Remove kube-ipvs0 ------------------------------------------------------------------ 0.25s
At this point, the offline Kubernetes cluster deployment is complete.