Deploying a Kubernetes Cluster with Kubespray
1. Introduction to Kubespray
Kubespray is an open-source project for deploying production-grade Kubernetes clusters. It uses Ansible as its deployment tool.
- Can deploy to AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (experimental), or bare metal
- Highly available clusters
- Composable components (for example, choice of network plugin)
- Supports the most popular Linux distributions
- Continuous-integration tested
Official site: https://kubespray.io
Project repository: https://github.com/kubernetes-sigs/kubespray
2. Online Deployment
China's network environment makes using kubespray particularly difficult: some images must be pulled from gcr.io and some binaries downloaded from GitHub, so it helps to download them in advance and import the images.
Note: a highly available deployment requires 3 etcd nodes, so an HA cluster needs at least 3 nodes.
kubespray needs a deployment node, which can be any node of the cluster. Here we install kubespray on the first master node (192.168.54.211) and run all subsequent steps there.
2.1 Environment Preparation
1. Server planning
| ip | hostname |
|---|---|
| 192.168.54.211 | master |
| 192.168.54.212 | slave1 |
| 192.168.54.213 | slave2 |
2. Set the hostnames
# Run the corresponding command on each of the three hosts
$ hostnamectl set-hostname master
$ hostnamectl set-hostname slave1
$ hostnamectl set-hostname slave2
# Check the current hostname
$ hostname
3. Map IPs to hostnames
# Set on all three hosts
$ cat >> /etc/hosts << EOF
192.168.54.211 master
192.168.54.212 slave1
192.168.54.213 slave2
EOF
2.2 Download kubespray
# Run on the master node
# Download an official release
$ wget https://github.com/kubernetes-sigs/kubespray/archive/v2.16.0.tar.gz
$ tar -zxvf v2.16.0.tar.gz
# Or clone the repository directly
$ git clone https://github.com/kubernetes-sigs/kubespray.git -b v2.16.0 --depth=1
2.3 Install Dependencies
# Run on the master node
$ cd kubespray-2.16.0/
$ yum install -y epel-release python3-pip
$ pip3 install -r requirements.txt
If an error occurs:
# Error 1
Complete output from command python setup.py egg_info:
=============================DEBUG ASSISTANCE==========================
If you are seeing an error here please try the following to
successfully install cryptography:
Upgrade to the latest pip and try again. This will fix errors for most
users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
=============================DEBUG ASSISTANCE==========================
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-3w9d_1bk/cryptography/setup.py", line 17, in <module>
from setuptools_rust import RustExtension
ModuleNotFoundError: No module named 'setuptools_rust'
----------------------------------------
# Fix
$ pip3 install --upgrade cryptography==3.2
# Error 2
Exception: command 'gcc' failed with exit status 1
# Fix
# python2
$ yum install gcc libffi-devel python-devel openssl-devel -y
# python3
$ yum install gcc libffi-devel python3-devel openssl-devel -y
2.4 Update the Ansible Inventory
Check the Ansible version:
[root@master kubespray-2.16.0]# ansible --version
ansible 2.9.20
config file = /root/kubespray-2.16.0/ansible.cfg
configured module search path = ['/root/kubespray-2.16.0/library']
ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.6.8 (default, Nov 16 2020, 16:55:22) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
Update the Ansible inventory file; IPS should contain the internal IPs of the 3 instances:
# Run on the master node
[root@master kubespray-2.16.0]# cp -rfp inventory/sample inventory/mycluster
[root@master kubespray-2.16.0]# declare -a IPS=( 192.168.54.211 192.168.54.212 192.168.54.213)
[root@master kubespray-2.16.0]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube_control_plane
DEBUG: Adding group kube_node
DEBUG: Adding group etcd
DEBUG: Adding group k8s_cluster
DEBUG: Adding group calico_rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube_control_plane
DEBUG: adding host node2 to group kube_control_plane
DEBUG: adding host node1 to group kube_node
DEBUG: adding host node2 to group kube_node
DEBUG: adding host node3 to group kube_node
2.5 Modify the Node Information
Review the auto-generated hosts.yaml; kubespray plans node roles automatically based on the number of nodes provided. Here 2 nodes serve as masters, all 3 nodes serve as worker nodes, and all 3 nodes also run etcd.
Edit the inventory/mycluster/hosts.yaml file:
# Run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    slave1:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    slave2:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
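As an optional sanity check (not part of the original procedure), Ansible can render the inventory so you can confirm the group membership before running any playbook:
# Run on the master node (optional check)
$ ansible-inventory -i inventory/mycluster/hosts.yaml --graph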
2.6 Global Variables (defaults are fine)
[root@master kubespray-2.16.0]# cat inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
2.7 Modify the Cluster Configuration
The default Kubernetes version is relatively old; specify the version explicitly:
# Run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.20.7
For any other customization, edit the inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml file.
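If you are unsure which versions a given kubespray release supports, a rough way to check is to grep the role defaults; the paths below are an assumption based on the usual repository layout and may differ between releases:
# Run on the master node (optional; paths may vary by release)
$ grep -rn "^kube_version:" roles/kubespray-defaults/defaults/
$ grep -n "v1.20." roles/download/defaults/main.yml | head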
2.8 Cluster Add-ons
Add-ons such as the Kubernetes Dashboard and ingress controllers are configured in the following file:
$ vim inventory/mycluster/group_vars/k8s_cluster/addons.yml
This file is left unchanged here.
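For reference, enabling common add-ons in that file usually amounts to flipping a few booleans; the snippet below is illustrative only and was not applied in this walkthrough (confirm the exact variable names in your copy of addons.yml):
# Illustrative only - not applied in this walkthrough
dashboard_enabled: true
metrics_server_enabled: true
ingress_nginx_enabled: true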
2.9 Passwordless SSH
Configure passwordless SSH so that the kubespray (Ansible) node can reach all nodes without a password.
# Run on the master node
ssh-keygen
ssh-copy-id 192.168.54.211
ssh-copy-id 192.168.54.212
ssh-copy-id 192.168.54.213
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
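Once the keys are distributed, a quick optional check confirms that Ansible can reach every node in the inventory:
# Run on the master node (optional check)
$ ansible -i inventory/mycluster/hosts.yaml all -m ping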
2.10 Change the Image and Download Sources
# Run on the master node
[root@master kubespray-2.16.0]# cat > inventory/mycluster/group_vars/k8s_cluster/vars.yml << EOF
gcr_image_repo: "registry.aliyuncs.com/google_containers"
kube_image_repo: "registry.aliyuncs.com/google_containers"
etcd_download_url: "https://ghproxy.com/https://github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
cni_download_url: "https://ghproxy.com/https://github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
calicoctl_download_url: "https://ghproxy.com/https://github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
calico_crds_download_url: "https://ghproxy.com/https://github.com/projectcalico/calico/archive/{{ calico_version }}.tar.gz"
crictl_download_url: "https://ghproxy.com/https://github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
nodelocaldns_image_repo: "cncamp/k8s-dns-node-cache"
dnsautoscaler_image_repo: "cncamp/cluster-proportional-autoscaler-amd64"
EOF
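Before launching the playbook, it can be worth confirming that the mirror registry and the GitHub proxy are reachable from the deployment node. A minimal sketch; the CNI plugin version below is taken from the v2.16.0 cache listing shown later in this article and may differ for other releases:
# Run on the master node (optional reachability check)
$ curl -sI https://registry.aliyuncs.com/v2/ | head -n 1
$ curl -sIL https://ghproxy.com/https://github.com/containernetworking/plugins/releases/download/v0.9.1/cni-plugins-linux-amd64-v0.9.1.tgz | head -n 1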
2.11 Install the Cluster
Run the kubespray playbook to install the cluster:
# Run on the master node
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Many binaries and images are downloaded during installation.
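Because the run can take a long time, it can help to increase verbosity and keep a log. This is an optional variant of the same command, not part of the original steps:
# Run on the master node (optional)
$ ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml -v 2>&1 | tee cluster-install.log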
Output like the following indicates a successful run:
PLAY RECAP *************************************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=584 changed=109 unreachable=0 failed=0 skipped=1160 rescued=0 ignored=1
slave1 : ok=520 changed=97 unreachable=0 failed=0 skipped=1008 rescued=0 ignored=0
slave2 : ok=438 changed=76 unreachable=0 failed=0 skipped=678 rescued=0 ignored=0
Saturday 31 December 2022 20:07:57 +0800 (0:00:00.060) 0:59:12.196 *****
===============================================================================
container-engine/docker : ensure docker packages are installed ----------------------------------------------- 2180.79s
kubernetes/preinstall : Install packages requirements --------------------------------------------------------- 487.24s
download_file | Download item ---------------------------------------------------------------------------------- 58.95s
download_file | Download item ---------------------------------------------------------------------------------- 50.40s
download_container | Download image if required ---------------------------------------------------------------- 44.25s
download_file | Download item ---------------------------------------------------------------------------------- 42.65s
download_container | Download image if required ---------------------------------------------------------------- 38.06s
download_container | Download image if required ---------------------------------------------------------------- 32.38s
kubernetes/kubeadm : Join to cluster --------------------------------------------------------------------------- 32.29s
download_container | Download image if required ---------------------------------------------------------------- 30.67s
download_file | Download item ---------------------------------------------------------------------------------- 25.82s
kubernetes/control-plane : Joining control plane node to the cluster. ------------------------------------------ 25.60s
download_container | Download image if required ---------------------------------------------------------------- 25.34s
download_container | Download image if required ---------------------------------------------------------------- 22.49s
kubernetes/control-plane : kubeadm | Initialize first master --------------------------------------------------- 20.90s
download_container | Download image if required ---------------------------------------------------------------- 20.14s
download_file | Download item ---------------------------------------------------------------------------------- 19.50s
download_container | Download image if required ---------------------------------------------------------------- 17.84s
download_container | Download image if required ---------------------------------------------------------------- 13.96s
download_container | Download image if required ---------------------------------------------------------------- 13.31s
2.12 Inspect the Cluster
# Run on the master node
[root@master ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane,master 10m v1.20.7 192.168.54.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave1 Ready control-plane,master 9m38s v1.20.7 192.168.54.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave2 Ready <none> 8m40s v1.20.7 192.168.54.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master ~]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7c5b64bf96-wtmxn 1/1 Running 0 8m41s
calico-node-c6rr6 1/1 Running 0 9m6s
calico-node-l59fj 1/1 Running 0 9m6s
calico-node-n9tg6 1/1 Running 0 9m6s
coredns-f944c7f7c-n2wzp 1/1 Running 0 8m26s
coredns-f944c7f7c-x2tfl 1/1 Running 0 8m22s
dns-autoscaler-557bfb974d-6cbtk 1/1 Running 0 8m24s
kube-apiserver-master 1/1 Running 0 10m
kube-apiserver-slave1 1/1 Running 0 10m
kube-controller-manager-master 1/1 Running 0 10m
kube-controller-manager-slave1 1/1 Running 0 10m
kube-proxy-czk9s 1/1 Running 0 9m17s
kube-proxy-gwfc8 1/1 Running 0 9m17s
kube-proxy-tkxlf 1/1 Running 0 9m17s
kube-scheduler-master 1/1 Running 0 10m
kube-scheduler-slave1 1/1 Running 0 10m
nginx-proxy-slave2 1/1 Running 0 9m18s
nodelocaldns-4vd75 1/1 Running 0 8m23s
nodelocaldns-cr5gg 1/1 Running 0 8m23s
nodelocaldns-pmgqx 1/1 Running 0 8m23s
2.13 Inspect the Installed Images
# Run on the master node
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# Run on the slave1 node
[root@slave1 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# Run on the slave2 node
[root@slave2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
Export the images for offline use:
# Run on the master node
docker save -o kube-proxy.tar registry.aliyuncs.com/google_containers/kube-proxy:v1.20.7
docker save -o kube-controller-manager.tar registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.7
docker save -o kube-apiserver.tar registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.7
docker save -o kube-scheduler.tar registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.7
docker save -o nginx.tar nginx:1.19
docker save -o node.tar quay.io/calico/node:v3.17.4
docker save -o cni.tar quay.io/calico/cni:v3.17.4
docker save -o kube-controllers.tar quay.io/calico/kube-controllers:v3.17.4
docker save -o k8s-dns-node-cache.tar cncamp/k8s-dns-node-cache:1.17.1
docker save -o etcd.tar quay.io/coreos/etcd:v3.4.13
docker save -o cluster-proportional-autoscaler-amd64.tar cncamp/cluster-proportional-autoscaler-amd64:1.8.3
docker save -o coredns.tar registry.aliyuncs.com/google_containers/coredns:1.7.0
docker save -o pause_3.3.tar registry.aliyuncs.com/google_containers/pause:3.3
docker save -o pause_3.2.tar registry.aliyuncs.com/google_containers/pause:3.2
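On an offline target, the exported archives can be loaded back into Docker. A minimal sketch, assuming the tar files were copied into a directory named images/ (the path is illustrative):
# Run on each offline node after copying the tar files over
$ for f in images/*.tar; do docker load -i "$f"; done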
View the generated files:
# Run on the master node
[root@master ~]# tree Kubespray-2.16.0/
Kubespray-2.16.0/
├── calicoctl
├── cni-plugins-linux-amd64-v0.9.1.tgz
├── images
│ ├── cluster-proportional-autoscaler-amd64.tar
│ ├── cni.tar
│ ├── coredns.tar
│ ├── etcd.tar
│ ├── k8s-dns-node-cache.tar
│ ├── kube-apiserver.tar
│ ├── kube-controller-manager.tar
│ ├── kube-controllers.tar
│ ├── kube-proxy.tar
│ ├── kube-scheduler.tar
│ ├── nginx.tar
│ ├── node.tar
│ ├── pause_3.2.tar
│ └── pause_3.3.tar
├── kubeadm-v1.20.7-amd64
├── kubectl-v1.20.7-amd64
├── kubelet-v1.20.7-amd64
└── rpm
├── docker
│ ├── audit-libs-python-2.8.5-4.el7.x86_64.rpm
│ ├── b001-libsemanage-python-2.5-14.el7.x86_64.rpm
│ ├── b002-setools-libs-3.3.8-4.el7.x86_64.rpm
│ ├── b003-libcgroup-0.41-21.el7.x86_64.rpm
│ ├── b0041-checkpolicy-2.5-8.el7.x86_64.rpm
│ ├── b004-python-IPy-0.75-6.el7.noarch.rpm
│ ├── b005-policycoreutils-python-2.5-34.el7.x86_64.rpm
│ ├── b006-container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm
│ ├── b007-containerd.io-1.3.9-3.1.el7.x86_64.rpm
│ ├── d001-docker-ce-cli-19.03.14-3.el7.x86_64.rpm
│ ├── d002-docker-ce-19.03.14-3.el7.x86_64.rpm
│ └── d003-libseccomp-2.3.1-4.el7.x86_64.rpm
└── preinstall
├── a001-libseccomp-2.3.1-4.el7.x86_64.rpm
├── bash-completion-2.1-8.el7.noarch.rpm
├── chrony-3.4-1.el7.x86_64.rpm
├── e2fsprogs-1.42.9-19.el7.x86_64.rpm
├── ebtables-2.0.10-16.el7.x86_64.rpm
├── ipset-7.1-1.el7.x86_64.rpm
├── ipvsadm-1.27-8.el7.x86_64.rpm
├── rsync-3.1.2-10.el7.x86_64.rpm
├── socat-1.7.3.2-2.el7.x86_64.rpm
├── unzip-6.0-22.el7_9.x86_64.rpm
├── wget-1.14-18.el7_6.1.x86_64.rpm
└── xfsprogs-4.5.0-22.el7.x86_64.rpm
4 directories, 43 files
2.14 Tear Down the Cluster
To remove the cluster:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
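reset.yml normally prompts for confirmation before wiping the nodes; for a non-interactive run, the confirmation can usually be passed as an extra variable (an assumption; verify that your release honors reset_confirmation before relying on it):
# Optional non-interactive variant (assumes reset_confirmation is supported by your release)
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml -e reset_confirmation=yes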
2.15 Add Nodes
1. Add the new node information to inventory/mycluster/hosts.yaml (an illustrative snippet follows the command below)
2. Run the following command:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root scale.yml -v -b --private-key=~/.ssh/id_rsa
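For step 1, the new node entry in hosts.yaml might look like the following; the host name slave3 and the IP 192.168.54.214 are purely illustrative, and the group names should match those already used in your hosts.yaml:
all:
  hosts:
    slave3:                        # hypothetical new worker node
      ansible_host: 192.168.54.214
      ip: 192.168.54.214
      access_ip: 192.168.54.214
  children:
    kube-node:
      hosts:
        slave3: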
2.16 Remove Nodes
There is no need to modify hosts.yaml; run the following command directly:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root remove-node.yml -v -b --extra-vars "node=slave1"
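According to the kubespray documentation, several nodes can typically be removed in one run by passing a comma-separated list to the node variable; nodename1 and nodename2 below are placeholders, so confirm the behavior for your release:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root remove-node.yml -e "node=nodename1,nodename2"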
2.17 Upgrade the Cluster
[root@master kubespray-2.16.0]# ansible-playbook upgrade-cluster.yml -b -i inventory/mycluster/hosts.yaml -e kube_version=v1.22.0
3. Offline Deployment
Online deployment can fail because of network issues, so the cluster can also be deployed offline.
Below is an offline deployment example found online.
The kubespray GitHub repository is: https://github.com/kubernetes-sigs/kubespray
The release-2.15 branch is used here; the main component and OS versions are as follows:
- kubernetes v1.19.10
- docker v19.03
- calico v3.16.9
- centos 7.9.2009
The kubespray offline package can be downloaded from:
https://www.mediafire.com/file/nyifoimng9i6zp5/kubespray_offline.tar.gz/file
After downloading, extract the offline package into the /opt directory:
# Run on the master node
[root@master opt]# tar -zxvf /opt/kubespray_offline.tar.gz -C /opt/
List the files:
# Run on the master node
[root@master opt]# ll /opt/kubespray_offline
total 4
drwxr-xr-x. 4 root root 28 Jul 11 2021 ansible_install
drwxr-xr-x. 15 root root 4096 Jul 8 2021 kubespray
drwxr-xr-x. 4 root root 240 Jul 9 2021 kubespray_cache
The IP addresses of the three machines are 192.168.43.211, 192.168.43.212, and 192.168.43.213.
Set up the Ansible server:
# Run on the master node
[root@master opt]# yum install /opt/kubespray_offline/ansible_install/rpm/*
[root@master opt]# pip3 install /opt/kubespray_offline/ansible_install/pip/*
Configure passwordless SSH login between the hosts:
# Run on the master node
[root@master ~]# ssh-keygen
[root@master ~]# ssh-copy-id 192.168.43.211
[root@master ~]# ssh-copy-id 192.168.43.212
[root@master ~]# ssh-copy-id 192.168.43.213
[root@master ~]# ssh-copy-id master
[root@master ~]# ssh-copy-id slave1
[root@master ~]# ssh-copy-id slave2
Configure the Ansible host groups:
# Run on the master node
[root@master opt]# cd /opt/kubespray_offline/kubespray
[root@master kubespray]# declare -a IPS=(192.168.43.211 192.168.43.212 192.168.43.213)
[root@master kubespray]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3.6 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube-master
DEBUG: Adding group kube-node
DEBUG: Adding group etcd
DEBUG: Adding group k8s-cluster
DEBUG: Adding group calico-rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube-master
DEBUG: adding host node2 to group kube-master
DEBUG: adding host node1 to group kube-node
DEBUG: adding host node2 to group kube-node
DEBUG: adding host node3 to group kube-node
The inventory/mycluster/hosts.yaml file is generated automatically.
Edit the inventory/mycluster/hosts.yaml file:
# Run on the master node
[root@master kubespray]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.43.211
      ip: 192.168.43.211
      access_ip: 192.168.43.211
    slave1:
      ansible_host: 192.168.43.212
      ip: 192.168.43.212
      access_ip: 192.168.43.212
    slave2:
      ansible_host: 192.168.43.213
      ip: 192.168.43.213
      access_ip: 192.168.43.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Modify the configuration file to use the offline packages and images:
# Run on the master node
[root@master kubespray]# vim inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
#
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
## note: vault is removed
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
kube_apiserver_node_port_range: "1-65535"
kube_apiserver_node_port_range_sysctl: false
download_run_once: true
download_localhost: true
download_force_cache: true
download_cache_dir: /opt/kubespray_offline/kubespray_cache # modified
preinstall_cache_rpm: true
docker_cache_rpm: true
download_rpm_localhost: "{{ download_cache_dir }}/rpm" # modified
tmp_cache_dir: /tmp/k8s_cache # modified
tmp_preinstall_rpm: "{{ tmp_cache_dir }}/rpm/preinstall" # modified
tmp_docker_rpm: "{{ tmp_cache_dir }}/rpm/docker" # modified
image_is_cached: true
nodelocaldns_dire_coredns: true
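Before starting the playbook, a quick optional sanity check verifies that the directory referenced by download_cache_dir exists and contains the cached content (the expected layout comes from the offline package and may vary):
# Run on the master node (optional check)
$ ls /opt/kubespray_offline/kubespray_cache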
Start the Kubernetes deployment:
# Run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
......
PLAY RECAP ****************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=762 changed=178 unreachable=0 failed=0 skipped=1060 rescued=0 ignored=1
slave1 : ok=648 changed=159 unreachable=0 failed=0 skipped=918 rescued=0 ignored=0
slave2 : ok=462 changed=104 unreachable=0 failed=0 skipped=584 rescued=0 ignored=0
Sunday 18 June 2023 12:36:42 +0800 (0:00:00.059) 0:13:16.912 ***********
===============================================================================
kubernetes/master : kubeadm | Initialize first master ------------------------------------ 136.25s
kubernetes/master : Joining control plane node to the cluster. --------------------------- 110.63s
kubernetes/kubeadm : Join to cluster ------------------------------------------------------ 37.66s
container-engine/docker : Install packages docker with local rpm|Install RPM -------------- 29.70s
download_container | Load image into docker ----------------------------------------------- 11.72s
reload etcd ------------------------------------------------------------------------------- 10.62s
Gen_certs | Write etcd master certs -------------------------------------------------------- 9.13s
Gen_certs | Write etcd master certs -------------------------------------------------------- 8.84s
kubernetes/master : Master | wait for kube-scheduler --------------------------------------- 8.03s
download_container | Load image into docker ------------------------------------------------ 7.07s
download_container | Upload image to node if it is cached ---------------------------------- 6.72s
download_container | Load image into docker ------------------------------------------------ 6.69s
download_container | Load image into docker ------------------------------------------------ 6.36s
kubernetes/preinstall : Install packages requirements with local rpm|Install RPM ----------- 6.00s
wait for etcd up --------------------------------------------------------------------------- 5.76s
download_file | Copy file from cache to nodes, if it is available -------------------------- 5.64s
download_container | Load image into docker ------------------------------------------------ 5.57s
network_plugin/calico : Wait for calico kubeconfig to be created --------------------------- 5.37s
Configure | Check if etcd cluster is healthy ----------------------------------------------- 5.25s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources -------------------------------- 5.16s
The deployment takes roughly half an hour and requires no manual intervention. Once it completes, check the cluster and pod status:
# Run on the master node
[root@master kubespray]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready master 5m46s v1.19.10 192.168.43.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.14
slave1 Ready master 3m50s v1.19.10 192.168.43.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.14
slave2 Ready <none> 2m49s v1.19.10 192.168.43.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master kubespray]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7fbf9b4bbb-nw7j5 1/1 Running 0 4m24s
calico-node-8bhct 1/1 Running 0 4m45s
calico-node-rbkls 1/1 Running 0 4m45s
calico-node-svphr 1/1 Running 0 4m45s
coredns-7677f9bb54-j8xx5 1/1 Running 0 3m58s
coredns-7677f9bb54-tzzpp 1/1 Running 0 4m2s
dns-autoscaler-5b7b5c9b6f-mx9dv 1/1 Running 0 4m
k8dash-77959656b-vsqfq 1/1 Running 0 3m55s
kube-apiserver-master 1/1 Running 0 7m47s
kube-apiserver-slave1 1/1 Running 0 5m58s
kube-controller-manager-master 1/1 Running 0 7m47s
kube-controller-manager-slave1 1/1 Running 0 5m58s
kube-proxy-ktvmd 1/1 Running 0 4m56s
kube-proxy-rcnhc 1/1 Running 0 4m56s
kube-proxy-slc7z 1/1 Running 0 4m56s
kube-scheduler-master 1/1 Running 0 7m47s
kube-scheduler-slave1 1/1 Running 0 5m58s
kubernetes-dashboard-758979f44b-xfw8x 1/1 Running 0 3m57s
kubernetes-metrics-scraper-678c97765c-k7z5c 1/1 Running 0 3m56s
metrics-server-8676bf5f99-nkrjr 1/1 Running 0 3m39s
nginx-proxy-slave2 1/1 Running 0 4m57s
nodelocaldns-bxww2 1/1 Running 0 3m58s
nodelocaldns-j2hvc 1/1 Running 0 3m58s
nodelocaldns-p2nx8 1/1 Running 0 3m58s
Verify the cluster:
# Run on the master node
# Install nginx
# Create an nginx deployment
[root@master ~]# kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
# Run on the master node
# Expose the deployment via a NodePort
[root@master ~]# kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
# Run on the master node
[root@master ~]# kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-6799fc88d8-4t4mv 1/1 Running 0 72s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 11m
service/nginx NodePort 10.233.31.130 <none> 80:22013/TCP 53s
# Run on the master node
# Send curl requests
[root@master ~]# curl http://192.168.43.211:22013/
[root@master ~]# curl http://192.168.43.212:22013/
[root@master ~]# curl http://192.168.43.213:22013/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
This confirms that the cluster is working properly.
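The test resources can then be removed; a small optional cleanup:
# Run on the master node (optional cleanup)
[root@master ~]# kubectl delete service nginx
[root@master ~]# kubectl delete deployment nginx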
Tear down the cluster:
# Run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
......
PLAY RECAP ****************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=31 changed=17 unreachable=0 failed=0 skipped=24 rescued=0 ignored=0
slave1 : ok=30 changed=17 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
slave2 : ok=30 changed=17 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
Sunday 18 June 2023 12:46:14 +0800 (0:00:01.135) 0:00:49.596 ***********
===============================================================================
Gather necessary facts (hardware) --------------------------------------------------------- 21.52s
reset | delete some files and directories ------------------------------------------------- 10.41s
reset | unmount kubelet dirs --------------------------------------------------------------- 1.73s
reset | remove all containers -------------------------------------------------------------- 1.63s
reset | remove services -------------------------------------------------------------------- 1.63s
reset | Restart network -------------------------------------------------------------------- 1.14s
download | Download files / images --------------------------------------------------------- 1.02s
reset : flush iptables --------------------------------------------------------------------- 0.89s
reset | stop services ---------------------------------------------------------------------- 0.80s
reset | restart docker if needed ----------------------------------------------------------- 0.77s
reset | remove docker dropins -------------------------------------------------------------- 0.76s
reset | remove remaining routes set by bird ------------------------------------------------ 0.57s
reset | stop etcd services ----------------------------------------------------------------- 0.53s
Gather minimal facts ----------------------------------------------------------------------- 0.48s
Gather necessary facts (network) ----------------------------------------------------------- 0.46s
reset | remove dns settings from dhclient.conf --------------------------------------------- 0.44s
reset | remove etcd services --------------------------------------------------------------- 0.41s
reset | systemctl daemon-reload ------------------------------------------------------------ 0.41s
reset | check if crictl is present --------------------------------------------------------- 0.30s
reset | Remove kube-ipvs0 ------------------------------------------------------------------ 0.25s
At this point, the offline Kubernetes cluster deployment is complete.