Detailed k8s deployment steps: kubeadm and binary
1. Deploying with kubeadm
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
VM environment:
Machine IP | Role | OS |
---|---|---|
192.168.17.140 | master | Ubuntu 18.04.5 LTS amd64 |
192.168.17.141 | node | Ubuntu 18.04.5 LTS amd64 |
192.168.17.142 | node | Ubuntu 18.04.5 LTS amd64 |
Versions:
k8s: v1.23.5
docker: 20.10.14
kuboard: v3
Installation steps:
- Install the VMs
Create a new Ubuntu VM; after installation, make two copies of the VM files. When powering on a copied VM, choose "I copied it" at the prompt; VMware automatically generates a new IP and MAC for the copy, so no manual configuration is needed.
- Preparation (on all nodes)
Change the hostname:
To change the hostname permanently, edit /etc/hostname; it takes effect after a reboot.
sudo vi /etc/hostname
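Alternatively, a non-interactive equivalent (a sketch assuming systemd's hostnamectl, which ships with Ubuntu 18.04; the hostname k8s-master is just an example):
# takes effect immediately and also rewrites /etc/hostname
sudo hostnamectl set-hostname k8s-master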
After booting, install the SSH server:
sudo apt-get install openssh-server
sudo service sshd status
Install Docker: https://docs.docker.com/engine/install/ubuntu/
Configure Docker to start on boot:
sudo systemctl enable docker
Update the Docker config to set the cgroup driver to systemd:
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://vmjo8s1n.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
Run sudo docker info and confirm it reports Cgroup Driver: systemd.
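A quick way to check just that field (standard docker info output piped through grep):
sudo docker info | grep -i "cgroup driver"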
Configure passwordless sudo:
sudo visudo
Append a line of the form <username> ALL=NOPASSWD:ALL, e.g.:
lijian ALL=NOPASSWD:ALL
Disable swap:
sudo vi /etc/fstab
Comment out the line /swapfile none swap sw 0 0,
then reboot. Verify swap is off; no output means it is disabled:
sudo swapon --show
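To turn swap off immediately without rebooting (the fstab edit above keeps it off across reboots):
sudo swapoff -a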
Configure the br_netfilter module:
sudo modprobe br_netfilter
lsmod | grep br_netfilter
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
Install other required packages:
sudo apt install ebtables ethtool
sudo apt-get install socat
sudo apt-get install conntrack
- Install the deployment tools kubeadm, kubelet, and kubectl (on all nodes)
- kubeadm is installed on the master and used to bootstrap and start the cluster
- kubelet is deployed on all nodes to maintain each node
- kubectl is the command-line tool for operating the cluster; installing it on the master is usually enough
Install the CNI plugins, which provide pod network connectivity:
CNI_VERSION="v0.8.2"
ARCH="amd64"
sudo mkdir -p /opt/cni/bin
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-${ARCH}-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz
Install crictl, which kubeadm and kubelet use to operate the container runtime interface (CRI):
DOWNLOAD_DIR=/usr/local/bin
sudo mkdir -p $DOWNLOAD_DIR
CRICTL_VERSION="v1.22.0"
ARCH="amd64"
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${ARCH}.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz
Install kubeadm, kubelet, and kubectl, and register kubelet as a systemd service:
RELEASE="$(curl -sSL https://dl.k8s.io/release/stable.txt)"
ARCH="amd64"
cd $DOWNLOAD_DIR
sudo curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet,kubectl}
sudo chmod +x {kubeadm,kubelet,kubectl}
RELEASE_VERSION="v0.4.0"
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
sudo mkdir -p /etc/systemd/system/kubelet.service.d
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
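A quick sanity check that all three binaries landed on the PATH:
kubeadm version
kubectl version --client
kubelet --version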
Start kubelet:
systemctl enable --now kubelet
At this point kubelet restarts every few seconds, sitting in a crash loop while it waits for kubeadm to tell it what to do; this is expected.
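You can watch the crash loop in the kubelet logs if you like:
journalctl -u kubelet -f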
- Deploy the master (on the master node)
To speed up initialization (and to get around the firewall), prepare the images locally in advance. List the required images:
kubeadm config images list
Download them from the Aliyun mirror registry by running images_pull.sh (adjust the script to the version you are actually deploying); the images must be present on all nodes.
lijian@vm2:~$ cat images_pull.sh
#! /bin/bash
images=(
    kube-apiserver:v1.23.5
    kube-controller-manager:v1.23.5
    kube-scheduler:v1.23.5
    kube-proxy:v1.23.5
    pause:3.6
    etcd:3.5.1-0
    coredns:v1.8.6
)
for imageName in ${images[@]} ; do
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName k8s.gcr.io/$imageName
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
docker tag k8s.gcr.io/coredns:v1.8.6 k8s.gcr.io/coredns/coredns:v1.8.6
Initialize the master node:
kubeadm init
kubeadm init first runs preflight checks on the environment; if the requirements are not met it aborts, so fix the reported problems and run init again. Clean up before re-running kubeadm init:
kubeadm reset
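Given the flannel network installed below, a typical init invocation for this setup would be something like the following sketch (--pod-network-cidr matters for flannel, as the troubleshooting section explains; the advertise address shown is this guide's master IP and is optional):
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.17.140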
After successful initialization, output like the following is shown:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a Pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  /docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join <control-plane-host>:<control-plane-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>
Save the kubeadm join … line at the bottom; it is needed later to add nodes.
If you use kubectl as a non-root user, run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
If you are root:
export KUBECONFIG=/etc/kubernetes/admin.conf
Install a pod network add-on:
Any one of the following options works:
https://kubernetes.io/docs/concepts/cluster-administration/networking/#how-to-implement-the-kubernetes-networking-model
I used flannel:
https://github.com/flannel-io/flannel#flannel
Run:
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
Check whether the network add-on installed successfully:
kubectl get pods --all-namespaces
If the CoreDNS pods are in the Running state, the add-on works and you can move on to adding nodes; otherwise investigate before continuing.
- Join the nodes (on each node machine)
On each node, run the kubeadm join command printed by kubeadm init:
kubeadm join --token <token> <control-plane-host>:<control-plane-port> --discovery-token-ca-cert-hash sha256:<hash>
The output looks like:
[preflight] Running pre-flight checks

... (log output of join workflow) ...

Node join complete:
* Certificate signing request sent to control-plane and response received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on control-plane to see this machine join.
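If the token from the init output has expired (tokens are valid for 24 hours by default), generate a fresh join command on the master:
kubeadm token create --print-join-command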
- Install Kuboard
https://kuboard.cn/install/v3/install-in-k8s.html
kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
# or
kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml
kubectl get pods -n kuboard
Once ready, open http://your-node-ip-address:30080; the username is admin and the password is Kuboard123.
- Troubleshooting
flannel goes into CrashLoopBackOff with the error:
Error registering network: failed to acquire lease: node "k8s-master-1" pod cidr not assigned
Causes / fixes:
1. kubeadm init was run without the --pod-network-cidr 10.244.0.0/16 flag.
Note: when installing flannel with kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml, if "Network": "10.244.0.0/16" in the yml differs from the --pod-network-cidr passed to init, make them match; otherwise cluster IPs may be unreachable between nodes.
2. kube-controller-manager did not allocate an IP range to the newly joined node.
Edit /etc/kubernetes/manifests/kube-controller-manager.yaml on the master and add these two flags:
--allocate-node-cidrs=true
--cluster-cidr=10.244.0.0/16
The cluster-cidr must match the address in kube-flannel.yml and the clusterCIDR in the kube-proxy config:
- command:
- kube-controller-manager
- --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- --bind-address=127.0.0.1
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
- --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
- --controllers=*,bootstrapsigner,tokencleaner
- --kubeconfig=/etc/kubernetes/controller-manager.conf
- --leader-elect=true
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
- --root-ca-file=/etc/kubernetes/pki/ca.crt
- --service-account-private-key-file=/etc/kubernetes/pki/sa.key
- --use-service-account-credentials=true
- --allocate-node-cidrs=true
- --cluster-cidr=10.244.0.0/16
If you have not restored a snapshot (i.e. not run kubeadm reset), use method 2, then run kubectl delete pod -n kube-system kube-flannel-* to delete the three failing flannel pods; new flannel pods are recreated automatically.
If you did restore a snapshot, simply add --pod-network-cidr=10.244.0.0/16 to kubeadm init.
Add the nodes:
kubeadm join 192.168.17.140:6443 --token 0ctfq1.6v9bkoxg4u8muhqx \
--discovery-token-ca-cert-hash sha256:8e300dc48b25249b72aa365470aacb91ebec5ea240d992ef4a8242fb4a47a9b4
After adding a node, it stays NotReady:
container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Fix: create the CNI network config files by copying a set over from the master node:
sudo mkdir -p /run/flannel/
sudo scp lijian@vm2:/run/flannel/subnet.env /run/flannel/subnet.env
sudo mkdir -p /etc/cni/net.d
sudo scp lijian@vm2:/etc/cni/net.d/10-flannel.conflist /etc/cni/net.d/
For other issues, see: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/
2. Binary deployment
References (from primary to secondary):
https://www.kubernetes.org.cn/4963.html
controller-manager and scheduler authentication: https://blog.csdn.net/wangshui898/article/details/120132028
coredns: https://blog.csdn.net/weixin_45444133/article/details/116405713
Download links:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#server-binaries
https://github.com/etcd-io/etcd/releases
https://github.com/flannel-io/flannel/releases
https://docs.docker.com/engine/install/ubuntu/
https://github.com/coredns/deployment
Directory layout:
/opt/k8s/ : installation packages
/opt/k8s/*-cert : certificate config files
/k8s/etcd : etcd deployment directory (bin: binaries, cfg: configs, ssl: certificates)
/k8s/kubernetes : k8s component deployment directory (bin: binaries, cfg: configs, ssl: certificates)
Environment:
Role | IP | OS |
---|---|---|
master | 192.168.17.143 | Ubuntu 18.04.5 LTS amd64 |
node1 | 192.168.17.144 | Ubuntu 18.04.5 LTS amd64 |
node2 | 192.168.17.145 | Ubuntu 18.04.5 LTS amd64 |
The etcd cluster is deployed across all three nodes.
The master node runs the control-plane components plus kubelet, kube-proxy, and flannel.
The worker nodes run kubelet, kube-proxy, and flannel.
--cluster-cidr=10.244.0.0/16
is used for the Network field in the flannel config, the --cluster-cidr flag of kube-controller-manager, and the clusterCIDR field in the kube-proxy config.
Operating system: Ubuntu 18.04
Versions:
software | version |
---|---|
k8s | 1.23.5 |
etcd | 3.5.3 |
flannel | 0.17.0 |
docker | 20.10.14 |
coredns | 1.14.0 |
1. Initialize the environment
- Install the VMs
Create a new Ubuntu VM; after installation, make two copies of the VM files. When powering on a copied VM, choose "I copied it" at the prompt; VMware automatically generates a new IP and MAC for the copy, so no manual configuration is needed.
- Preparation (on all nodes)
Change the hostname:
To change the hostname permanently, edit /etc/hostname; it takes effect after a reboot.
sudo vi /etc/hostname
After booting, install the SSH server:
sudo apt-get install openssh-server
sudo service sshd status
Disable swap:
sudo vi /etc/fstab
Comment out the line /swapfile none swap sw 0 0,
then reboot. Verify swap is off; no output means it is disabled:
sudo swapon --show
Set the kernel parameters Docker needs:
cat << EOF | tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
Install Docker: https://docs.docker.com/engine/install/ubuntu/
Configure Docker to start on boot:
sudo systemctl enable docker
Update the Docker config to set the cgroup driver to systemd:
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://vmjo8s1n.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
Run sudo docker info and confirm it reports Cgroup Driver: systemd.
Configure passwordless sudo for the regular user:
sudo visudo
Append a line of the form <username> ALL=NOPASSWD:ALL, e.g.:
lijian ALL=NOPASSWD:ALL
Create the installation directories:
mkdir /k8s/etcd/{bin,cfg,ssl} -p
mkdir /k8s/kubernetes/{bin,cfg,ssl} -p
Install and configure cfssl:
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
Create the etcd certificate signing config:
lijian@vma:/opt/k8s/etcd-cert$ cat ca-config.json
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "www": {
        "expiry": "87600h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
Create the etcd CA CSR file:
lijian@vma:/opt/k8s/etcd-cert$ cat ca-csr.json
{
  "CN": "etcd CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou"
    }
  ]
}
Create the etcd server certificate CSR:
lijian@vma:/opt/k8s/etcd-cert$ cat server-csr.json
{
  "CN": "etcd",
  "hosts": [
    "192.168.17.143",
    "192.168.17.144",
    "192.168.17.145"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou"
    }
  ]
}
Generate the etcd CA certificate and key, then sign the server certificate:
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
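Optionally inspect the generated server certificate to confirm the hosts (SANs) and expiry look right:
cfssl-certinfo -cert server.pem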
Copy the etcd certificates into /k8s/etcd/ssl/:
lijian@vma:/opt/k8s/kubernetes-cert$ ls -lrt /k8s/etcd/ssl/
total 16
-rw-rw-r-- 1 lijian lijian 1334 4月 14 17:04 server.pem
-rw------- 1 lijian lijian 1675 4月 14 17:04 server-key.pem
-rw-rw-r-- 1 lijian lijian 1261 4月 14 17:04 ca.pem
-rw------- 1 lijian lijian 1675 4月 14 17:04 ca-key.pem
Create the Kubernetes CA:
lijian@vma:/opt/k8s/kubernetes-cert$ cat ca-config.json
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "expiry": "87600h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
lijian@vma:/opt/k8s/kubernetes-cert$ cat ca-csr.json
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
Generate the API server certificate:
lijian@vma:/opt/k8s/kubernetes-cert$ cat server-csr.json
{
  "CN": "kubernetes",
  "hosts": [
    "10.0.0.1",
    "127.0.0.1",
    "192.168.17.143",
    "192.168.17.144",
    "192.168.17.145",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
Create the Kubernetes kube-proxy certificate:
lijian@vma:/opt/k8s/kubernetes-cert$ cat kube-proxy-csr.json
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
Copy the k8s certificates into /k8s/kubernetes/ssl/:
lijian@vma:/opt/k8s/kubernetes-cert$ ls -lrt /k8s/kubernetes/ssl/
total 64
-rw-rw-r-- 1 lijian lijian 1354 4月 15 16:19 ca.pem
-rw------- 1 lijian lijian 1675 4月 15 16:19 ca-key.pem
-rw-rw-r-- 1 lijian lijian 1623 4月 18 14:11 server.pem
-rw------- 1 lijian lijian 1675 4月 18 14:11 server-key.pem
-rw-rw-r-- 1 lijian lijian 1395 4月 18 17:20 kube-proxy.pem
-rw------- 1 lijian lijian 1679 4月 18 17:20 kube-proxy-key.pem
Tip: view a systemd service's logs with:
journalctl -u {servicename} [-xe -f]
2. Deploy etcd
https://github.com/etcd-io/etcd/releases/download/v3.5.3/etcd-v3.5.3-linux-amd64.tar.gz
Extract the installation archive:
tar -xvf etcd-v3.5.3-linux-amd64.tar.gz
cd etcd-v3.5.3-linux-amd64/
cp etcd etcdctl /k8s/etcd/bin/
cp etcdctl /usr/local/bin
Configuration file:
lijian@vma:/opt/k8s/etcd-v3.5.3-linux-amd64$ cat /k8s/etcd/cfg/etcd
#[Member]
ETCD_NAME="etcd01"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.17.143:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.17.143:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.17.143:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.17.143:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.17.143:2380,etcd02=https://192.168.17.144:2380,etcd03=https://192.168.17.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Create the etcd systemd unit file:
lijian@vma:/opt/k8s/etcd-v3.5.3-linux-amd64$ cat /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/k8s/etcd/cfg/etcd
ExecStart=/k8s/etcd/bin/etcd --enable-v2 \
--cert-file=/k8s/etcd/ssl/server.pem \
--key-file=/k8s/etcd/ssl/server-key.pem \
--peer-cert-file=/k8s/etcd/ssl/server.pem \
--peer-key-file=/k8s/etcd/ssl/server-key.pem \
--trusted-ca-file=/k8s/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/k8s/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Start the etcd service:
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd
Copy the unit file and configuration to node 1 and node 2:
scp -r /k8s/etcd/ ......
scp /usr/lib/systemd/system/etcd.service ......
Edit node 1's config file:
lijian@k8s-node1:~$ cat /k8s/etcd/cfg/etcd
#[Member]
ETCD_NAME="etcd02"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.17.144:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.17.144:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.17.144:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.17.144:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.17.143:2380,etcd02=https://192.168.17.144:2380,etcd03=https://192.168.17.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Edit node 2's config file:
lijian@k8s-node2:~$ cat /k8s/etcd/cfg/etcd
#[Member]
ETCD_NAME="etcd03"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.17.145:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.17.145:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.17.145:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.17.145:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.17.143:2380,etcd02=https://192.168.17.144:2380,etcd03=https://192.168.17.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Start etcd on node 1 and node 2.
Verify the cluster is healthy:
etcdctl --cacert=/k8s/etcd/ssl/ca.pem --cert=/k8s/etcd/ssl/server.pem --key=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" endpoint health
etcdctl --cacert=/k8s/etcd/ssl/ca.pem --cert=/k8s/etcd/ssl/server.pem --key=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" endpoint status -w table
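Listing the members confirms all three nodes joined (same TLS flags as above):
etcdctl --cacert=/k8s/etcd/ssl/ca.pem --cert=/k8s/etcd/ssl/server.pem --key=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" member list -w table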
3. Deploy flannel
Download: https://github.com/flannel-io/flannel/releases
Write the cluster pod network configuration into etcd:
- This version of flanneld does not support etcd v3, so the config key and subnet data are written via the etcd v2 API; writing with v3 causes: Couldn't fetch network config: 100: Key not found (/coreos.com) [11]
- The pod network ${CLUSTER_CIDR} must be a /16 range and must match kube-controller-manager's --cluster-cidr value.
1. Enable the etcd v2 API (already done if you followed this guide); otherwise add the following flag to the etcd start command and restart the etcd cluster:
--enable-v2
2. Create the flannel network config via the etcd v2 API:
ETCDCTL_API=2 etcdctl --ca-file=/k8s/etcd/ssl/ca.pem --cert-file=/k8s/etcd/ssl/server.pem --key-file=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" set /coreos.com/network/config '{ "Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}'
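Read the key back to confirm the write (same v2 flags):
ETCDCTL_API=2 etcdctl --ca-file=/k8s/etcd/ssl/ca.pem --cert-file=/k8s/etcd/ssl/server.pem --key-file=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" get /coreos.com/network/config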
Extract and install:
tar -xvf flannel-v0.17.0-linux-amd64.tar.gz
mv flanneld mk-docker-opts.sh /k8s/kubernetes/bin/
Configure flannel:
lijian@vma:/opt/k8s$ cat /k8s/kubernetes/cfg/flanneld
FLANNEL_OPTIONS="--etcd-endpoints=https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379 -etcd-cafile=/k8s/etcd/ssl/ca.pem -etcd-certfile=/k8s/etcd/ssl/server.pem -etcd-keyfile=/k8s/etcd/ssl/server-key.pem -iface=ens33"
- flanneld talks to other nodes over the interface of the system default route; on hosts with multiple interfaces (e.g. internal and public), use -iface to pick the communication interface, as with ens33 above.
Create the flanneld systemd unit file:
lijian@vma:/opt/k8s$ cat /usr/lib/systemd/system/flanneld.service
[Unit]
Description=Flanneld overlay address etcd agent
After=network-online.target network.target
Before=docker.service
[Service]
Type=notify
EnvironmentFile=/k8s/kubernetes/cfg/flanneld
ExecStart=/k8s/kubernetes/bin/flanneld --ip-masq $FLANNEL_OPTIONS
ExecStartPost=/k8s/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure
[Install]
WantedBy=multi-user.target
- The mk-docker-opts.sh script writes the pod subnet assigned to flanneld into /run/flannel/docker and also generates /run/flannel/subnet.env; when Docker starts later, it uses the environment variables in /run/flannel/docker to configure the docker0 bridge.
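For reference, /run/flannel/subnet.env typically looks like this after flanneld starts (the subnet values below are illustrative and differ per node):
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.62.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true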
Start the flannel service:
systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld
Configure Docker to start on the assigned subnet. Note the path is /lib/systemd/system/docker.service: add an EnvironmentFile entry and append the $DOCKER_NETWORK_OPTIONS parameter to ExecStart:
lijian@vma:/opt/k8s$ cat /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
EnvironmentFile=/run/flannel/docker
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
Restart the Docker service:
systemctl daemon-reload
systemctl restart docker
Check that flannel.1 and docker0 are on the same subnet; if they match, all is well:
lijian@vma:/opt/k8s$ ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
link/ether 00:0c:29:ea:4e:08 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.143/24 brd 192.168.17.255 scope global dynamic noprefixroute ens33
valid_lft 1413sec preferred_lft 1413sec
inet6 fe80::679:d74e:a380:8475/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 26:5f:27:9a:9e:f5 brd ff:ff:ff:ff:ff:ff
inet 10.244.62.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::245f:27ff:fe9a:9ef5/64 scope link
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:45:72:92:ac brd ff:ff:ff:ff:ff:ff
inet 10.244.62.1/24 brd 10.244.62.255 scope global docker0
valid_lft forever preferred_lft forever
Copy the flanneld systemd unit files to all nodes:
scp -r /k8s/kubernetes ......
scp /usr/lib/systemd/system/flanneld.service ......
scp /lib/systemd/system/docker.service ......
Start flannel and restart Docker on all nodes, and check the subnets match.
Problem: the docker0 and flannel.1 subnets differ, and the docker0 configuration has no effect.
Fix: the Docker unit file lives at /lib/systemd/system/docker.service, without the /usr prefix; I had been editing the wrong file.
Tear down flanneld and docker0:
sudo systemctl stop flanneld
ip addr s flannel.1
sudo ifconfig flannel.1 down
ip addr s flannel.1
sudo ip link del flannel.1
ip addr s flannel.1
Delete docker0:
brctl delbr docker0
Trying to delete the vxlan device created by flannel the same way fails with:
[root@host131 ~]# brctl delbr flannel.1
can't delete bridge flannel.1: Operation not permitted
If these commands are missing, just install them.
When resetting flannel, remember to delete its state from etcd: everything under /coreos.com/network/subnets
ETCDCTL_API=2 etcdctl --ca-file=/k8s/etcd/ssl/ca.pem --cert-file=/k8s/etcd/ssl/server.pem --key-file=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" ls /coreos.com/network/subnets
4. Deploy the master
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#server-binaries
The kubernetes master node runs the following components:
- kube-apiserver
- kube-scheduler
- kube-controller-manager
kube-scheduler and kube-controller-manager can run in cluster mode: leader election picks one working process while the other processes stay blocked.
Deploy kube-apiserver
Extract the binaries and copy them to the master node:
tar -xvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes/server/bin/
cp kube-scheduler kube-apiserver kube-controller-manager kubectl /k8s/kubernetes/bin/
Copy the certificates:
cp /opt/k8s/kubernetes-cert/*pem /k8s/kubernetes/ssl/
Create the TLS bootstrapping token:
# head -c 16 /dev/urandom | od -An -t x | tr -d ' '
2366a641f656a0a025abb4aabda4511b
vim /k8s/kubernetes/cfg/token.csv
2366a641f656a0a025abb4aabda4511b,kubelet-bootstrap,10001,"system:kubelet-bootstrap"
Create the apiserver config file:
lijian@vma:/opt/k8s/kubernetes-cert$ cat /k8s/kubernetes/cfg/kube-apiserver
KUBE_APISERVER_OPTS="--logtostderr=true \
--v=4 \
--etcd-servers=https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379 \
--bind-address=192.168.17.143 \
--insecure-bind-address=0.0.0.0 \
--secure-port=6443 \
--advertise-address=192.168.17.143 \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/24 \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--authorization-mode=Node,RBAC \
--enable-bootstrap-token-auth \
--token-auth-file=/k8s/kubernetes/cfg/token.csv \
--service-node-port-range=30000-50000 \
--tls-cert-file=/k8s/kubernetes/ssl/server.pem \
--tls-private-key-file=/k8s/kubernetes/ssl/server-key.pem \
--client-ca-file=/k8s/kubernetes/ssl/ca.pem \
--service-account-key-file=/k8s/kubernetes/ssl/ca.pub \
--service-account-signing-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \
--etcd-cafile=/k8s/etcd/ssl/ca.pem \
--etcd-certfile=/k8s/etcd/ssl/server.pem \
--etcd-keyfile=/k8s/etcd/ssl/server-key.pem"
Create the kube-apiserver systemd unit file:
lijian@vma:/opt/k8s/kubernetes-cert$ cat /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-apiserver
ExecStart=/k8s/kubernetes/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start the service:
systemctl daemon-reload
systemctl enable kube-apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver
Check that the apiserver is running:
ps -ef |grep kube-apiserver
Error: Error: [service-account-issuer is a required flag
Fix: generate ca.pub and set the corresponding apiserver flags:
openssl x509 -in ca.pem -pubkey -noout > ca.pub
--service-account-key-file=/k8s/kubernetes/ssl/ca.pub \
--service-account-signing-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \
Deploy kube-scheduler
Create the certificate:
lijian@vma:~$ cat /opt/k8s/kubernetes-cert/kube-scheduler-csr.json
{
"CN": "system:kube-scheduler",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "suzhou",
"ST": "suzhou",
"O": "system:masters",
"OU": "System"
}
]
}
Generate the certificate and copy it to /k8s/kubernetes/ssl/:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
-rw-rw-r-- 1 lijian lijian 1415 4月 18 10:46 kube-scheduler.pem
-rw------- 1 lijian lijian 1675 4月 18 10:46 kube-scheduler-key.pem
Generate the kubeconfig file by running kube-scheduler_env.sh:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-scheduler_env.sh
KUBE_CONFIG="/k8s/kubernetes/cfg/kube-scheduler.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-scheduler \
--client-certificate=/k8s/kubernetes/ssl/kube-scheduler.pem \
--client-key=/k8s/kubernetes/ssl/kube-scheduler-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-scheduler \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
Create the conf file:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-scheduler
KUBE_SCHEDULER_OPTS="--logtostderr=true --v=4 \
--kubeconfig=/k8s/kubernetes/cfg/kube-scheduler.kubeconfig \
--bind-address=127.0.0.1 \
--leader-elect"
- --kubeconfig: the kubeconfig used to connect to the apiserver
- --leader-elect: automatic leader election when multiple instances run (HA)
Create the systemd unit file:
lijian@vma:~$ cat /usr/lib/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-scheduler
ExecStart=/k8s/kubernetes/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start kube-scheduler:
systemctl daemon-reload
systemctl restart kube-scheduler
systemctl enable kube-scheduler
Check that kube-scheduler is running:
ps -ef |grep kube-scheduler
Deploy kube-controller-manager
Generate the certificate:
lijian@vma:~$ cat /opt/k8s/kubernetes-cert/kube-controller-manager-csr.json
{
"CN": "system:kube-controller-manager",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "suzhou",
"ST": "suzhou",
"O": "system:masters",
"OU": "System"
}
]
}
# generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
# copy to /k8s/kubernetes/ssl/
-rw-rw-r-- 1 lijian lijian 1432 4月 18 10:24 kube-controller-manager.pem
-rw------- 1 lijian lijian 1675 4月 18 10:24 kube-controller-manager-key.pem
Generate the kubeconfig file by running kube-controller_env.sh:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-controller_env.sh
KUBE_CONFIG="/k8s/kubernetes/cfg/kube-controller-manager.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-controller-manager \
--client-certificate=/k8s/kubernetes/ssl/kube-controller-manager.pem \
--client-key=/k8s/kubernetes/ssl/kube-controller-manager-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-controller-manager \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
Create the conf file:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-controller-manager
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=true \
--v=4 \
--leader-elect=true \
--kubeconfig=/k8s/kubernetes/cfg/kube-controller-manager.kubeconfig \
--bind-address=127.0.0.1 \
--service-cluster-ip-range=10.0.0.0/24 \
--allocate-node-cidrs=true \
--cluster-name=kubernetes \
--cluster-cidr=10.244.0.0/16 \
--cluster-signing-cert-file=/k8s/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--root-ca-file=/k8s/kubernetes/ssl/ca.pem \
--service-account-private-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--cluster-signing-duration=87600h0m0s"
- --kubeconfig: the kubeconfig used to connect to the apiserver
- --leader-elect: automatic leader election when multiple instances run (HA)
- --cluster-signing-cert-file / --cluster-signing-key-file: the CA used to automatically issue kubelet certificates; must match the apiserver's
Create the systemd unit file:
lijian@vma:~$ cat /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-controller-manager
ExecStart=/k8s/kubernetes/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start kube-controller-manager:
systemctl daemon-reload
systemctl restart kube-controller-manager
systemctl enable kube-controller-manager
systemctl status kube-controller-manager
Add the binaries directory /k8s/kubernetes/bin to the PATH variable:
vim /etc/profile
PATH=/k8s/kubernetes/bin:$PATH:$HOME/bin
source /etc/profile
Set up the kubectl command-line tool
1. Create admin-csr.json:
lijian@vma:/opt/k8s/kubernetes-cert$ cat admin-csr.json
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "suzhou",
"L": "suzhou",
"O": "system:masters",
"OU": "System"
}
]
}
2. Generate the certificate:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
3. Set the cluster parameters:
kubectl config set-cluster kubernetes \
--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=https://192.168.17.143:6443
4. Set the client authentication parameters:
kubectl config set-credentials admin \
--client-certificate=/k8s/kubernetes/ssl/admin.pem \
--embed-certs=true \
--client-key=/k8s/kubernetes/ssl/admin-key.pem
5. Set the context parameters:
kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=admin
6. Set the default context:
kubectl config use-context kubernetes
kubectl get cs
Check the master cluster status:
lijian@vma:~$ kubectl get cs,nodes
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
componentstatus/etcd-1 Healthy {"health":"true","reason":""}
componentstatus/etcd-0 Healthy {"health":"true","reason":""}
componentstatus/etcd-2 Healthy {"health":"true","reason":""}
componentstatus/scheduler Healthy ok
componentstatus/controller-manager Healthy ok
NAME STATUS ROLES AGE VERSION
node/192.168.17.143 Ready,SchedulingDisabled master 4d5h v1.23.5
node/192.168.17.144 Ready node 3d22h v1.23.5
node/192.168.17.145 Ready node 3d22h v1.23.5
5. Deploy the worker nodes
The kubernetes worker nodes run the following components:
- docker (deployed earlier)
- kubelet
- kube-proxy
Copy the kubelet and kube-proxy binaries to the nodes:
scp /k8s/kubernetes/bin/kubelet /k8s/kubernetes/bin/kube-proxy ...
Deploy kubelet
Create the yml parameter config file:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kubelet-config.yml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 192.168.17.143
port: 10250
readOnlyPort: 10255
cgroupDriver: systemd
clusterDNS: ["10.0.0.2"]
clusterDomain: cluster.local.
failSwapOn: false
authentication:
anonymous:
enabled: true
Create the bootstrap.kubeconfig file
- the kubeconfig kubelet uses the first time it joins the cluster
Create it by running the following script:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kubelet_env.sh
KUBE_CONFIG="/k8s/kubernetes/cfg/bootstrap.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
TOKEN=`cat /k8s/kubernetes/cfg/token.csv|awk -F',' '{print $1}'` # must match token.csv
# generate the kubelet bootstrap kubeconfig file
kubectl config set-cluster kubernetes \
--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kubelet-bootstrap" \
--token=${TOKEN} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user="kubelet-bootstrap" \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
Create the conf file:
lijian@vma:~$ cat /k8s/kubernetes/cfg/kubelet.config
KUBELET_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--hostname-override=192.168.17.143 \
--kubeconfig=/k8s/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/k8s/kubernetes/cfg/bootstrap.kubeconfig \
--config=/k8s/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/k8s/kubernetes/ssl \
--cgroup-driver=systemd \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"
- --hostname-override: display name, the node's hostname; must be unique within the cluster
- --kubeconfig: generated automatically; used afterwards to connect to the apiserver
- --bootstrap-kubeconfig: used on first start to request a certificate from the apiserver
- --config: parameter config file
- --cert-dir: directory where kubelet certificates are generated
- --pod-infra-container-image: image for the pod infrastructure (pause) container
Create the systemd unit file:
lijian@vma:~$ cat /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service
[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kubelet.config
ExecStart=/k8s/kubernetes/bin/kubelet $KUBELET_OPTS
Restart=on-failure
KillMode=process
[Install]
WantedBy=multi-user.target
Start kubelet:
systemctl daemon-reload
systemctl restart kubelet
systemctl enable kubelet
systemctl status kubelet
After a successful start, several certificate files are generated automatically under /k8s/kubernetes/ssl/:
-rw------- 1 root root 1224 4月 18 10:39 kubelet-client-2022-04-18-10-39-24.pem
lrwxrwxrwx 1 root root 58 4月 18 10:39 kubelet-client-current.pem -> /k8s/kubernetes/ssl/kubelet-client-2022-04-18-10-39-24.pem
-rw-r--r-- 1 root root 2279 4月 16 21:30 kubelet.crt
-rw------- 1 root root 1675 4月 16 21:30 kubelet.key
kubelet.kubeconfig is generated automatically under /k8s/kubernetes/cfg:
-rw------- 1 root root 2293 4月 18 10:39 kubelet.kubeconfig
Sync the kubelet config to the remaining nodes
Copy kubelet.config, kubelet-config.yml, bootstrap.kubeconfig, and kubelet.service to all nodes, changing the hostname-override parameter in kubelet.config to each node's own address.
Node 1's config file:
lijian@k8s-node1:~$ cat /k8s/kubernetes/cfg/kubelet.config
KUBELET_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--hostname-override=192.168.17.144 \
--kubeconfig=/k8s/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/k8s/kubernetes/cfg/bootstrap.kubeconfig \
--config=/k8s/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/k8s/kubernetes/ssl \
--cgroup-driver=systemd \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"
Node 2's config file:
lijian@k8s-node2:/k8s/etcd/bin$ cat /k8s/kubernetes/cfg/kubelet.config
KUBELET_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--hostname-override=192.168.17.145 \
--kubeconfig=/k8s/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/k8s/kubernetes/cfg/bootstrap.kubeconfig \
--config=/k8s/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/k8s/kubernetes/ssl \
--cgroup-driver=systemd \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"
Start kubelet on the remaining nodes.
If kubelet fails to start with "cannot create certificate signing request":
on the master, authorize the kubelet-bootstrap user to request certificates:
lijian@vma:/k8s/kubernetes/cfg$ cat kubelet-bootstrap-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: create-csrs-for-bootstrapping
subjects:
- kind: Group
name: system:bootstrappers
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:node-bootstrapper
apiGroup: rbac.authorization.k8s.io
# run:
kubectl apply -f /opt/k8s/yaml/kubelet-bootstrap-rbac.yaml
Restart kubelet, then check the certificate requests on the master; they should be Pending:
kubectl get csr
Approve the kubelet certificate requests to admit the nodes:
for csr in `kubectl get csr |awk 'NR>1 {print $1}'`;do kubectl certificate approve $csr;done
Check the requests again; they should now be Approved,Issued.
Check the node status:
lijian@vma:/k8s/kubernetes/cfg$ kubectl get node
NAME STATUS ROLES AGE VERSION
192.168.17.143 Ready,SchedulingDisabled master 4d5h v1.23.5
192.168.17.144 Ready node 3d23h v1.23.5
192.168.17.145 Ready node 3d23h v1.23.5
Problem: controller-manager and scheduler fail to connect to 127.0.0.1:8080, so the CSRs are never issued and kubectl get nodes finds no nodes.
Cause: no credentials were configured for these two components.
Fix: generate certificates from kube-controller-manager-csr.json and kube-scheduler-csr.json, use kubectl to generate kube-scheduler.kubeconfig and kube-controller-manager.kubeconfig, and wire them into the services.
Deploy kube-proxy
Create the config file kube-proxy-config.yml:
lijian@vma:/k8s/kubernetes/cfg$ cat kube-proxy-config.yml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
iptables:
masqueradeAll: true
masqueradeBit: null
minSyncPeriod: 0s
syncPeriod: 0s
ipvs:
masqueradeAll: true
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: "rr"
strictARP: false
syncPeriod: 0s
tcpFinTimeout: 0s
tcpTimeout: 0s
udpTimeout: 0s
mode: "ipvs"
clientConnection:
kubeconfig: /k8s/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 192.168.17.143
clusterCIDR: 10.244.0.0/16
- bindAddress: listen address;
- clientConnection.kubeconfig: kubeconfig for connecting to the apiserver;
- clusterCIDR: kube-proxy uses --cluster-cidr to distinguish in-cluster from external traffic; SNAT for requests to Service IPs is only applied when --cluster-cidr or --masquerade-all is set;
- hostnameOverride: must match kubelet's value, otherwise kube-proxy will not find this Node after starting and will not create any ipvs rules;
- mode: use ipvs mode;
Generate kube-proxy.kubeconfig by running kube-proxy_env.sh (the kube-proxy certificate was created earlier):
lijian@vma:/k8s/kubernetes/cfg$ cat kube-proxy_env.sh
KUBE_CONFIG="/k8s/kubernetes/cfg/kube-proxy.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
# generate the kube-proxy kubeconfig file
kubectl config set-cluster kubernetes \
--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kube-proxy" \
--client-certificate=/k8s/kubernetes/ssl/kube-proxy.pem \
--client-key=/k8s/kubernetes/ssl/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user="kube-proxy" \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
Create the conf file:
lijian@vma:/k8s/kubernetes/cfg$ cat kube-proxy.conf
KUBE_PROXY_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--config=/k8s/kubernetes/cfg/kube-proxy-config.yml"
Create the systemd unit file:
lijian@vma:/k8s/kubernetes/cfg$ cat /usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-proxy.conf
ExecStart=/k8s/kubernetes/bin/kube-proxy $KUBE_PROXY_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start kube-proxy:
systemctl daemon-reload
systemctl restart kube-proxy
systemctl enable kube-proxy
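To confirm ipvs mode took effect, list the virtual servers (assumes the ipvsadm tool is installed, e.g. sudo apt-get install ipvsadm):
sudo ipvsadm -Ln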
Sync the kube-proxy config to the remaining nodes
Copy kube-proxy.conf, kube-proxy-config.yml, kube-proxy.kubeconfig, and kube-proxy.service to all nodes, changing the hostnameOverride parameter in kube-proxy-config.yml to each node's own address.
Node 1's config:
lijian@k8s-node1:~$ cat /k8s/kubernetes/cfg/kube-proxy-config.yml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
iptables:
masqueradeAll: true
masqueradeBit: null
minSyncPeriod: 0s
syncPeriod: 0s
ipvs:
masqueradeAll: true
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: "rr"
strictARP: false
syncPeriod: 0s
tcpFinTimeout: 0s
tcpTimeout: 0s
udpTimeout: 0s
mode: "ipvs"
clientConnection:
kubeconfig: /k8s/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 192.168.17.144
clusterCIDR: 10.244.0.0/16
Node 2's config:
lijian@k8s-node2:/k8s/etcd/bin$ cat /k8s/kubernetes/cfg/kube-proxy-config.yml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
iptables:
masqueradeAll: true
masqueradeBit: null
minSyncPeriod: 0s
syncPeriod: 0s
ipvs:
masqueradeAll: true
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: "rr"
strictARP: false
syncPeriod: 0s
tcpFinTimeout: 0s
tcpTimeout: 0s
udpTimeout: 0s
mode: "ipvs"
clientConnection:
kubeconfig: /k8s/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 192.168.17.145
clusterCIDR: 10.244.0.0/16
Start kube-proxy on the remaining nodes.
Label the master and worker nodes:
kubectl label node 192.168.17.143 node-role.kubernetes.io/master='master'
kubectl label node 192.168.17.144 node-role.kubernetes.io/node='node'
kubectl label node 192.168.17.145 node-role.kubernetes.io/node='node'
lijian@vma:/k8s/kubernetes/cfg$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.17.143 Ready,SchedulingDisabled master 4d5h v1.23.5
192.168.17.144 Ready node 3d23h v1.23.5
192.168.17.145 Ready node 3d23h v1.23.5
6. Deploy coredns (so that pods can resolve and reach external addresses)
git clone https://github.com/coredns/deployment.git
cd /home/yaml/coredns/deployment/kubernetes/
# -i specifies the cluster DNS IP; the preconfigured value is in kubelet-config.yml
./deploy.sh -i 10.0.0.2 > coredns.yaml
# apply it
kubectl apply -f coredns.yaml
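A quick way to verify cluster DNS works end to end is a one-off lookup from inside a pod (a sketch; busybox:1.28 is pinned because nslookup is broken in newer busybox images):
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default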
Problem:
[FATAL] plugin/loop: Loop (127.0.0.1:33855 -> :53) detected for zone ".",
Fix:
If [FATAL] plugin/loop: Loop (127.0.0.1:55751 -> :53) detected for zone "." appears,
the server's /etc/resolv.conf contains a 127.0.* nameserver, which CoreDNS cannot use.
Change the DNS setting, then restart the CoreDNS pods:
# edit the node's resolver config
cat /etc/resolv.conf
nameserver 114.114.114.114
Delete the pods; they will be recreated automatically.
Problem: kubelet on a node fails to start with: bind: cannot assign requested address
Fix: misconfiguration; set --hostname-override in kubelet.config to the node's own IP rather than the master's, and set address: 0.0.0.0 in kubelet-config.yml.
Also add each node's IP to the hosts list in server-csr.json; it is unclear whether this was the root cause.
Delete the generated kubelet certificates and kubelet.kubeconfig, then restart kubelet.service.
Problem: Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=proxy)
Fix: kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
Problem: when deploying kuboard, kuboard-agent sits in CrashLoopBackOff with the error: Could not resolve host: kuboard-v3
Cause: the flannel and docker0 subnets differ. Fix: set flannel's Network to 10.244.0.0/16, and set the same value for controller-manager's --cluster-cidr and kube-proxy's clusterCIDR.
Problem: Error creating: pods "metrics-server-86597c44b6-79prc" is forbidden: SecurityContext.RunAsUser is forbidden
Cause: an admission control plugin restriction.
Fix: remove SecurityContextDeny from --enable-admission-plugins in the kube-apiserver config and restart the apiserver.
7. Install Kuboard (same as in the kubeadm section above)
Omitted.
TIPS:
Expand a VM disk: install gparted and resize through the GUI:
sudo apt-get install gparted
Corrupted VM system files:
1. Reboot Ubuntu and hold Shift to enter the grub menu, or wait for it to appear.
2. Highlight recovery mode with the arrow keys and press "e" to enter the edit screen.
3. On the linux /boot ... line, change ro recovery nomodeset to rw single init=/bin/bash.
4. Press Ctrl+X or F10 to boot into single-user mode as root; you can now edit files. Reboot when done.
Appendix: building from source
To avoid typing sudo every time, add your user to the docker group (you must log out and back in afterwards):
sudo usermod -a -G docker ${USER}
Install cfssl:
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
chmod +x cfssl_linux-amd64
cp cfssl_linux-amd64 /usr/local/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssljson_linux-amd64
cp cfssljson_linux-amd64 /usr/local/bin/cfssljson
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl-certinfo_linux-amd64
cp cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
Build script:
# build kubelet
KUBE_BUILD_PLATFORMS=linux/amd64 make all WHAT=cmd/kubelet GOFLAGS=-v GOGCFLAGS="-N -l"
# build all
KUBE_BUILD_PLATFORMS=linux/amd64 make all GOFLAGS=-v GOGCFLAGS="-N -l"
# build api-server
KUBE_BUILD_PLATFORMS=linux/amd64 make all WHAT=cmd/kube-apiserver GOFLAGS=-v GOGCFLAGS="-N -l"
Run script:
lijian@vma:/app/web/gopath/src/k8s.io/kubernetes-1.23.5$ cat run.sh
# 1. install etcd
export PATH="/app/web/gopath/src/k8s.io/kubernetes-1.23.5/third_party/etcd:${PATH}"
# 2. run k8s locally (skip compile)
./hack/local-up-cluster.sh -O
# 3. compile and run k8s locally
PATH=$PATH KUBERNETES_PROVIDER=local hack/local-up-cluster.sh
# 4. get k8s nodes
# ./cluster/kubectl.sh get nodes