1. Deploying with kubeadm

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Virtual machine environment:

Machine IP        Role     OS
192.168.17.140    master   Ubuntu 18.04.5 LTS amd64
192.168.17.141    node     Ubuntu 18.04.5 LTS amd64
192.168.17.142    node     Ubuntu 18.04.5 LTS amd64

Versions:

k8s: v1.23.5

docker: 20.10.14

kuboard: v3

Installation steps:

  1. Install the virtual machines

    Create one Ubuntu VM. After the installation, copy the VM files twice. When opening a copied VM, choose "I copied it" at the prompt; VMware automatically generates a new IP and MAC for the copy, so no manual configuration is needed.

  2. Preparation (run on all nodes)

    Change the hostname:

    To change the hostname permanently, edit /etc/hostname; it takes effect after a reboot.

    sudo vi /etc/hostname
    

    After booting, install the SSH server

    sudo apt-get install openssh-server
    
    sudo service sshd status
    

    Install Docker: https://docs.docker.com/engine/install/ubuntu/

    Configure Docker to start on boot:

    sudo systemctl enable docker
    

    Modify the Docker configuration to set the cgroup driver to systemd

    sudo tee /etc/docker/daemon.json <<-'EOF'
    {
      "registry-mirrors": ["https://vmjo8s1n.mirror.aliyuncs.com"],
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      },
      "storage-driver": "overlay2"
    }
    EOF
    sudo systemctl daemon-reload
    sudo systemctl restart docker
    

    Run sudo docker info and check that it reports Cgroup Driver: systemd.

    Configure passwordless sudo

    sudo visudo
    

    Append at the end: <username> ALL=NOPASSWD:ALL

    lijian ALL=NOPASSWD:ALL
    

    Disable swap:

    sudo vi /etc/fstab 
    

    Comment out the line /swapfile none swap sw 0 0,

    then reboot. Verify that swap is off; no output means it is disabled:

    sudo swapon --show
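
    A non-interactive sketch of the same step (assuming the swap entry is the /swapfile line mentioned above):

    sudo swapoff -a
    sudo sed -i '/\/swapfile/ s/^/#/' /etc/fstab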
    

    Configure the br_netfilter module

    sudo modprobe br_netfilter
    lsmod | grep br_netfilter
    
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    br_netfilter
    EOF
    
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    EOF
    sudo sysctl --system
    

    Install other required packages:

    sudo apt install ebtables ethtool
    sudo apt-get install socat
    sudo apt-get install conntrack
    
  3. Install the deployment tools kubeadm, kubelet, kubectl (run on all nodes)

    • kubeadm is installed on the master and is used to bootstrap and start the cluster
    • kubelet runs on every node and is responsible for maintaining the node
    • kubectl is the command-line tool for operating the cluster; installing it on the master is usually enough

    Install the CNI plugins, which are used for pod network communication

    CNI_VERSION="v0.8.2"
    ARCH="amd64"
    sudo mkdir -p /opt/cni/bin
    curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-${ARCH}-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz
    

    Install crictl, which kubeadm and kubelet use to talk to the container runtime interface (CRI)

    DOWNLOAD_DIR=/usr/local/bin
    sudo mkdir -p $DOWNLOAD_DIR
    CRICTL_VERSION="v1.22.0"
    ARCH="amd64"
    curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${ARCH}.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz
    

    Install kubeadm, kubelet, and kubectl, and register kubelet as a systemd service

    RELEASE="$(curl -sSL https://dl.k8s.io/release/stable.txt)"
    ARCH="amd64"
    cd $DOWNLOAD_DIR
    sudo curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet,kubectl}
    sudo chmod +x {kubeadm,kubelet,kubectl}
    
    RELEASE_VERSION="v0.4.0"
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
    sudo mkdir -p /etc/systemd/system/kubelet.service.d
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    

    Start kubelet

    systemctl enable --now kubelet
    

    At this point kubelet restarts every few seconds, staying in a crash loop while it waits for kubeadm to tell it what to do. This is normal.
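
    To watch kubelet looping while it waits (optional):

    sudo journalctl -u kubelet -f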

  4. Deploy the master (run on the master node)

    To speed up initialization (and to get around the firewall), prepare the images locally in advance. List the required images:

    kubeadm config images list
    

    Download them from the Aliyun mirror registry by running images_pull.sh (adjust the script to the version actually deployed). The images need to be prepared on all nodes.

    lijian@vm2:~$ cat images_pull.sh 
    #! /bin/bash
    images=( 
        kube-apiserver:v1.23.5
        kube-controller-manager:v1.23.5
        kube-scheduler:v1.23.5
        kube-proxy:v1.23.5
        pause:3.6
        etcd:3.5.1-0
        coredns:v1.8.6
    )
    
    for imageName in ${images[@]} ; do
        docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
        docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName k8s.gcr.io/$imageName
        docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
    done
    docker tag k8s.gcr.io/coredns:v1.8.6 k8s.gcr.io/coredns/coredns:v1.8.6
    

    Initialize the master node:

    kubeadm init
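
    Since flannel with the 10.244.0.0/16 network is installed later (see the troubleshooting in section 7), it can save a round of fixes to pass the pod CIDR here, e.g.:

    sudo kubeadm init --pod-network-cidr=10.244.0.0/16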
    

    kubeadm init first runs preflight checks on the environment; if the requirements are not met it aborts. Fix the reported problems and run init again. Before re-running kubeadm init, clean up the environment:

    kubeadm reset
    

    After a successful initialization, the following is printed:

    Your Kubernetes control-plane has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    You should now deploy a Pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      /docs/concepts/cluster-administration/addons/
    
    You can now join any number of machines by running the following on each node
    as root:
    
      kubeadm join <control-plane-host>:<control-plane-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>
    

    Save the kubeadm join … line at the bottom; it is needed shortly when adding nodes.
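
    If the join command is lost or the token has expired (the default token lifetime is 24 hours), a new one can be printed on the master:

    kubeadm token create --print-join-command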

    If kubectl is used as a non-root user, run:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    

    If running as root:

    export KUBECONFIG=/etc/kubernetes/admin.conf
    

    Install a pod network add-on:

    Any of the options listed here can be used:

    https://kubernetes.io/docs/concepts/cluster-administration/networking/#how-to-implement-the-kubernetes-networking-model

    I use flannel:

    https://github.com/flannel-io/flannel#flannel

    Run:

    kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
    

    Check whether the network add-on was installed successfully

    kubectl get pods --all-namespaces
    

    If the CoreDNS pods are in the Running state, the installation succeeded and nodes can be added; otherwise figure out where the problem is.
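
    For example, to look only at the CoreDNS pods (they carry the standard k8s-app=kube-dns label) and inspect one that is stuck:

    kubectl get pods -n kube-system -l k8s-app=kube-dns
    kubectl -n kube-system describe pod <coredns-pod-name>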

  5. Join the nodes (run on each node machine)

    On each node, run the kubeadm join command printed by kubeadm init

    kubeadm join --token <token> <control-plane-host>:<control-plane-port> --discovery-token-ca-cert-hash sha256:<hash>
    

    The output looks like this:

    [preflight] Running pre-flight checks
    
    ... (log output of join workflow) ...
    
    Node join complete:
    * Certificate signing request sent to control-plane and response
      received.
    * Kubelet informed of new secure connection details.
    
    Run 'kubectl get nodes' on control-plane to see this machine join.
    
  6. Install Kuboard

    https://kuboard.cn/install/v3/install-in-k8s.html

    kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
    # or
    kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml
    
    kubectl get pods -n kuboard
    
    
    

    Once it is ready, open http://your-node-ip-address:30080 and log in with username admin and password Kuboard123

  7. Troubleshooting

    flannel pods go into CrashLoopBackOff with the error:

    Error registering network: failed to acquire lease: node “k8s-master-1” pod cidr not assigned
    

    Cause / solution:

    1. kubeadm init was run without the --pod-network-cidr=10.244.0.0/16 parameter.
    Note: when installing flannel with kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml, if the "Network": "10.244.0.0/16" value in the yml differs from the --pod-network-cidr passed to init, change them to match; otherwise cluster IPs may be unreachable between nodes.

    2. kube-controller-manager did not allocate a pod CIDR to the newly joined node.
    Edit /etc/kubernetes/manifests/kube-controller-manager.yaml on the master machine and add the following two flags:

--allocate-node-cidrs=true
--cluster-cidr=10.244.0.0/16

This cluster-cidr must match the address in kube-flannel.yml and the clusterCIDR in the kube-proxy configuration.

 - command:
  - kube-controller-manager
  - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
  - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
  - --bind-address=127.0.0.1
  - --client-ca-file=/etc/kubernetes/pki/ca.crt
  - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
  - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
  - --controllers=*,bootstrapsigner,tokencleaner
  - --kubeconfig=/etc/kubernetes/controller-manager.conf
  - --leader-elect=true
  - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
  - --root-ca-file=/etc/kubernetes/pki/ca.crt
  - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
  - --use-service-account-credentials=true
  - --allocate-node-cidrs=true
  - --cluster-cidr=10.244.0.0/16

If you have not reset the cluster yet (kubeadm reset), use method 2, then run kubectl delete pod -n kube-system kube-flannel-* to delete the three failing flannel pods; new flannel pods are recreated automatically.

If you did reset, simply add the --pod-network-cidr=10.244.0.0/16 parameter to kubeadm init.

Add the nodes:

kubeadm join 192.168.17.140:6443 --token 0ctfq1.6v9bkoxg4u8muhqx \
	--discovery-token-ca-cert-hash sha256:8e300dc48b25249b72aa365470aacb91ebec5ea240d992ef4a8242fb4a47a9b4

After joining, a node stays NotReady:

container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Solution: create the CNI network configuration files by copying a set over from the master node

sudo mkdir -p /run/flannel/
sudo scp lijian@vm2:/run/flannel/subnet.env /run/flannel/subnet.env
sudo mkdir -p /etc/cni/net.d
sudo scp lijian@vm2:/etc/cni/net.d/10-flannel.conflist  /etc/cni/net.d/
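
After copying the files, restarting kubelet on the node should let it pick up the CNI config (an extra step not shown in the original commands):

sudo systemctl restart kubelet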

For other problems see: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/

2. Deploying from binaries

References (from primary to secondary):

https://www.kubernetes.org.cn/4963.html

controller-manager and scheduler authentication: https://blog.csdn.net/wangshui898/article/details/120132028

coredns: https://blog.csdn.net/weixin_45444133/article/details/116405713

Download links:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#server-binaries

https://github.com/etcd-io/etcd/releases

https://github.com/flannel-io/flannel/releases

https://docs.docker.com/engine/install/ubuntu/

https://github.com/coredns/deployment

Directory layout:

/opt/k8s/            installation packages
/opt/k8s/*-cert      certificate configuration files
/k8s/etcd            etcd deployment directory: bin (binaries), cfg (config files), ssl (certificates)
/k8s/kubernetes      k8s component deployment directory: bin (binaries), cfg (config files), ssl (certificates)

Environment:

Role      IP                OS
master    192.168.17.143    Ubuntu 18.04.5 LTS amd64
node1     192.168.17.144    Ubuntu 18.04.5 LTS amd64
node2     192.168.17.145    Ubuntu 18.04.5 LTS amd64

The etcd cluster runs on all three nodes.

The master node runs the control-plane components plus kubelet, kube-proxy, and flannel.

The worker nodes run kubelet, kube-proxy, and flannel.

--cluster-cidr=10.244.0.0/16

is used for the Network field in the flannel configuration, the --cluster-cidr flag of kube-controller-manager, and the clusterCIDR field in the kube-proxy configuration.

Operating system: Ubuntu 18.04

Versions:

software    version
k8s         1.23.5
etcd        3.5.3
flannel     0.17.0
docker      20.10.14
coredns     1.14.0

1. Initialize the environment

  1. Install the virtual machines

    Create one Ubuntu VM. After the installation, copy the VM files twice. When opening a copied VM, choose "I copied it" at the prompt; VMware automatically generates a new IP and MAC for the copy, so no manual configuration is needed.

  2. Preparation (run on all nodes)

    Change the hostname:

    To change the hostname permanently, edit /etc/hostname; it takes effect after a reboot.

    sudo vi /etc/hostname
    

    After booting, install the SSH server

    sudo apt-get install openssh-server
    sudo service sshd status
    

    Disable swap

    sudo vi /etc/fstab 
    

    Comment out the line /swapfile none swap sw 0 0,

    then reboot. Verify that swap is off; no output means it is disabled:

    sudo swapon --show
    

    Set the kernel parameters Docker needs

    cat << EOF | tee /etc/sysctl.d/k8s.conf
    net.ipv4.ip_forward = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    EOF
    sysctl -p /etc/sysctl.d/k8s.conf
    

    Install Docker: https://docs.docker.com/engine/install/ubuntu/

    Configure Docker to start on boot:

    sudo systemctl enable docker
    

    Modify the Docker configuration to set the cgroup driver to systemd

    sudo tee /etc/docker/daemon.json <<-'EOF'
    {
      "registry-mirrors": ["https://vmjo8s1n.mirror.aliyuncs.com"],
      "exec-opts": ["native.cgroupdriver=systemd"]
    }
    EOF
    sudo systemctl daemon-reload
    sudo systemctl restart docker
    

    Run sudo docker info and check that it reports Cgroup Driver: systemd.

    Configure passwordless sudo for the regular user

    sudo visudo
    

    Append at the end: <username> ALL=NOPASSWD:ALL

    lijian ALL=NOPASSWD:ALL
    

    Create the installation directories

    mkdir /k8s/etcd/{bin,cfg,ssl} -p
    mkdir /k8s/kubernetes/{bin,cfg,ssl} -p
    

    Install and configure cfssl

    wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
    chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
    mv cfssl_linux-amd64 /usr/local/bin/cfssl
    mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
    mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
    

    Create the etcd certificate signing configuration

    lijian@vma:/opt/k8s/etcd-cert$ cat ca-config.json 
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "www": {
             "expiry": "87600h",
             "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ]
          }
        }
      }
    }
    

    Create the etcd CA CSR file

    lijian@vma:/opt/k8s/etcd-cert$ cat ca-csr.json
    {
        "CN": "etcd CA",
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "suzhou",
                "ST": "suzhou"
            }
        ]
    }
    
    

    Create the etcd server CSR

    lijian@vma:/opt/k8s/etcd-cert$ cat server-csr.json
    {
        "CN": "etcd",
        "hosts": [
        "192.168.17.143",
        "192.168.17.144",
        "192.168.17.145"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "suzhou",
                "ST": "suzhou"
            }
        ]
    }
    
    

    Generate the etcd CA certificate, server certificate, and private keys

    cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
    

    Copy the etcd certificates into /k8s/etcd/ssl/:

    lijian@vma:/opt/k8s/kubernetes-cert$ ls -lrt /k8s/etcd/ssl/
    total 16
    -rw-rw-r-- 1 lijian lijian 1334 4月  14 17:04 server.pem
    -rw------- 1 lijian lijian 1675 4月  14 17:04 server-key.pem
    -rw-rw-r-- 1 lijian lijian 1261 4月  14 17:04 ca.pem
    -rw------- 1 lijian lijian 1675 4月  14 17:04 ca-key.pem
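
    A sketch of the copy itself, assuming the certificates were generated under /opt/k8s/etcd-cert as the prompt above suggests:

    sudo cp /opt/k8s/etcd-cert/{ca,ca-key,server,server-key}.pem /k8s/etcd/ssl/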
    
    

    Create the Kubernetes CA certificate

    lijian@vma:/opt/k8s/kubernetes-cert$ cat ca-config.json 
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "kubernetes": {
             "expiry": "87600h",
             "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ]
          }
        }
      }
    }
    
    
    lijian@vma:/opt/k8s/kubernetes-cert$ cat ca-csr.json 
    {
        "CN": "kubernetes",
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "suzhou",
                "ST": "suzhou",
                "O": "k8s",
                "OU": "System"
            }
        ]
    }
    
    
    cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
    

    Generate the API server certificate

    lijian@vma:/opt/k8s/kubernetes-cert$ cat server-csr.json 
    {
        "CN": "kubernetes",
        "hosts": [
          "10.0.0.1",
          "127.0.0.1",
          "192.168.17.143",
          "192.168.17.144",
          "192.168.17.145",
          "kubernetes",
          "kubernetes.default",
          "kubernetes.default.svc",
          "kubernetes.default.svc.cluster",
          "kubernetes.default.svc.cluster.local"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "suzhou",
                "ST": "suzhou",
                "O": "k8s",
                "OU": "System"
            }
        ]
    }
    
    
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
    

    Create the kube-proxy certificate

    lijian@vma:/opt/k8s/kubernetes-cert$ cat kube-proxy-csr.json 
    {
      "CN": "system:kube-proxy",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "L": "suzhou",
          "ST": "suzhou",
          "O": "k8s",
          "OU": "System"
        }
      ]
    }
    
    
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
    

    Copy the Kubernetes certificates into /k8s/kubernetes/ssl/

    lijian@vma:/opt/k8s/kubernetes-cert$ ls -lrt /k8s/kubernetes/ssl/
    total 64
    -rw-rw-r-- 1 lijian lijian 1354 4月  15 16:19 ca.pem
    -rw------- 1 lijian lijian 1675 4月  15 16:19 ca-key.pem
    -rw-rw-r-- 1 lijian lijian 1623 4月  18 14:11 server.pem
    -rw------- 1 lijian lijian 1675 4月  18 14:11 server-key.pem
    -rw-rw-r-- 1 lijian lijian 1395 4月  18 17:20 kube-proxy.pem
    -rw------- 1 lijian lijian 1679 4月  18 17:20 kube-proxy-key.pem
    
    

    Tip: to view the logs of a systemd service:

    journalctl -u {servicename} [-xe -f]
    

2. Deploy etcd

https://github.com/etcd-io/etcd/releases/download/v3.5.3/etcd-v3.5.3-linux-amd64.tar.gz

Unpack the installation archive

tar -xvf etcd-v3.5.3-linux-amd64.tar.gz
cd etcd-v3.5.3-linux-amd64/
cp etcd etcdctl /k8s/etcd/bin/
cp etcdctl /usr/local/bin

Configuration file

lijian@vma:/opt/k8s/etcd-v3.5.3-linux-amd64$ cat /k8s/etcd/cfg/etcd   
#[Member]
ETCD_NAME="etcd01"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.17.143:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.17.143:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.17.143:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.17.143:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.17.143:2380,etcd02=https://192.168.17.144:2380,etcd03=https://192.168.17.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

Create the etcd systemd unit file

lijian@vma:/opt/k8s/etcd-v3.5.3-linux-amd64$ cat /usr/lib/systemd/system/etcd.service 
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/k8s/etcd/cfg/etcd
ExecStart=/k8s/etcd/bin/etcd --enable-v2 \
--cert-file=/k8s/etcd/ssl/server.pem \
--key-file=/k8s/etcd/ssl/server-key.pem \
--peer-cert-file=/k8s/etcd/ssl/server.pem \
--peer-key-file=/k8s/etcd/ssl/server-key.pem \
--trusted-ca-file=/k8s/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/k8s/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Start the etcd service

systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd

Copy the unit file and configuration to node1 and node2

scp -r /k8s/etcd/ ......
scp /usr/lib/systemd/system/etcd.service ......
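
For example, for node1 (192.168.17.144; repeat for node2 at 192.168.17.145), something like:

scp -r /k8s/etcd lijian@192.168.17.144:/tmp/
scp /usr/lib/systemd/system/etcd.service lijian@192.168.17.144:/tmp/
# then on the node:
#   sudo cp -r /tmp/etcd/* /k8s/etcd/
#   sudo cp /tmp/etcd.service /usr/lib/systemd/system/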

Modify the configuration file on node1:

lijian@k8s-node1:~$ cat /k8s/etcd/cfg/etcd 
#[Member]
ETCD_NAME="etcd02"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.17.144:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.17.144:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.17.144:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.17.144:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.17.143:2380,etcd02=https://192.168.17.144:2380,etcd03=https://192.168.17.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

Modify the configuration file on node2:

lijian@k8s-node2:~$ cat /k8s/etcd/cfg/etcd 
#[Member]
ETCD_NAME="etcd03"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.17.145:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.17.145:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.17.145:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.17.145:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.17.143:2380,etcd02=https://192.168.17.144:2380,etcd03=https://192.168.17.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

Start etcd on node1 and node2.

Verify that the cluster is running properly.

Check the cluster status:

etcdctl --cacert=/k8s/etcd/ssl/ca.pem --cert=/k8s/etcd/ssl/server.pem --key=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" endpoint health

etcdctl --cacert=/k8s/etcd/ssl/ca.pem --cert=/k8s/etcd/ssl/server.pem --key=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" endpoint status -w table

3. Deploy flannel

Download: https://github.com/flannel-io/flannel/releases

Write the cluster pod network configuration into etcd

  • The current flanneld version does not support etcd v3, so the etcd v2 API is used to write the configuration key and network data; writing with the v3 API leads to the error: Couldn't fetch network config: 100: Key not found (/coreos.com) [11]
  • The pod network ${CLUSTER_CIDR} written here must be a /16 range and must match the --cluster-cidr flag of kube-controller-manager;

1. Enable the etcd v2 API; if you followed this guide, it is already enabled above

Add the following to the etcd start command, then restart the etcd cluster
--enable-v2

2. Use the etcd v2 API to create the flannel network configuration

ETCDCTL_API=2 etcdctl --ca-file=/k8s/etcd/ssl/ca.pem --cert-file=/k8s/etcd/ssl/server.pem --key-file=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379"  set /coreos.com/network/config  '{ "Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}'

Unpack and install

tar -xvf flannel-v0.17.0-linux-amd64.tar.gz
mv flanneld mk-docker-opts.sh /k8s/kubernetes/bin/

Configure flannel

lijian@vma:/opt/k8s$ cat /k8s/kubernetes/cfg/flanneld
FLANNEL_OPTIONS="--etcd-endpoints=https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379 -etcd-cafile=/k8s/etcd/ssl/ca.pem -etcd-certfile=/k8s/etcd/ssl/server.pem -etcd-keyfile=/k8s/etcd/ssl/server-key.pem -iface=ens33"

  • flanneld communicates with other nodes over the interface that holds the system default route; on nodes with multiple network interfaces (e.g. internal and public), the -iface parameter selects the interface, such as ens33 above;

Create the flanneld systemd unit file

lijian@vma:/opt/k8s$ cat /usr/lib/systemd/system/flanneld.service
[Unit]
Description=Flanneld overlay address etcd agent
After=network-online.target network.target
Before=docker.service

[Service]
Type=notify
EnvironmentFile=/k8s/kubernetes/cfg/flanneld
ExecStart=/k8s/kubernetes/bin/flanneld --ip-masq $FLANNEL_OPTIONS
ExecStartPost=/k8s/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d  /run/flannel/docker 
Restart=on-failure

[Install]
WantedBy=multi-user.target

  • The mk-docker-opts.sh script writes the pod subnet assigned to flanneld into /run/flannel/docker and also generates /run/flannel/subnet.env; when Docker starts later it uses the environment variables in /run/flannel/docker to configure the docker0 bridge;
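
For reference, once flanneld is running, /run/flannel/subnet.env contains the subnet assigned to this host; the values below are illustrative and match the example network used in this guide:

cat /run/flannel/subnet.env
# FLANNEL_NETWORK=10.244.0.0/16
# FLANNEL_SUBNET=10.244.62.1/24
# FLANNEL_MTU=1450
# FLANNEL_IPMASQ=true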

Start the flannel service

systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

Configure Docker to use the assigned subnet. Note that the path is /lib/systemd/system/docker.service: add the EnvironmentFile setting and add the $DOCKER_NETWORK_OPTIONS parameter to ExecStart

lijian@vma:/opt/k8s$ cat /lib/systemd/system/docker.service 
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
EnvironmentFile=/run/flannel/docker
ExecStart=/usr/bin/dockerd  $DOCKER_NETWORK_OPTIONS -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

Restart the Docker service

systemctl daemon-reload
systemctl restart docker

Check that flannel and docker0 are on the same subnet; if they match, it is working

lijian@vma:/opt/k8s$ ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 00:0c:29:ea:4e:08 brd ff:ff:ff:ff:ff:ff
    inet 192.168.17.143/24 brd 192.168.17.255 scope global dynamic noprefixroute ens33
       valid_lft 1413sec preferred_lft 1413sec
    inet6 fe80::679:d74e:a380:8475/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether 26:5f:27:9a:9e:f5 brd ff:ff:ff:ff:ff:ff
    inet 10.244.62.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::245f:27ff:fe9a:9ef5/64 scope link 
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:45:72:92:ac brd ff:ff:ff:ff:ff:ff
    inet 10.244.62.1/24 brd 10.244.62.255 scope global docker0
       valid_lft forever preferred_lft forever

Copy the flanneld systemd unit file (and related files) to all nodes

scp -r /k8s/kubernetes ......
scp /usr/lib/systemd/system/flanneld.service ......
scp /lib/systemd/system/docker.service ......

Start flannel and restart Docker on every node, then check that the subnets match.
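
On each node that amounts to roughly:

sudo systemctl daemon-reload
sudo systemctl enable --now flanneld
sudo systemctl restart docker
ip addr show flannel.1
ip addr show docker0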

Problem: docker0 and flannel.1 end up on different subnets; the docker0 configuration has no effect.

Solution: the Docker unit file is /lib/systemd/system/docker.service (no leading /usr); I had edited the wrong path before.

Remove flannel and docker0:

sudo systemctl stop flanneld
ip addr s flannel.1

sudo ifconfig flannel.1 down
ip addr s flannel.1

sudo ip link del flannel.1
ip addr s flannel.1

Remove docker0:

brctl delbr docker0
# Note: deleting the vxlan device created by flannel this way does not work:
#   [root@host131 ~]# brctl delbr flannel.1
#   can't delete bridge flannel.1: Operation not permitted
# If brctl is missing, install it (bridge-utils)

When resetting flannel, remember to also delete what is stored in etcd: everything under /coreos.com/network/subnets

ETCDCTL_API=2 etcdctl --ca-file=/k8s/etcd/ssl/ca.pem --cert-file=/k8s/etcd/ssl/server.pem --key-file=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379"  ls /coreos.com/network/subnets
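
To actually clear the stale entries, each key printed by the ls above can be removed with the v2 rm command, e.g.:

ETCDCTL_API=2 etcdctl --ca-file=/k8s/etcd/ssl/ca.pem --cert-file=/k8s/etcd/ssl/server.pem --key-file=/k8s/etcd/ssl/server-key.pem --endpoints="https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379" rm /coreos.com/network/subnets/<subnet-key>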

4. Deploy the master

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#server-binaries

The Kubernetes master node runs the following components:

  • kube-apiserver

  • kube-scheduler

  • kube-controller-manager

    kube-scheduler and kube-controller-manager can run in cluster mode: leader election picks one working process and the others stand by.

Deploy kube-apiserver

Unpack the binaries and copy them into place on the master node

tar -xvf kubernetes-server-linux-amd64.tar.gz 
cd kubernetes/server/bin/
cp kube-scheduler kube-apiserver kube-controller-manager kubectl /k8s/kubernetes/bin/

Copy the certificates

cp /opt/k8s/kubernetes-cert/*pem /k8s/kubernetes/ssl/

Create the TLS bootstrapping token

# head -c 16 /dev/urandom | od -An -t x | tr -d ' '
2366a641f656a0a025abb4aabda4511b

vim /k8s/kubernetes/cfg/token.csv
2366a641f656a0a025abb4aabda4511b,kubelet-bootstrap,10001,"system:kubelet-bootstrap"

Create the apiserver configuration file

lijian@vma:/opt/k8s/kubernetes-cert$ cat /k8s/kubernetes/cfg/kube-apiserver 
KUBE_APISERVER_OPTS="--logtostderr=true \
--v=4 \
--etcd-servers=https://192.168.17.143:2379,https://192.168.17.144:2379,https://192.168.17.145:2379 \
--bind-address=192.168.17.143 \
--insecure-bind-address=0.0.0.0 \
--secure-port=6443 \
--advertise-address=192.168.17.143 \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/24 \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--authorization-mode=Node,RBAC \
--enable-bootstrap-token-auth \
--token-auth-file=/k8s/kubernetes/cfg/token.csv \
--service-node-port-range=30000-50000 \
--tls-cert-file=/k8s/kubernetes/ssl/server.pem  \
--tls-private-key-file=/k8s/kubernetes/ssl/server-key.pem \
--client-ca-file=/k8s/kubernetes/ssl/ca.pem \
--service-account-key-file=/k8s/kubernetes/ssl/ca.pub \
--service-account-signing-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \
--etcd-cafile=/k8s/etcd/ssl/ca.pem \
--etcd-certfile=/k8s/etcd/ssl/server.pem \
--etcd-keyfile=/k8s/etcd/ssl/server-key.pem"

Create the kube-apiserver systemd unit file

lijian@vma:/opt/k8s/kubernetes-cert$ cat /usr/lib/systemd/system/kube-apiserver.service 
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-apiserver
ExecStart=/k8s/kubernetes/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start the service

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver

Check whether the apiserver is running

ps -ef |grep kube-apiserver
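
A quick liveness check against the secure port (anonymous access to /healthz is allowed by the default RBAC bindings):

curl --cacert /k8s/kubernetes/ssl/ca.pem https://192.168.17.143:6443/healthz
# expected output: ok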

Error encountered: Error: [service-account-issuer is a required flag

Solution: generate ca.pub and adjust the apiserver flags:

openssl x509 -in ca.pem -pubkey -noout > ca.pub
--service-account-key-file=/k8s/kubernetes/ssl/ca.pub \
--service-account-signing-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--service-account-issuer=api \

Deploy kube-scheduler

Create the certificate

lijian@vma:~$ cat /opt/k8s/kubernetes-cert/kube-scheduler-csr.json 
{
  "CN": "system:kube-scheduler",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}

Generate the certificate and copy it to /k8s/kubernetes/ssl/

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
-rw-rw-r-- 1 lijian lijian 1415 4月  18 10:46 kube-scheduler.pem
-rw------- 1 lijian lijian 1675 4月  18 10:46 kube-scheduler-key.pem

Generate the kubeconfig file by running kube-scheduler_env.sh

lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-scheduler_env.sh 
KUBE_CONFIG="/k8s/kubernetes/cfg/kube-scheduler.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
kubectl config set-cluster kubernetes \
	--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
	--embed-certs=true \
	--server=${KUBE_APISERVER} \
	--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-scheduler \
	--client-certificate=/k8s/kubernetes/ssl/kube-scheduler.pem \
	--client-key=/k8s/kubernetes/ssl/kube-scheduler-key.pem \
	--embed-certs=true \
	--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
	--cluster=kubernetes \
	--user=kube-scheduler \
	--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

Create the conf file

lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-scheduler
KUBE_SCHEDULER_OPTS="--logtostderr=true --v=4 \
--kubeconfig=/k8s/kubernetes/cfg/kube-scheduler.kubeconfig \
--bind-address=127.0.0.1 \
--leader-elect"

  • --kubeconfig: kubeconfig file used to connect to the apiserver
  • --leader-elect: when several instances of the component run, elect a leader automatically (HA)

Create the systemd unit file

lijian@vma:~$ cat /usr/lib/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-scheduler
ExecStart=/k8s/kubernetes/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start kube-scheduler

systemctl daemon-reload
systemctl restart kube-scheduler
systemctl enable kube-scheduler

Check whether kube-scheduler is running

ps -ef |grep kube-scheduler 

Deploy kube-controller-manager

Generate the certificate

lijian@vma:~$ cat /opt/k8s/kubernetes-cert/kube-controller-manager-csr.json 
{
  "CN": "system:kube-controller-manager",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "suzhou",
      "ST": "suzhou",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
# Copy to /k8s/kubernetes/ssl/
-rw-rw-r-- 1 lijian lijian 1432 4月  18 10:24 kube-controller-manager.pem
-rw------- 1 lijian lijian 1675 4月  18 10:24 kube-controller-manager-key.pem

Generate the kubeconfig file by running kube-controller_env.sh

lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-controller_env.sh 
KUBE_CONFIG="/k8s/kubernetes/cfg/kube-controller-manager.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
kubectl config set-cluster kubernetes \
	--certificate-authority=/k8s/kubernetes/ssl/ca.pem \
	--embed-certs=true \
	--server=${KUBE_APISERVER} \
	--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-controller-manager \
	--client-certificate=/k8s/kubernetes/ssl/kube-controller-manager.pem \
	--client-key=/k8s/kubernetes/ssl/kube-controller-manager-key.pem \
	--embed-certs=true \
	--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
	--cluster=kubernetes \
	--user=kube-controller-manager \
	--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

Create the conf file

lijian@vma:~$ cat /k8s/kubernetes/cfg/kube-controller-manager
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=true \
--v=4 \
--leader-elect=true \
--kubeconfig=/k8s/kubernetes/cfg/kube-controller-manager.kubeconfig \
--bind-address=127.0.0.1 \
--service-cluster-ip-range=10.0.0.0/24 \
--allocate-node-cidrs=true \
--cluster-name=kubernetes \
--cluster-cidr=10.244.0.0/16 \
--service-cluster-ip-range=10.0.0.0/24 \
--cluster-signing-cert-file=/k8s/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/k8s/kubernetes/ssl/ca-key.pem  \
--root-ca-file=/k8s/kubernetes/ssl/ca.pem \
--service-account-private-key-file=/k8s/kubernetes/ssl/ca-key.pem \
--cluster-signing-duration=87600h0m0s"
  • --kubeconfig: kubeconfig file used to connect to the apiserver
  • --leader-elect: when several instances of the component run, elect a leader automatically (HA)
  • --cluster-signing-cert-file / --cluster-signing-key-file: the CA used to automatically issue certificates for kubelet; keep it consistent with the apiserver

Create the systemd unit file

lijian@vma:~$ cat /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-controller-manager
ExecStart=/k8s/kubernetes/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start kube-controller-manager

systemctl daemon-reload
systemctl restart kube-controller-manager
systemctl enable kube-controller-manager
systemctl status kube-controller-manager

Add the binaries directory /k8s/kubernetes/bin to the PATH variable

vim /etc/profile
PATH=/k8s/kubernetes/bin:$PATH:$HOME/bin
source /etc/profile

Set up the kubectl command-line tool

1. Create admin-csr.json (cat >> admin-csr.json << EOF)

lijian@vma:/opt/k8s/kubernetes-cert$ cat admin-csr.json
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "suzhou",
      "L": "suzhou",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}

2. Generate the certificate

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin

3. Set the cluster parameters

kubectl config set-cluster kubernetes \
   --certificate-authority=/k8s/kubernetes/ssl/ca.pem \
   --embed-certs=true \
   --server=https://192.168.17.143:6443

4. Set the client credentials

kubectl config set-credentials admin \
   --client-certificate=/k8s/kubernetes/ssl/admin.pem \
   --embed-certs=true \
   --client-key=/k8s/kubernetes/ssl/admin-key.pem

5. Set the context parameters

kubectl config set-context kubernetes \
   --cluster=kubernetes \
   --user=admin

6. Use the default context

kubectl config use-context kubernetes
kubectl get cs

Check the cluster status from the master

lijian@vma:~$ kubectl get cs,nodes
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                                 STATUS    MESSAGE                         ERROR
componentstatus/etcd-1               Healthy   {"health":"true","reason":""}   
componentstatus/etcd-0               Healthy   {"health":"true","reason":""}   
componentstatus/etcd-2               Healthy   {"health":"true","reason":""}   
componentstatus/scheduler            Healthy   ok                              
componentstatus/controller-manager   Healthy   ok                              

NAME                  STATUS                     ROLES    AGE     VERSION
node/192.168.17.143   Ready,SchedulingDisabled   master   4d5h    v1.23.5
node/192.168.17.144   Ready                      node     3d22h   v1.23.5
node/192.168.17.145   Ready                      node     3d22h   v1.23.5

5. Deploy the worker nodes

The Kubernetes worker nodes run the following components:

  • docker (already deployed above)
  • kubelet
  • kube-proxy

Copy the kubelet and kube-proxy binaries to the worker nodes

scp /k8s/kubernetes/bin/kubelet /k8s/kubernetes/bin/kube-proxy ...

Deploy kubelet

Create the YAML parameter configuration file

lijian@vma:~$ cat /k8s/kubernetes/cfg/kubelet-config.yml 
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 192.168.17.143 
port: 10250
readOnlyPort: 10255
cgroupDriver: systemd
clusterDNS: ["10.0.0.2"]
clusterDomain: cluster.local.
failSwapOn: false
authentication:
  anonymous:
    enabled: true

Create the bootstrap.kubeconfig file

  • the kubeconfig kubelet uses to bootstrap itself when first joining the cluster

Create bootstrap.kubeconfig by running the following script

lijian@vma:~$ cat /k8s/kubernetes/cfg/kubelet_env.sh 
KUBE_CONFIG="/k8s/kubernetes/cfg/bootstrap.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
TOKEN=`cat /k8s/kubernetes/cfg/token.csv|awk -F',' '{print $1}'` # must match token.csv

# Generate the kubelet bootstrap kubeconfig
kubectl config set-cluster kubernetes \
	  --certificate-authority=/k8s/kubernetes/ssl/ca.pem \
	    --embed-certs=true \
	      --server=${KUBE_APISERVER} \
	        --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kubelet-bootstrap" \
	  --token=${TOKEN} \
	    --kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
	  --cluster=kubernetes \
	    --user="kubelet-bootstrap" \
	      --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

Create the conf file

lijian@vma:~$ cat /k8s/kubernetes/cfg/kubelet.config 
KUBELET_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--hostname-override=192.168.17.143 \
--kubeconfig=/k8s/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/k8s/kubernetes/cfg/bootstrap.kubeconfig \
--config=/k8s/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/k8s/kubernetes/ssl \
--cgroup-driver=systemd \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"

  • --hostname-override: display name, the node's hostname, unique within the cluster
  • --kubeconfig: generated automatically; used later to connect to the apiserver
  • --bootstrap-kubeconfig: used on first start to request a certificate from the apiserver
  • --config: configuration parameter file
  • --cert-dir: directory where kubelet certificates are generated
  • --pod-infra-container-image: image for the pod infrastructure (pause) container

Create the systemd unit file

lijian@vma:~$ cat /usr/lib/systemd/system/kubelet.service 
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kubelet.config
ExecStart=/k8s/kubernetes/bin/kubelet $KUBELET_OPTS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target

Start kubelet

systemctl daemon-reload
systemctl restart kubelet
systemctl enable kubelet
systemctl status kubelet

After a successful start, several certificate files are generated automatically under /k8s/kubernetes/ssl/

-rw------- 1 root   root   1224 4月  18 10:39 kubelet-client-2022-04-18-10-39-24.pem
lrwxrwxrwx 1 root   root     58 4月  18 10:39 kubelet-client-current.pem -> /k8s/kubernetes/ssl/kubelet-client-2022-04-18-10-39-24.pem
-rw-r--r-- 1 root   root   2279 4月  16 21:30 kubelet.crt
-rw------- 1 root   root   1675 4月  16 21:30 kubelet.key

A kubelet.kubeconfig file is generated automatically under /k8s/kubernetes/cfg

-rw------- 1 root   root   2293 4月  18 10:39 kubelet.kubeconfig

Sync the kubelet configuration to the other nodes

Sync kubelet.config, kubelet-config.yml, bootstrap.kubeconfig, and kubelet.service to all nodes, and change the --hostname-override parameter in kubelet.config to each node's own value
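
For example, to push the files to node1 (192.168.17.144; repeat for node2):

scp /k8s/kubernetes/cfg/{kubelet.config,kubelet-config.yml,bootstrap.kubeconfig} lijian@192.168.17.144:/k8s/kubernetes/cfg/
scp /usr/lib/systemd/system/kubelet.service lijian@192.168.17.144:/tmp/
# then on the node: sudo mv /tmp/kubelet.service /usr/lib/systemd/system/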

Configuration file on node1

lijian@k8s-node1:~$ cat /k8s/kubernetes/cfg/kubelet.config 
KUBELET_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--hostname-override=192.168.17.144 \
--kubeconfig=/k8s/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/k8s/kubernetes/cfg/bootstrap.kubeconfig \
--config=/k8s/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/k8s/kubernetes/ssl \
--cgroup-driver=systemd \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"

Configuration file on node2:

lijian@k8s-node2:/k8s/etcd/bin$ cat /k8s/kubernetes/cfg/kubelet.config 
KUBELET_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--hostname-override=192.168.17.145 \
--kubeconfig=/k8s/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/k8s/kubernetes/cfg/bootstrap.kubeconfig \
--config=/k8s/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/k8s/kubernetes/ssl \
--cgroup-driver=systemd \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"

Start kubelet on the remaining nodes

If kubelet fails to start with the error cannot create certificate signing request:

On the master, authorize the kubelet-bootstrap user to request certificates

lijian@vma:/k8s/kubernetes/cfg$ cat kubelet-bootstrap-rabc.yaml 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: create-csrs-for-bootstrapping
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:node-bootstrapper
  apiGroup: rbac.authorization.k8s.io 
  # Apply it
  kubectl apply -f /opt/k8s/yaml/kubelet-bootstrap-rbac.yaml

Restart kubelet, then check the certificate signing requests on the master; they should be in Pending state

kubectl get csr

Approve the kubelet certificate requests so the nodes can join the cluster

 for csr in `kubectl get csr |awk 'NR>1 {print $1}'`;do kubectl certificate approve $csr;done

Check the CSRs again; they should now be Approved,Issued

Check the node status

lijian@vma:/k8s/kubernetes/cfg$ kubectl get node
NAME             STATUS                     ROLES    AGE     VERSION
192.168.17.143   Ready,SchedulingDisabled   master   4d5h    v1.23.5
192.168.17.144   Ready                      node     3d23h   v1.23.5
192.168.17.145   Ready                      node     3d23h   v1.23.5

Problem: controller-manager and scheduler report that they cannot connect to 127.0.0.1:8080, so the CSRs never get issued and kubectl get nodes shows no nodes.

Cause: no authentication had been set up for these two components.

Solution: generate certificates from kube-controller-manager-csr.json and kube-scheduler-csr.json, use kubectl to generate kube-scheduler.kubeconfig and kube-controller-manager.kubeconfig, and wire them into the services.

Deploy kube-proxy

Create the configuration file kube-proxy-config.yml

lijian@vma:/k8s/kubernetes/cfg$ cat kube-proxy-config.yml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
iptables:
  masqueradeAll: true
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s
ipvs:
  masqueradeAll: true
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 0s
  tcpFinTimeout: 0s
  tcpTimeout: 0s
  udpTimeout: 0s
mode: "ipvs"
clientConnection:
  kubeconfig: /k8s/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 192.168.17.143
clusterCIDR: 10.244.0.0/16 

  • bindAddress: listen address;
  • clientConnection.kubeconfig: kubeconfig file used to connect to the apiserver;
  • clusterCIDR: kube-proxy uses --cluster-cidr to tell cluster-internal traffic from external traffic; only when --cluster-cidr or --masquerade-all is set does kube-proxy SNAT requests to service IPs;
  • hostnameOverride: must match the value used by kubelet, otherwise kube-proxy cannot find the node after starting and will not create any ipvs rules;
  • mode: use ipvs mode;

Generate kube-proxy.kubeconfig by running kube-proxy_env.sh; the kube-proxy certificate was already created at the beginning

lijian@vma:/k8s/kubernetes/cfg$ cat kube-proxy_env.sh 
KUBE_CONFIG="/k8s/kubernetes/cfg/kube-proxy.kubeconfig"
KUBE_APISERVER="https://192.168.17.143:6443"
TOKEN=`cat /k8s/kubernetes/cfg/token.csv|awk -F',' '{print $1}'` # must match token.csv

# Generate the kube-proxy kubeconfig
kubectl config set-cluster kubernetes \
	  --certificate-authority=/k8s/kubernetes/ssl/ca.pem \
	    --embed-certs=true \
	      --server=${KUBE_APISERVER} \
	        --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kube-proxy" \
	--client-certificate=/k8s/kubernetes/ssl/kube-proxy.pem \
	--client-key=/k8s/kubernetes/ssl/kube-proxy-key.pem \
	--embed-certs=true \
	--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
	  --cluster=kubernetes \
	  --user="kube-proxy" \
	  --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

Create the conf file

lijian@vma:/k8s/kubernetes/cfg$ cat kube-proxy.conf 
KUBE_PROXY_OPTS="--logtostderr=true \
--v=4 \
--log-dir=/opt/k8s/logs \
--config=/k8s/kubernetes/cfg/kube-proxy-config.yml"

Create the systemd unit file

lijian@vma:/k8s/kubernetes/cfg$ cat  /usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=/k8s/kubernetes/cfg/kube-proxy.conf
ExecStart=/k8s/kubernetes/bin/kube-proxy $KUBE_PROXY_OPTS 
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start kube-proxy

systemctl daemon-reload
systemctl restart kube-proxy
systemctl enable kube-proxy

Sync the kube-proxy configuration to the other nodes

Sync kube-proxy.conf, kube-proxy-config.yml, kube-proxy.kubeconfig, and kube-proxy.service to all nodes, and change the hostnameOverride parameter in kube-proxy-config.yml to each node's own value

Configuration on node1

lijian@k8s-node1:~$ cat /k8s/kubernetes/cfg/kube-proxy-config.yml 
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
iptables:
  masqueradeAll: true
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s
ipvs:
  masqueradeAll: true
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 0s
  tcpFinTimeout: 0s
  tcpTimeout: 0s
  udpTimeout: 0s
mode: "ipvs"
clientConnection:
  kubeconfig: /k8s/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 192.168.17.144
clusterCIDR: 10.244.0.0/16 

Configuration on node2

lijian@k8s-node2:/k8s/etcd/bin$ cat /k8s/kubernetes/cfg/kube-proxy-config.yml 
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
iptables:
  masqueradeAll: true
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s
ipvs:
  masqueradeAll: true
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 0s
  tcpFinTimeout: 0s
  tcpTimeout: 0s
  udpTimeout: 0s
mode: "ipvs"
clientConnection:
  kubeconfig: /k8s/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 192.168.17.145
clusterCIDR: 10.244.0.0/16 

Start kube-proxy on the remaining nodes

Label the master and worker nodes

kubectl label node 192.168.17.143  node-role.kubernetes.io/master='master'
kubectl label node 192.168.17.144  node-role.kubernetes.io/node='node'
kubectl label node 192.168.17.145  node-role.kubernetes.io/node='node'
lijian@vma:/k8s/kubernetes/cfg$ kubectl get nodes
NAME             STATUS                     ROLES    AGE     VERSION
192.168.17.143   Ready,SchedulingDisabled   master   4d5h    v1.23.5
192.168.17.144   Ready                      node     3d23h   v1.23.5
192.168.17.145   Ready                      node     3d23h   v1.23.5

6. Deploy CoreDNS (allows pods to resolve and access the outside network)

git clone https://github.com/coredns/deployment.git
cd /home/yaml/coredns/deployment/kubernetes/
# -i sets the cluster DNS IP; the preconfigured value can be found in kubelet-config.yml
./deploy.sh -i 10.0.0.2 > coredns.yaml
# Create the resources
kubectl apply -f coredns.yaml

Problem:
[FATAL] plugin/loop: Loop (127.0.0.1:33855 -> :53) detected for zone ".",

Solution:
If [FATAL] plugin/loop: Loop (127.0.0.1:55751 -> :53) detected for zone "." appears,
it means the DNS settings in the node's /etc/resolv.conf contain a 127.0.* entry, which CoreDNS cannot handle.
Change the DNS setting, then restart the CoreDNS service.
# Edit the node configuration file
cat /etc/resolv.conf 
nameserver 114.114.114.114
Delete the CoreDNS pods; they will be recreated automatically.
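
After fixing resolv.conf, the CoreDNS pods can be deleted so they restart with the new setting (the deployment uses the standard k8s-app=kube-dns label):

kubectl -n kube-system delete pod -l k8s-app=kube-dns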

Problem: kubelet on a node fails to start with the error: bind: cannot assign requested address

Solution: configuration error; set --hostname-override in kubelet.config to the node's own IP rather than the master IP, and set address: 0.0.0.0 in kubelet-config.yml

In addition, the IPs of all nodes were added to the hosts field of server-csr.json; I am not sure whether that was the root cause

Delete the generated kubelet certificates and kubelet.kubeconfig, then restart kubelet.service
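
Roughly, using the paths generated above:

sudo rm /k8s/kubernetes/ssl/kubelet-client-* /k8s/kubernetes/ssl/kubelet.crt /k8s/kubernetes/ssl/kubelet.key
sudo rm /k8s/kubernetes/cfg/kubelet.kubeconfig
sudo systemctl restart kubelet.service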

Problem: Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=prox

Solution: kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous

Problem: when deploying Kuboard, kuboard-agent stays in CrashLoopBackOff with the error Could not resolve host: kuboard-v3

Cause: flannel and docker0 were on different subnets. Solution: set flannel's Network to 10.244.0.0/16, and set --cluster-cidr=10.244.0.0/16 for controller-manager and the same value for kube-proxy's clusterCIDR

Problem: Error creating: pods "metrics-server-86597c44b6-79prc" is forbidden: SecurityContext.RunAsUser is forbidden

Cause: an admission control plugin restriction

Solution: remove SecurityContextDeny from --enable-admission-plugins in the kube-apiserver configuration and restart the apiserver.

7. Install Kuboard (same as in the kubeadm section above)

Omitted

TIPS:

Expanding a VM's disk: install gparted and resize the partition through its GUI

sudo apt-get install gparted

Corrupted VM system files:

1. Reboot Ubuntu and hold Shift right away to enter the GRUB menu, or wait for the GRUB menu to appear

2. Select recovery mode: move the cursor to the recovery mode entry with the arrow keys and press "e" to enter the edit screen

3. On the line starting with linux /boot ..., change ro recovery nomodeset to rw single init=/bin/bash

4. Press Ctrl+X or F10 to boot into single-user mode; the current user is root, so files can be modified. Reboot when done.

Appendix: building from source

To avoid typing sudo for every docker command, add the user to the docker group; note that this only takes effect after logging in again (reboot)

sudo usermod -a -G docker ${USER}

Install cfssl

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
chmod +x cfssl_linux-amd64
cp cfssl_linux-amd64 /usr/local/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssljson_linux-amd64
cp cfssljson_linux-amd64 /usr/local/bin/cfssljson
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl-certinfo_linux-amd64
cp cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo

Build commands:

# build kubelet
KUBE_BUILD_PLATFORMS=linux/amd64 make all WHAT=cmd/kubelet GOFLAGS=-v GOGCFLAGS="-N -l"
# build all
KUBE_BUILD_PLATFORMS=linux/amd64 make all GOFLAGS=-v GOGCFLAGS="-N -l"
# build api-server
KUBE_BUILD_PLATFORMS=linux/amd64 make all WHAT=cmd/kube-apiserver GOFLAGS=-v GOGCFLAGS="-N -l"

Run script:

lijian@vma:/app/web/gopath/src/k8s.io/kubernetes-1.23.5$ cat run.sh 
# 1. install etcd
export PATH="/app/web/gopath/src/k8s.io/kubernetes-1.23.5/third_party/etcd:${PATH}"

# 2. run k8s locally (skip compile)
./hack/local-up-cluster.sh -O

# 3. compile and run k8s locally
PATH=$PATH KUBERNETES_PROVIDER=local hack/local-up-cluster.sh

# 4. get k8s nodes
# ./cluster/kubectl.sh get nodes
