Introduction: the components of a Kubernetes (k8s) cluster use TLS certificates to encrypt their communication. This document uses cfssl, CloudFlare's PKI toolkit, to generate the Certificate Authority (CA) and the other certificates.

Managing TLS in the cluster

Preface

Every Kubernetes cluster has a cluster root certificate authority (CA). Components in the cluster typically use the CA to validate the API server's certificate, and the API server uses it to validate kubelet client certificates, and so on. To support this, the CA certificate bundle is distributed to every node in the cluster and is attached as a secret to the default service account. Optionally, your workloads can use this CA to establish trust. Applications can request certificate signing through the certificates.k8s.io API, using a protocol similar to the ACME draft.

TLS trust within the cluster

Getting an application that runs in a Pod to trust the cluster root CA usually requires some extra application configuration: you need to add the CA certificate bundle to the list of CA certificates that your TLS client or server trusts. For example, with a golang TLS configuration you would parse the certificate chain and add the parsed certificates to the RootCAs field of the tls.Config struct. The CA certificate bundle is mounted into Pods automatically when they use the default service account, at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt. If you are not using the default service account, ask the cluster administrator to build a ConfigMap containing the certificate bundle you are allowed to use.
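A quick way to confirm that the bundle is mounted, once the cluster is up, is to look inside any Pod that uses the default service account (my-pod below is only a placeholder name):

kubectl exec my-pod -- ls /var/run/secrets/kubernetes.io/serviceaccount/
# ca.crt  namespace  token
kubectl exec my-pod -- head -n 1 /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# -----BEGIN CERTIFICATE-----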

Cluster deployment

Environment planning

Software versions

  Software     Version
  Linux OS     CentOS Linux release 7.6.1810 (Core)
  Kubernetes   1.14.2
  Docker       18.06.1-ce
  Etcd         3.3.1

Roles, IPs, components and recommended configuration

  Role         IP            Components                                    Recommended configuration
  k8s-master   172.16.4.12   kube-apiserver, kube-controller-manager,      8 cores, 16 GB RAM
                             kube-scheduler, etcd
  k8s-node1    172.16.4.13   kubelet, kube-proxy, docker, flannel, etcd    size according to the number of containers to run
  k8s-node2    172.16.4.14   kubelet, kube-proxy, docker, flannel, etcd    size according to the number of containers to run

Certificates used by each component

  etcd                      ca.pem, server.pem, server-key.pem
  kube-apiserver            ca.pem, server.pem, server-key.pem
  kubelet                   ca.pem, ca-key.pem
  kube-proxy                ca.pem, kube-proxy.pem, kube-proxy-key.pem
  kubectl                   ca.pem, admin.pem, admin-key.pem
  kube-controller-manager   ca.pem, ca-key.pem
  flannel                   ca.pem, server.pem, server-key.pem

Environment preparation

The following steps must be performed on the master node and on every Node.

  • Install some convenient packages (optional)
# Install net-tools to get ping, ifconfig, etc.
yum install -y net-tools

# Install curl and telnet
yum install -y curl telnet

# Install the vim editor
yum install -y vim

# Install wget
yum install -y wget

# Install lrzsz, which lets you drag files into Xshell to upload them to the server or download them locally.
yum -y install lrzsz
  • Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
  • Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
# Or edit /etc/selinux/config, set the following field and reboot for it to take effect:
SELINUX=disabled
  • Disable swap
swapoff -a # temporarily
vim /etc/fstab # permanently: comment out the swap line
  • Make sure net.bridge.bridge-nf-call-iptables is set to 1 in sysctl (a quick check follows after this list):
$ cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sysctl --system
  • Add hostname-to-IP mappings (required on both the master and the node machines)
$ vim /etc/hosts
172.16.4.12  k8s-master
172.16.4.13  k8s-node1
172.16.4.14  k8s-node2
  • Synchronize the clocks
# yum install ntpdate -y
# ntpdate ntp.api.bz
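To verify the sysctl settings from the list above, re-read the keys; if the bridge keys are missing, the br_netfilter kernel module is probably not loaded yet, so load it first (a quick sanity check, not a required step):

modprobe br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
# net.bridge.bridge-nf-call-iptables = 1
# net.ipv4.ip_forward = 1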

Kubernetes needs a container runtime that implements the Container Runtime Interface (CRI). The officially supported runtimes currently include Docker, containerd, CRI-O and frakti. This document uses Docker as the container runtime; the recommended versions are Docker CE 18.06 or 18.09.

  • Install Docker
# Configure the Alibaba Cloud mirror for Docker; note that the command below is run in /etc/yum.repos.d.
[root@k8s-master yum.repos.d]# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Refresh the cache and list the repositories; the docker-ce repo should now appear.
[root@k8s-master yum.repos.d]# yum update && yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirrors.nwsuaf.edu.cn
 * updates: mirror.lzu.edu.cn
docker-ce-stable                                                                                                  | 3.5 kB  00:00:00     
(1/2): docker-ce-stable/x86_64/updateinfo                                                                         |   55 B  00:00:00     
(2/2): docker-ce-stable/x86_64/primary_db                                                                         |  28 kB  00:00:00     
No packages marked for update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirrors.nwsuaf.edu.cn
 * updates: mirror.lzu.edu.cn
repo id                                                         repo name                                                          status
base/7/x86_64                                                   CentOS-7 - Base                                                    10,019
docker-ce-stable/x86_64                                         Docker CE Stable - x86_64                                              43
extras/7/x86_64                                                 CentOS-7 - Extras                                                     409
updates/7/x86_64                                                CentOS-7 - Updates                                                  2,076
repolist: 12,547

# List the available docker-ce versions; the 18.06 or 18.09 stable releases are recommended.
yum list docker-ce.x86_64 --showduplicates | sort -r
# Install docker, using docker-ce-18.06.3.ce-3.el7 as the example here.
yum -y install docker-ce-18.06.3.ce-3.el7
# You may hit the error "Delta RPMs disabled because /usr/bin/applydeltarpm not installed."; fix it as follows.
yum provides '*/applydeltarpm'
yum install deltarpm -y
# Then run the install command again
yum -y install docker-ce-18.06.3.ce-3.el7
# After installation, enable docker to start on boot.
systemctl enable docker

Note: all of the following steps are performed on the master node, i.e. 172.16.4.12. The certificates only need to be created once; when new nodes are added to the cluster later, simply copy the certificates under /etc/kubernetes/ to the new node.

Create the TLS certificates and keys

  • Install CFSSL from the binary release
# First create a directory to hold the certificates
$ mkdir ssl && cd ssl
# Download cfssl, the tool that generates the certificates
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
# cfssljson takes cfssl's JSON output and writes out the certificate files
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
# cfssl-certinfo displays certificate information
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
# Make the files executable
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
# Move the files into /usr/local/bin
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
# For a non-root user you may need to adjust PATH
export PATH=/usr/local/bin:$PATH
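A quick check that the binaries are installed and on the PATH:

cfssl version
which cfssl cfssljson cfssl-certinfo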

Create the CA (Certificate Authority)

Note: the commands below are still executed in the /root/ssl directory.

  1. Create the CA configuration file
# Generate a default configuration
$ cfssl print-defaults config > config.json
# Generate a default certificate signing request
$ cfssl print-defaults csr > csr.json
# Following the format of config.json, create the ca-config.json file below; the expiry is set to 87600h (10 years)
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF

Field notes

  • ca-config.json: multiple profiles can be defined, each with its own expiry, usages and other parameters; a specific profile is selected later when signing a certificate;
  • signing: the certificate can be used to sign other certificates; the generated ca.pem contains CA=TRUE;
  • server auth: a client may use this CA to verify certificates presented by servers;
  • client auth: a server may use this CA to verify certificates presented by clients;
  2. Create the CA certificate signing request
# Create the ca-csr.json file with the following content
cat > ca-csr.json <<EOF
{
    "CN": "kubernetes",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing",
      	    "O": "k8s",
            "OU": "System"
        }
    ],
      "ca": {
    	"expiry": "87600h"
    }
}
EOF

  • "CN": Common Name. kube-apiserver extracts this field from the certificate and uses it as the requesting user name (User Name); browsers use it to decide whether a site is legitimate;
  • "O": Organization. kube-apiserver extracts this field from the certificate and uses it as the group (Group) the requesting user belongs to;
  3. Generate the CA certificate and private key
[root@k8s-master ~]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
2019/06/12 11:08:53 [INFO] generating a new CA key and certificate from CSR
2019/06/12 11:08:53 [INFO] generate received request
2019/06/12 11:08:53 [INFO] received CSR
2019/06/12 11:08:53 [INFO] generating key: rsa-2048
2019/06/12 11:08:53 [INFO] encoded CSR
2019/06/12 11:08:53 [INFO] signed certificate with serial number 708489059891717538616716772053407287945320812263
# At this point the /root/ssl directory should contain the following files.
[root@k8s-master ssl]# ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem
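Optionally, take a quick look at the CA you just created and confirm the subject and validity period are what you expect (assumes openssl is available):

openssl x509 -noout -subject -dates -in ca.pem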

  4. Create the Kubernetes certificate

Create the Kubernetes certificate signing request file server-csr.json (also referred to as kubernetes-csr.json) and add the trusted IPs to its hosts list; here the three node IPs are 172.16.4.12, 172.16.4.13 and 172.16.4.14.

$ cat > server-csr.json <<EOF
{
    "CN": "kubernetes",
    "hosts": [
      "127.0.0.1",
      "172.16.4.12",
      "172.16.4.13",
      "172.16.4.14",
      "10.10.10.1",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF
# Now generate the Kubernetes certificate and private key
[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
2019/06/12 12:00:45 [INFO] generate received request
2019/06/12 12:00:45 [INFO] received CSR
2019/06/12 12:00:45 [INFO] generating key: rsa-2048
2019/06/12 12:00:45 [INFO] encoded CSR
2019/06/12 12:00:45 [INFO] signed certificate with serial number 276381852717263457656057670704331293435930586226
2019/06/12 12:00:45 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# List the generated server.pem and server-key.pem
[root@k8s-master ssl]# ls server*
server.csr  server-csr.json  server-key.pem  server.pem

  • If the hosts field is not empty it must list every IP or domain name that is authorized to use the certificate. Because this certificate is later used by both the etcd cluster and the kubernetes master, the list above includes the host IPs of the etcd cluster and of the kubernetes master as well as the kubernetes service IP (normally the first IP of the service-cluster-ip-range passed to kube-apiserver, here 10.10.10.1).
  • This is a minimal kubernetes installation, consisting of a private image registry and a three-node kubernetes cluster; the physical node IPs above may also be replaced by hostnames.
  5. Create the admin certificate

Create the admin certificate signing request file admin-csr.json:

cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF

  • kube-apiserver later uses RBAC to authorize requests from clients (such as kubelet, kube-proxy and Pods);
  • kube-apiserver predefines some RoleBindings used by RBAC; for example, cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call every kube-apiserver API;
  • O sets the certificate's Group to system:masters. When kubectl uses this certificate to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the certificate's group is the pre-authorized system:masters it is granted access to all APIs;

Note: this admin certificate is later used to generate the administrator's kubeconfig file. RBAC is now the recommended way to control roles and permissions in kubernetes; kubernetes takes the certificate's CN field as the User and the O field as the Group (see the "X509 Client Certs" section of the article on users, authentication and authorization in Kubernetes).

Generate the admin certificate and private key

[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
2019/06/12 14:52:32 [INFO] generate received request
2019/06/12 14:52:32 [INFO] received CSR
2019/06/12 14:52:32 [INFO] generating key: rsa-2048
2019/06/12 14:52:33 [INFO] encoded CSR
2019/06/12 14:52:33 [INFO] signed certificate with serial number 491769057064087302830652582150890184354925110925
2019/06/12 14:52:33 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# List the generated certificate and private key
[root@k8s-master ssl]# ls admin*
admin.csr  admin-csr.json  admin-key.pem  admin.pem

  6. Create the kube-proxy certificate

Create the kube-proxy certificate signing request file kube-proxy-csr.json, so that kube-proxy can present a client certificate when it accesses the cluster:

cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

  • CN sets the certificate's User to system:kube-proxy;
  • the RoleBinding system:node-proxier predefined by kube-apiserver binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the Proxy-related kube-apiserver APIs;

Generate the kube-proxy client certificate and private key

[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy && ls kube-proxy*
2019/06/12 14:58:09 [INFO] generate received request
2019/06/12 14:58:09 [INFO] received CSR
2019/06/12 14:58:09 [INFO] generating key: rsa-2048
2019/06/12 14:58:09 [INFO] encoded CSR
2019/06/12 14:58:09 [INFO] signed certificate with serial number 175491367066700423717230199623384101585104107636
2019/06/12 14:58:09 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
kube-proxy.csr  kube-proxy-csr.json  kube-proxy-key.pem  kube-proxy.pem

  7. Verify the certificates

Using the server certificate as an example.

With the openssl command

[root@k8s-master ssl]# openssl x509  -noout -text -in  server.pem
......
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=CN, ST=Beijing, L=Beijing, O=k8s, OU=System, CN=kubernetes
        Validity
            Not Before: Jun 12 03:56:00 2019 GMT
            Not After : Jun  9 03:56:00 2029 GMT
        Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
        ......
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier: 
                E9:99:37:41:CC:E9:BA:9A:9F:E6:DE:4A:3E:9F:8B:26:F7:4E:8F:4F
            X509v3 Authority Key Identifier: 
                keyid:CB:97:D5:C3:5F:8A:EB:B5:A8:9D:39:DE:5F:4F:E0:10:8E:4C:DE:A2

            X509v3 Subject Alternative Name: 
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:172.16.4.12, IP Address:172.16.4.13, IP Address:172.16.4.14, IP Address:10.10.10.1
    ......

  • confirm that the Issuer fields match ca-csr.json;
  • confirm that the Subject fields match server-csr.json;
  • confirm that the X509v3 Subject Alternative Name entries match server-csr.json;
  • confirm that the X509v3 Key Usage and Extended Key Usage fields match the kubernetes profile in ca-config.json;

With the cfssl-certinfo command

[root@k8s-master ssl]# cfssl-certinfo -cert server.pem
{
  "subject": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "issuer": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "Beijing",
    "province": "Beijing",
    "names": [
      "CN",
      "Beijing",
      "Beijing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "serial_number": "276381852717263457656057670704331293435930586226",
  "sans": [
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local",
    "127.0.0.1",
    "172.16.4.12",
    "172.16.4.13",
    "172.16.4.14",
    "10.10.10.1"
  ],
  "not_before": "2019-06-12T03:56:00Z",
  "not_after": "2029-06-09T03:56:00Z",
  "sigalg": "SHA256WithRSA",
  ......
}
  8. Distribute the certificates

Copy the generated certificates and private keys (the .pem files) into /etc/kubernetes/ssl on every machine for later use;

[root@k8s-master ssl]# mkdir -p /etc/kubernetes/ssl
[root@k8s-master ssl]# cp *.pem /etc/kubernetes/ssl
[root@k8s-master ssl]# ls /etc/kubernetes/ssl/
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  server-key.pem  server.pem
# Keep only the pem files and delete the rest (optional, can be skipped)
ls | grep -v pem |xargs -i rm {}
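As an extra sanity check you can confirm that each certificate matches its private key; the two digests must be identical (shown here for server.pem, assuming openssl is installed):

openssl x509 -noout -modulus -in /etc/kubernetes/ssl/server.pem | openssl md5
openssl rsa  -noout -modulus -in /etc/kubernetes/ssl/server-key.pem | openssl md5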

Create the kubeconfig files

Run the following commands on the master node. No working directory is prescribed, so they default to the user's home directory (/root for the root user).

Download kubectl

Note: download the package that matches your Kubernetes version.

# If the site below is unreachable, fetch the archive from the Baidu Cloud mirror instead:
wget https://dl.k8s.io/v1.14.3/kubernetes-client-linux-amd64.tar.gz
tar -xzvf kubernetes-client-linux-amd64.tar.gz
cp kubernetes/client/bin/kube* /usr/bin/
chmod a+x /usr/bin/kube*
Create the kubectl kubeconfig file
# 172.16.4.12 is the master node's IP; change it to match your environment.
# The kubeconfig needs the HTTPS endpoint of the Kubernetes API server
export KUBE_APISERVER="https://172.16.4.12:6443"
# Set cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER}
# Set client authentication parameters
kubectl config set-credentials admin \
  --client-certificate=/etc/kubernetes/ssl/admin.pem \
  --embed-certs=true \
  --client-key=/etc/kubernetes/ssl/admin-key.pem
# Set context parameters
kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin
# Set the default context
kubectl config use-context kubernetes
  • the O field of the admin.pem certificate is system:masters; the RoleBinding cluster-admin predefined by kube-apiserver binds the Group system:masters to the Role cluster-admin, which grants permission to call every kube-apiserver API;
  • the generated kubeconfig is saved to the ~/.kube/config file;

Note: the ~/.kube/config file carries the highest privileges on this cluster; keep it safe.
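You can inspect the result at any time; the certificate data is redacted in the output:

kubectl config view
kubectl config current-context
# kubernetes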

Processes that run on the Node machines, such as kubelet and kube-proxy, must authenticate and be authorized when they communicate with the kube-apiserver process on the master;

The following steps only need to be performed on the master node; the generated *.kubeconfig files can simply be copied into the /etc/kubernetes directory on the node machines.

Create the TLS Bootstrapping token

Token auth file

The token can be any string containing 128 bits of entropy; generate it with a cryptographically secure random number generator.

export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF

The last three lines form a single command; simply copy the snippet above and run it.

Note: before continuing, check the token.csv file and confirm that the ${BOOTSTRAP_TOKEN} variable has been replaced by the real value.

BOOTSTRAP_TOKEN is written both into the token.csv file used by kube-apiserver and into the bootstrap.kubeconfig file used by kubelet. If you regenerate BOOTSTRAP_TOKEN later, you must:

  1. update token.csv and distribute it to /etc/kubernetes/ on every machine (master and nodes); distributing it to the nodes is not strictly required;
  2. regenerate the bootstrap.kubeconfig file and distribute it to /etc/kubernetes/ on every node;
  3. restart the kube-apiserver and kubelet processes;
  4. approve the kubelet CSR requests again;
cp token.csv /etc/kubernetes/
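To confirm the substitution, inspect the copied file; the token shown below is only an example value:

cat /etc/kubernetes/token.csv
# f6791ec1d5dbdd1583a9abcf0d7f73a5,kubelet-bootstrap,10001,"system:kubelet-bootstrap"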

Create the kubelet bootstrapping kubeconfig file

kubectl must already be installed before running the commands below

# Before starting, you can optionally install bash completion for kubectl.
yum install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)

# Change into the working directory /etc/kubernetes.
cd /etc/kubernetes
export KUBE_APISERVER="https://172.16.4.12:6443"

# Set cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig

# Set client authentication parameters
kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=bootstrap.kubeconfig

# Set context parameters
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig

# Set the default context
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig

  • setting --embed-certs to true embeds the certificate-authority certificate into the generated bootstrap.kubeconfig file;
  • no client key or certificate is specified in the client authentication parameters; they are issued later through kube-apiserver (via the TLS bootstrap flow);
Create the kube-proxy kubeconfig file
export KUBE_APISERVER="https://172.16.4.12:6443"
# Set cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
# Set client authentication parameters
kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
  --client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
# Set context parameters
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
# Set the default context
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

  • --embed-certs is true for both the cluster parameters and the client authentication parameters, which embeds the contents of the certificate-authority, client-certificate and client-key files into the generated kube-proxy.kubeconfig file;
  • the CN of the kube-proxy.pem certificate is system:kube-proxy; the RoleBinding system:node-proxier predefined by kube-apiserver binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the Proxy-related kube-apiserver APIs;
Distribute the kubeconfig files

Distribute the two kubeconfig files to the /etc/kubernetes/ directory of every Node machine:

# Set up passwordless SSH to the other nodes: first generate a key pair (press Enter three times).
ssh-keygen
# Check the generated key pair
ls /root/.ssh/
id_rsa  id_rsa.pub
# Copy the public key to node1 and node2
ssh-copy-id root@172.16.4.13
# Enter the node user's password when prompted. Add node2 in the same way.
# Copy the kubeconfig files to /etc/kubernetes on the nodes; create the directory on each node beforehand.
scp bootstrap.kubeconfig kube-proxy.kubeconfig root@172.16.4.13:/etc/kubernetes
scp bootstrap.kubeconfig kube-proxy.kubeconfig root@172.16.4.14:/etc/kubernetes
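A quick check that the files really arrived on a node:

ssh root@172.16.4.13 ls /etc/kubernetes
# bootstrap.kubeconfig  kube-proxy.kubeconfig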

Create the etcd HA cluster

As the primary datastore of the k8s cluster, etcd must be installed and started before the other k8s services. Kubernetes stores all of its data in etcd. This section describes how to deploy a three-node highly available etcd cluster; the three members reuse the kubernetes machines and are named k8s-master, k8s-node1 and k8s-node2.

  Role         IP
  k8s-master   172.16.4.12
  k8s-node1    172.16.4.13
  k8s-node2    172.16.4.14
TLS certificate files

The etcd cluster needs TLS certificates for its encrypted communication; the kubernetes certificates created earlier are reused here:

# Copy ca.pem, server-key.pem and server.pem from /root/ssl to /etc/kubernetes/ssl
cp ca.pem server-key.pem server.pem /etc/kubernetes/ssl

  • the hosts field of the kubernetes certificate already lists the IPs of the three machines above; otherwise certificate validation will fail later;
Download the binaries

The newest binary release at the time of writing is etcd-v3.3.13; the latest version can be downloaded from https://github.com/coreos/etcd/releases.

wget https://github.com/etcd-io/etcd/releases/download/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz
tar zxvf etcd-v3.3.13-linux-amd64.tar.gz
mv etcd-v3.3.13-linux-amd64/etcd* /usr/local/bin

Alternatively, install it with yum:

yum install etcd

Note: with a yum install the etcd binary is placed in /usr/bin by default; remember to change the ExecStart path in the etcd.service file below to /usr/bin/etcd.

Create the etcd data directory
mkdir -p /var/lib/etcd/default.etcd
Create the etcd systemd unit file

Create the file etcd.service in /usr/lib/systemd/system/ with the following content. Remember to replace the IP addresses with those of your own etcd cluster.

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
  --name ${ETCD_NAME} \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  --peer-cert-file=/etc/kubernetes/ssl/server.pem \
  --peer-key-file=/etc/kubernetes/ssl/server-key.pem \
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
  --listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
  --listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
  --advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
  --initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
  --initial-cluster etcd-master=https://172.16.4.12:2380,etcd-node1=https://172.16.4.13:2380,etcd-node2=https://172.16.4.14:2380 \
  --initial-cluster-state=new \
  --data-dir=${ETCD_DATA_DIR}
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
  • etcd's working directory and data directory are both /var/lib/etcd; this directory must exist before the service is started, otherwise startup fails with "Failed at step CHDIR spawning /usr/bin/etcd: No such file or directory";
  • to secure the communication, specify etcd's own key pair (cert-file and key-file), the key pair and CA certificate for peer communication (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the CA certificate used to verify clients (trusted-ca-file);
  • the hosts field of the server-csr.json used to create server.pem contains all etcd node IPs; otherwise certificate validation fails;
  • when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
Create the etcd environment file /etc/etcd/etcd.conf
mkdir -p /etc/etcd
touch /etc/etcd/etcd.conf

Write the following content into it:

# [member]
ETCD_NAME=etcd-master
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://172.16.4.12:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.4.12:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.4.12:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.4.12:2379"


This is the configuration for the 172.16.4.12 node. For the other two etcd nodes, change the IP addresses to the node's own address and change ETCD_NAME to etcd-node1 and etcd-node2 respectively.
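Instead of editing by hand, a small sketch that derives the node1 file from the master copy with sed (adjust the names and IPs to your environment):

sed -e 's/etcd-master/etcd-node1/' -e 's/172.16.4.12/172.16.4.13/g' /etc/etcd/etcd.conf > etcd.conf.node1
scp etcd.conf.node1 root@172.16.4.13:/etc/etcd/etcd.conf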

Deploy etcd on the node machines
# 1. Copy the TLS files from the master to each node. Create /etc/kubernetes/ssl on the nodes beforehand.
scp /etc/kubernetes/ssl/*.pem root@172.16.4.13:/etc/kubernetes/ssl/
scp /etc/kubernetes/ssl/*.pem root@172.16.4.14:/etc/kubernetes/ssl/

# 2. Copy the etcd and etcdctl binaries from the master directly to each node.
scp /usr/local/bin/etcd* root@172.16.4.13:/usr/local/bin/
scp /usr/local/bin/etcd* root@172.16.4.14:/usr/local/bin/

# 3. Upload the etcd configuration file to each node. Create the /etc/etcd directory on each node beforehand.
scp /etc/etcd/etcd.conf root@172.16.4.13:/etc/etcd/
scp /etc/etcd/etcd.conf root@172.16.4.14:/etc/etcd/
# 4. Adjust the parameters in /etc/etcd/etcd.conf; k8s-node1 (IP 172.16.4.13) is shown as an example:
# [member]
ETCD_NAME=etcd-node1
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://172.16.4.13:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.4.13:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.4.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.4.13:2379"
# Only ETCD_NAME and the IP addresses need to be changed to the node's own values. Adjust node2's file in the same way.

# 5. Upload the /usr/lib/systemd/system/etcd.service unit file to each node.
scp /usr/lib/systemd/system/etcd.service root@172.16.4.13:/usr/lib/systemd/system/
scp /usr/lib/systemd/system/etcd.service root@172.16.4.14:/usr/lib/systemd/system/

Start the service
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd

Repeat the steps above on each etcd node until the etcd service is running on every machine.

Note: if the logs show connection errors, check that ports 2379 and 2380 are open in the firewall on every node. For CentOS 7:

firewall-cmd --zone=public --add-port=2380/tcp --permanent
firewall-cmd --zone=public --add-port=2379/tcp --permanent
firewall-cmd --reload

Verify the service

Run the following command on any one of the etcd machines:

[root@k8s-master ~]# etcdctl \
> --ca-file=/etc/kubernetes/ssl/ca.pem \
> --cert-file=/etc/kubernetes/ssl/server.pem \
> --key-file=/etc/kubernetes/ssl/server-key.pem \
> cluster-health
member 287080ba42f94faf is healthy: got healthy result from https://172.16.4.13:2379
member 47e558f4adb3f7b4 is healthy: got healthy result from https://172.16.4.12:2379
member e531bd3c75e44025 is healthy: got healthy result from https://172.16.4.14:2379
cluster is healthy

The final line, cluster is healthy, indicates that the cluster is working properly.
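You can also list the members and see which one is currently the leader (etcdctl v2 syntax, matching the command above):

etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  member list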

Deploy the master node

The kubernetes master node runs the following components:

  • kube-apiserver
  • kube-scheduler
  • kube-controller-manager

For now these three components are deployed on the same machine.

  • kube-scheduler, kube-controller-manager and kube-apiserver are tightly coupled;
  • only one kube-scheduler and one kube-controller-manager process can be active at a time; if several are running, a leader must be elected among them;

TLS certificate files

The pem files below were created in the "Create the TLS certificates and keys" step, and token.csv was created in "Create the kubeconfig files". Double-check that they are all present:

[root@k8s-master ~]# ls /etc/kubernetes/ssl/
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  server-key.pem  server.pem

Download the latest binaries

Download the client and server tarballs from https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md. The server tarball kubernetes-server-linux-amd64.tar.gz already contains the client (kubectl) binary, so there is no need to download kubernetes-client-linux-amd64.tar.gz separately.

wget https://dl.k8s.io/v1.14.3/kubernetes-server-linux-amd64.tar.gz
# If the official site is unreachable, the archive can also be fetched from the Baidu Cloud share: https://pan.baidu.com/s/1G6e981Q48mMVWD9Ho_j-7Q (code: uvc1).
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes
tar -xzvf  kubernetes-src.tar.gz

Copy the binaries to the target path

[root@k8s-master kubernetes]# cp -r server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /usr/local/bin/

Configure and start kube-apiserver

(1) Create the service unit file for kube-apiserver

Content of the unit file /usr/lib/systemd/system/kube-apiserver.service:

[Unit]
Description=Kubernetes API Service
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart=/usr/local/bin/kube-apiserver \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_ETCD_SERVERS \
        $KUBE_API_ADDRESS \
        $KUBE_API_PORT \
        $KUBELET_PORT \
        $KUBE_ALLOW_PRIV \
        $KUBE_SERVICE_ADDRESSES \
        $KUBE_ADMISSION_CONTROL \
        $KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) Create the /etc/kubernetes/config file with the following content:

###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
#   kube-apiserver.service
#   kube-controller-manager.service
#   kube-scheduler.service
#   kubelet.service
#   kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"

# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=true"

# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://172.16.4.12:8080"

This configuration file is shared by kube-apiserver, kube-controller-manager, kube-scheduler, kubelet and kube-proxy.

The apiserver configuration file /etc/kubernetes/apiserver contains:

###
### kubernetes system config
###
### The following values are used to configure the kube-apiserver
###
##
### The address on the local server to listen to.
KUBE_API_ADDRESS="--advertise-address=172.16.4.12 --bind-address=172.16.4.12 --insecure-bind-address=172.16.4.12"
##
### The port on the local server to listen on.
##KUBE_API_PORT="--port=8080"
##
### Port minions listen on
##KUBELET_PORT="--kubelet-port=10250"
##
### Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"
##
### Address range to use for services
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.10.10.0/24"
##
### default admission control policies
KUBE_ADMISSION_CONTROL="--admission-control=ServiceAccount,NamespaceLifecycle,NamespaceExists,LimitRanger,ResourceQuota"
##
### Add your own!
KUBE_API_ARGS="--authorization-mode=RBAC \
--runtime-config=rbac.authorization.k8s.io/v1beta1 \
--kubelet-https=true \
--enable-bootstrap-token-auth \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-50000 \
--tls-cert-file=/etc/kubernetes/ssl/server.pem \
--tls-private-key-file=/etc/kubernetes/ssl/server-key.pem \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--etcd-cafile=/etc/kubernetes/ssl/ca.pem \
--etcd-certfile=/etc/kubernetes/ssl/server.pem \
--etcd-keyfile=/etc/kubernetes/ssl/server-key.pem \
--enable-swagger-ui=true \
--apiserver-count=3 \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/lib/audit.log \
--event-ttl=1h"

  • if you change --service-cluster-ip-range later, you must delete the kubernetes service in the default namespace with kubectl delete service kubernetes; the system then recreates it with an IP from the new range. Otherwise the apiserver logs the error "the cluster IP x.x.x.x for service kubernetes/default is not within the service CIDR x.x.x.x/24; please recreate";
  • --authorization-mode=RBAC enables RBAC authorization on the secure port and rejects unauthorized requests;
  • kube-scheduler and kube-controller-manager are normally deployed on the same machine as kube-apiserver and talk to it over the insecure port;
  • kubelet, kube-proxy and kubectl are deployed on other nodes; when they access kube-apiserver through the secure port they must first pass TLS certificate authentication and then RBAC authorization;
  • kube-proxy and kubectl obtain their RBAC permissions through the User and Group encoded in the certificates they present;
  • if the kubelet TLS Bootstrap mechanism is used, do not set --kubelet-certificate-authority, --kubelet-client-certificate or --kubelet-client-key; otherwise kube-apiserver later fails to validate kubelet certificates with "x509: certificate signed by unknown authority";
  • --admission-control must include ServiceAccount;
  • --bind-address must not be 127.0.0.1;
  • runtime-config is set to rbac.authorization.k8s.io/v1beta1, the apiVersion enabled at runtime;
  • --service-cluster-ip-range sets the Service Cluster IP range; this range must not be routable;
  • by default kubernetes objects are stored under the /registry prefix in etcd, which can be changed with --etcd-prefix;
  • to expose an unauthenticated HTTP endpoint, add --insecure-port=8080 --insecure-bind-address=127.0.0.1; in production, never bind it to anything other than 127.0.0.1.

Note: see kube-apiserver.service for a complete unit; the parameters can be adjusted to the needs of your cluster.

(3) Start kube-apiserver

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl start kube-apiserver
systemctl status kube-apiserver
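A quick liveness check: because this configuration leaves the insecure port enabled and bound to the master IP, the simplest probe is plain HTTP on port 8080 (otherwise use the secure port with a client certificate):

curl http://172.16.4.12:8080/healthz
# ok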

Configure and start kube-controller-manager

(1) Create the service unit file for kube-controller-manager

File path: /usr/lib/systemd/system/kube-controller-manager.service

[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart=/usr/local/bin/kube-controller-manager \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) Configuration file /etc/kubernetes/controller-manager

###
# The following values are used to configure the kubernetes controller-manager

# defaults from config and apiserver should be adequate

# Add your own!
KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 \
--service-cluster-ip-range=10.10.10.0/24 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  \
--service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \
--root-ca-file=/etc/kubernetes/ssl/ca.pem \
--leader-elect=true"


  • --service-cluster-ip-range sets the CIDR range for Services in the cluster; it must not be routable between the Nodes and must match the value passed to kube-apiserver;
  • the certificate and key given by --cluster-signing-* are used to sign the certificates and keys created for TLS BootStrap;
  • --root-ca-file is used to verify the kube-apiserver certificate; only when it is set is this CA certificate also placed into the ServiceAccount of Pod containers;
  • --address must be 127.0.0.1, because kube-apiserver expects scheduler and controller-manager to run on the same machine;

(3) Start kube-controller-manager

systemctl daemon-reload
systemctl enable kube-controller-manager
systemctl start kube-controller-manager
systemctl status kube-controller-manager

After each component is started, you can check the component status with kubectl get cs:

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Healthy     ok                                                                                          
etcd-0               Healthy     {"health":"true"}                                                                           
etcd-2               Healthy     {"health":"true"}                                                                           
etcd-1               Healthy     {"health":"true"}  

Configure and start kube-scheduler

(1) Create the service unit file for kube-scheduler

File path: /usr/lib/systemd/system/kube-scheduler.service

[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart=/usr/local/bin/kube-scheduler \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBE_MASTER \
            $KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) Configuration file /etc/kubernetes/scheduler

###
# kubernetes scheduler config

# default config should be adequate

# Add your own!
KUBE_SCHEDULER_ARGS="--leader-elect=true --address=127.0.0.1"
  • --address must be 127.0.0.1, because the current kube-apiserver expects scheduler and controller-manager to be on the same machine;

Note: see kube-scheduler.service for a complete unit; parameters can be added to match your cluster.

(3) Start kube-scheduler

systemctl daemon-reload
systemctl enable kube-scheduler
systemctl start kube-scheduler
systemctl status kube-scheduler

Verify the master node
[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok                  
scheduler            Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"} 
# The ERROR column no longer shows any error.

Install the flannel network plugin

Every node needs a network plugin so that all Pods can join the same flat network; this section serves as a reference for installing the flannel plugin.

Installing flanneld directly with yum is the easiest route unless you need a specific version; yum installs flannel 0.7.1 by default.

(1) Install flannel

# Check the flannel version offered by yum; as shown below it is 0.7.1. A newer release is preferable, so here we download 0.11.0 instead.
[root@k8s-master ~]# yum list flannel --showduplicates | sort -r
 * updates: mirror.lzu.edu.cn
Loading mirror speeds from cached hostfile
Loaded plugins: fastestmirror
flannel.x86_64                        0.7.1-4.el7                         extras
 * extras: mirror.lzu.edu.cn
 * base: mirror.lzu.edu.cn
Available Packages

[root@k8s-master ~]# wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz

# Unpack the archive; it contains two executables, flanneld and mk-docker-opts.sh.
[root@k8s-master ~]# tar zxvf flannel-v0.11.0-linux-amd64.tar.gz
flanneld
mk-docker-opts.sh
README.md
# Copy the two executables to node1 and node2
[root@k8s-master ~]# scp flanneld root@172.16.4.13:/usr/bin/ 
flanneld                                                                                                           100%   34MB  62.9MB/s   00:00    
[root@k8s-master ~]# scp flanneld root@172.16.4.14:/usr/bin/ 
flanneld                                                                                                           100%   34MB 121.0MB/s   00:00    
[root@k8s-master ~]# scp mk-docker-opts.sh root@172.16.4.13:/usr/libexec/flannel
mk-docker-opts.sh                                                                                                      100% 2139     1.2MB/s   00:00  
[root@k8s-master ~]# scp mk-docker-opts.sh root@172.16.4.14:/usr/libexec/flannel
mk-docker-opts.sh                                                                                                      100% 2139     1.1MB/s   00:00  

  • Note: the target directories for flanneld (/usr/bin) and mk-docker-opts.sh (/usr/libexec/flannel) must be created on the nodes beforehand.

(2) The /etc/sysconfig/flanneld configuration file:

# Flanneld configuration options  

# etcd url location.  Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"

# etcd config key.  This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/kube-centos/network"

# Any additional options that you want to pass
FLANNEL_OPTIONS="-etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/server.pem -etcd-keyfile=/etc/kubernetes/ssl/server-key.pem"

(3) Create the service unit file /usr/lib/systemd/system/flanneld.service

[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld --ip-masq \
  -etcd-endpoints=${FLANNEL_ETCD_ENDPOINTS} \
  -etcd-prefix=${FLANNEL_ETCD_PREFIX} \
  $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service

  • Note: on hosts with multiple network interfaces (for example a vagrant environment), add the interface that provides the external route to FLANNEL_OPTIONS, e.g. -iface=eth1.

(4) Create the network configuration in etcd

Run the command below to allocate the IP range that docker will use.

etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379 \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  mkdir /kube-centos/network

[root@k8s-master network]# etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379   --ca-file=/etc/kubernetes/ssl/ca.pem   --cert-file=/etc/kubernetes/ssl/server.pem   --key-file=/etc/kubernetes/ssl/server-key.pem   mk /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'

[root@k8s-master network]# etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379   --ca-file=/etc/kubernetes/ssl/ca.pem   --cert-file=/etc/kubernetes/ssl/server.pem   --key-file=/etc/kubernetes/ssl/server-key.pem   set /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'
{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}

(5) Start flannel

systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

Querying etcd now shows the following:
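The queries below use an ETCD_ENDPOINTS variable that has not been defined explicitly so far; it is assumed to contain the three client URLs:

export ETCD_ENDPOINTS="https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"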

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   ls /kube-centos/network/subnets
/kube-centos/network/subnets/172.30.20.0-24
/kube-centos/network/subnets/172.30.69.0-24
/kube-centos/network/subnets/172.30.53.0-24

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/config
{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/subnets/172.30.20.0-24
{"PublicIP":"172.16.4.13","BackendType":"vxlan","BackendData":{"VtepMAC":"5e:ef:ff:37:0a:d2"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/ssl/ca.pem \
    --cert-file=/etc/kubernetes/ssl/server.pem \
    --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/subnets/172.30.53.0-24
{"PublicIP":"172.16.4.12","BackendType":"vxlan","BackendData":{"VtepMAC":"e2:e6:b9:23:79:a2"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
>     --ca-file=/etc/kubernetes/ssl/ca.pem \
>     --cert-file=/etc/kubernetes/ssl/server.pem \
>     --key-file=/etc/kubernetes/ssl/server-key.pem \
>    get /kube-centos/network/subnets/172.30.69.0-24
{"PublicIP":"172.16.4.14","BackendType":"vxlan","BackendData":{"VtepMAC":"06:0e:58:69:a0:41"}}

Some other details can be inspected as well:

# 1. The flannel interface is now visible
[root@k8s-master ~]# ifconfig
.......

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.53.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::e0e6:b9ff:fe23:79a2  prefixlen 64  scopeid 0x20<link>
        ether e2:e6:b9:23:79:a2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

.......
# 2. The file containing the subnet allocated to docker.
[root@k8s-master ~]# cat /run/flannel/docker
DOCKER_OPT_BIP="--bip=172.30.53.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.30.53.1/24 --ip-masq=false --mtu=1450"

(6) Point docker at the flannel network

# Edit the ExecStart line of /usr/lib/systemd/system/docker.service so that it includes the $DOCKER_NETWORK_OPTIONS variable from above; the full docker.service is shown below.

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify

# add by gzr
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd  $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

# Restart docker so the configuration takes effect.
[root@k8s-master ~]# systemctl daemon-reload && systemctl restart docker.service
# Looking at the docker and flannel interfaces again shows that they are now on the same subnet
[root@k8s-master ~]# ifconfig 
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.30.53.1  netmask 255.255.255.0  broadcast 172.30.53.255
        ether 02:42:1e:aa:8b:0f  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

......

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.53.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::e0e6:b9ff:fe23:79a2  prefixlen 64  scopeid 0x20<link>
        ether e2:e6:b9:23:79:a2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

.....
# The same change is applied on the other nodes; node1 is shown as an example.
[root@k8s-node1 ~]# vim /usr/lib/systemd/system/docker.service 
[root@k8s-node1 ~]# systemctl daemon-reload && systemctl restart docker
[root@k8s-node1 ~]# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.20.1  netmask 255.255.255.0  broadcast 172.30.20.255
        inet6 fe80::42:23ff:fe7f:6a70  prefixlen 64  scopeid 0x20<link>
        ether 02:42:23:7f:6a:70  txqueuelen 0  (Ethernet)
        RX packets 18  bytes 2244 (2.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 48  bytes 3469 (3.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

......

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.20.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::5cef:ffff:fe37:ad2  prefixlen 64  scopeid 0x20<link>
        ether 5e:ef:ff:37:0a:d2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

......

veth82301fa: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::6855:cfff:fe99:5143  prefixlen 64  scopeid 0x20<link>
        ether 6a:55:cf:99:51:43  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7  bytes 586 (586.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


Deploy the node machines

# Distribute the flanneld.service file from the master to each node.
scp /usr/lib/systemd/system/flanneld.service root@172.16.4.13:/usr/lib/systemd/system
scp /usr/lib/systemd/system/flanneld.service root@172.16.4.14:/usr/lib/systemd/system
# Restart flanneld on each node
systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

Configure Docker

However you installed flannel, adding the following lines to /usr/lib/systemd/system/docker.service makes sure nothing is missed.

# Lines to add
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# The final, complete docker.service file looks like this
[root@k8s-master ~]# cat /usr/lib/systemd/system/docker.service 
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify

# add by gzr
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

(2) Start docker
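The restart sequence is the same as on the master; run it on every node after editing docker.service:

systemctl daemon-reload
systemctl enable docker
systemctl restart docker
systemctl status docker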

Install and configure kubelet

(1) Check that swap is disabled

[root@k8s-master ~]# free
              total        used        free      shared  buff/cache   available
Mem:       32753848      730892    27176072      377880     4846884    31116660
Swap:             0           0           0

  • Alternatively, edit /etc/fstab and comment out the swap entry.

When kubelet starts it sends a TLS bootstrapping request to kube-apiserver, so the kubelet-bootstrap user from the bootstrap token file must first be bound to the cluster role system:node-bootstrapper; only then is kubelet allowed to create certificate signing requests:

(2) Copy the kubelet and kube-proxy binaries from /usr/local/bin on the master to each node

[root@k8s-master ~]# scp /usr/local/bin/kubelet root@172.16.4.13:/usr/local/bin/ 
[root@k8s-master ~]# scp /usr/local/bin/kubelet root@172.16.4.14:/usr/local/bin/ 


(3) Create the role binding on the master node.

# Create the role binding that grants the bootstrap permission on the master, then restart the kubelet service on the node machines
[root@k8s-master kubernetes]# kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created

(4) Create the kubelet service

Option 1: create and run a script on the node

# Create the kubelet configuration file and the kubelet.service unit in one step with the kubelet.sh script below.
#!/bin/bash

NODE_ADDRESS=${1:-"172.16.4.13"}
DNS_SERVER_IP=${2:-"10.10.10.2"}

cat <<EOF >/etc/kubernetes/kubelet

KUBELET_ARGS="--logtostderr=true \\
--v=4 \\
--address=${NODE_ADDRESS} \\
--hostname-override=${NODE_ADDRESS} \\
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
--api-servers=172.16.4.12 \\
--cert-dir=/etc/kubernetes/ssl \\
--allow-privileged=true \\
--cluster-dns=${DNS_SERVER_IP} \\
--cluster-domain=cluster.local \\
--fail-swap-on=false \\
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0"

EOF

cat <<EOF >/usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \$KUBELET_ARGS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable kubelet
systemctl restart kubelet && systemctl status kubelet

2) Run the script

chmod +x kubelet.sh
./kubelet.sh 172.16.4.14 10.10.10.2
# You can also inspect the generated kubelet.service file
[root@k8s-node2 ~]# cat /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target

  • Note: on node1, run kubelet.sh with 172.16.4.13 (node1's IP) and 10.10.10.2 (the DNS server IP); substitute the appropriate values when running the script on the other nodes.

Alternatively, use the second approach

1) Create the kubelet configuration file /etc/kubernetes/kubelet with the following content:

###
## kubernetes kubelet (minion) config
#
## The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=172.16.4.12"
#
## The port for the info server to serve on
#KUBELET_PORT="--port=10250"
#
## You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=172.16.4.12"
#
## location of the api-server
## COMMENT THIS ON KUBERNETES 1.8+
KUBELET_API_SERVER="--api-servers=http://172.16.4.12:8080"
#
## pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=jimmysong/pause-amd64:3.0"
#
## Add your own!
KUBELET_ARGS="--cgroup-driver=systemd \
--cluster-dns=10.10.10.2 \
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--require-kubeconfig \
--cert-dir=/etc/kubernetes/ssl \
--cluster-domain=cluster.local \
--hairpin-mode promiscuous-bridge \
--serialize-image-pulls=false"

  • if kubelet is managed by systemd, two extra flags are required: --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice;
  • --address must not be 127.0.0.1, otherwise later calls from Pods to kubelet's API fail, because 127.0.0.1 inside a Pod points to the Pod itself rather than to kubelet;
  • set --cgroup-driver to systemd rather than cgroupfs here, otherwise kubelet fails to start on this CentOS system (what matters is that docker and kubelet use the same cgroup driver; it does not have to be systemd);
  • --bootstrap-kubeconfig points to the bootstrap kubeconfig file; kubelet uses the user name and token in that file to send the TLS Bootstrapping request to kube-apiserver;
  • after the administrator approves the CSR, kubelet automatically creates the certificate and private key (kubelet-client.crt and kubelet-client.key) in the --cert-dir directory and writes them into the --kubeconfig file;
  • it is recommended to put the kube-apiserver address into the --kubeconfig file; if --api-servers is not given, --require-kubeconfig must be set so that the kube-apiserver address is read from the kubeconfig, otherwise kubelet starts without finding kube-apiserver (the log reports that the API Server was not found) and kubectl get nodes does not list the Node; --require-kubeconfig was removed in 1.10, see the corresponding PR;
  • --cluster-dns sets the Service IP of kubedns (you can pick it now and reuse the same IP when the kubedns service is created later) and --cluster-domain sets the domain suffix; both parameters must be set together for either to take effect;
  • --cluster-domain sets the search domain written into /etc/resolv.conf when a pod starts. It was originally configured as cluster.local., which resolved service DNS names correctly but failed to resolve the FQDN pod names of headless services; changing it to cluster.local (dropping the trailing dot) fixes the problem. For name and service resolution in kubernetes, see my other article.
  • the kubelet.kubeconfig file referenced by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig does not exist before kubelet is started for the first time; it is generated automatically once the CSR request has been approved (see below). If a ~/.kube/config file already exists on the node, you can copy it to that path and rename it kubelet.kubeconfig; all node nodes can share the same kubelet.kubeconfig, so newly added nodes join the cluster automatically without creating a CSR request. Likewise, on any host that can reach the cluster, kubectl --kubeconfig with the ~/.kube/config file passes authentication, because that file already carries admin credentials with full access to the cluster.
  • KUBELET_POD_INFRA_CONTAINER is the pause ("infra") image; a private registry address is used here, so replace it with your own when you deploy. You can also use Google's pause image gcr.io/google_containers/pause-amd64:3.0, which is only about 300 KB.

2) Create the kubelet service unit file

File location: /usr/lib/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBELET_API_SERVER \
            $KUBELET_ADDRESS \
            $KUBELET_PORT \
            $KUBELET_HOSTNAME \
            $KUBE_ALLOW_PRIV \
            $KUBELET_POD_INFRA_CONTAINER \
            $KUBELET_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target

Note: either of the two approaches above creates the kubelet service; the one-shot script is recommended. With the second approach, the working directory /var/lib/kubelet must be created by hand. This is not demonstrated again here.

(5) Approve kubelet's TLS certificate requests

When kubelet starts for the first time it sends a certificate signing request to kube-apiserver; only after the request has been approved does kubernetes add the Node to the cluster.

1) On the master node, list the pending CSR requests

[root@k8s-master ~]# kubectl get csr
NAME                                                   AGE    REQUESTOR           CONDITION
node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs   3h6m   kubelet-bootstrap   Pending
node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs   3h     kubelet-bootstrap   Pending

2) Approve the CSR requests

[root@k8s-master ~]# kubectl certificate approve node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs
certificatesigningrequest.certificates.k8s.io/node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs approved
[root@k8s-master ~]# kubectl certificate approve node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs
certificatesigningrequest.certificates.k8s.io/node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs approved
# After approval, the CSRs of both nodes show as approved.
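Once both CSRs have been approved, the nodes register themselves with the cluster; the names, ages and versions below are illustrative:

kubectl get nodes
# NAME          STATUS   ROLES    AGE   VERSION
# 172.16.4.13   Ready    <none>   1m    v1.14.3
# 172.16.4.14   Ready    <none>   1m    v1.14.3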

3) The kubelet kubeconfig file and key pair are generated automatically

[root@k8s-node1 ~]# ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2294 Jun 14 15:19 /etc/kubernetes/kubelet.kubeconfig
[root@k8s-node1 ~]# ls -l /etc/kubernetes/ssl/kubelet*
-rw------- 1 root root 1273 Jun 14 15:19 /etc/kubernetes/ssl/kubelet-client-2019-06-14-15-19-10.pem
lrwxrwxrwx 1 root root   58 Jun 14 15:19 /etc/kubernetes/ssl/kubelet-client-current.pem -> /etc/kubernetes/ssl/kubelet-client-2019-06-14-15-19-10.pem
-rw-r--r-- 1 root root 2177 Jun 14 11:50 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1679 Jun 14 11:50 /etc/kubernetes/ssl/kubelet.key


If you renew the kubernetes certificates but do not change token.csv, the node rejoins the cluster automatically after kubelet is restarted, without sending a new certificate request and without another kubectl certificate approve on the master. The precondition is that /etc/kubernetes/ssl/kubelet* and /etc/kubernetes/kubelet.kubeconfig on the node are not deleted; otherwise kubelet fails to start because it cannot find its certificates.

[root@k8s-master ~]# scp /etc/kubernetes/token.csv root@172.16.4.13:/etc/kubernetes/   
[root@k8s-master ~]# scp /etc/kubernetes/token.csv root@172.16.4.14:/etc/kubernetes/

Note: if kubelet reports certificate-related errors at startup, one trick that solves the problem is to copy the ~/.kube/config file from the master node (it is generated automatically in the "Install the kubectl command-line tool" step) to /etc/kubernetes/kubelet.kubeconfig on the node. The CSR flow is then no longer needed and the node joins the cluster automatically once kubelet starts. Remember to actually replace the previous contents of /etc/kubernetes/kubelet.kubeconfig with the contents of .kube/config.

[root@k8s-master ~]# cat .kube/config 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUR2akNDQXFhZ0F3SUJBZ0lVZkJtL2lzNG1EcHdqa0M0aVFFTWF5SVJaVHVjd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1pURUxNQWtHQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbAphV3BwYm1jeEREQUtCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByCmRXSmxjbTVsZEdWek1CNFhEVEU1TURZeE1qQXpNRFF3TUZvWERUSTVNRFl3T1RBek1EUXdNRm93WlRFTE1Ba0cKQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbGFXcHBibWN4RERBSwpCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByZFdKbGNtNWxkR1Z6Ck1JSUJJakFOQmdrcWhraUc5dzBCQVFFRkFBT0NBUThBTUlJQkNnS0NBUUVBOEZQK2p0ZUZseUNPVDc0ZzRmd1UKeDl0bDY3dGVabDVwTDg4ZStESzJMclBJZDRXMDRvVDdiWTdKQVlLT3dPTkM4RjA5MzNqSjVBdmxaZmppTkJCaQp2OTlhYU5tSkdxeWozMkZaaDdhTkYrb3Fab3BYdUdvdmNpcHhYTWlXbzNlVHpWVUh3d2FBeUdmTS9BQnE0WUY0ClprSVV5UkJaK29OVXduY0tNaStOR2p6WVJyc2owZEJRR0ROZUJ6OEgzbCtjd1U1WmpZdEdFUFArMmFhZ1k5bG0KbjhyOUFna2owcW9uOEdQTFlRb2RDYzliSWZqQmVNaGIzaHJGMjJqMDhzWTczNzh3MzN5VWRHdjg1YWpuUlp6UgpIYkN6UytYRGJMTTh2aGh6dVZoQmt5NXNrWXB6M0hCNGkrTnJPR1Fmdm4yWkY0ZFh4UVUyek1Dc2NMSVppdGg0Ckt3SURBUUFCbzJZd1pEQU9CZ05WSFE4QkFmOEVCQU1DQVFZd0VnWURWUjBUQVFIL0JBZ3dCZ0VCL3dJQkFqQWQKQmdOVkhRNEVGZ1FVeTVmVncxK0s2N1dvblRuZVgwL2dFSTVNM3FJd0h3WURWUjBqQkJnd0ZvQVV5NWZWdzErSwo2N1dvblRuZVgwL2dFSTVNM3FJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFOb3ZXa1ovd3pEWTZSNDlNNnpDCkhoZlZtVGk2dUZwS24wSmtvMVUzcHA5WTlTTDFMaXVvK3VwUjdJOCsvUXd2Wm95VkFWMTl4Y2hRQ25RSWhRMEgKVWtybXljS0crdWtsSUFUS3ZHenpzNW1aY0NQOGswNnBSSHdvWFhRd0ZhSFBpNnFZWDBtaW10YUc4REdzTk01RwpQeHdZZUZncXBLQU9Tb0psNmw5bXErQnhtWEoyZS8raXJMc3N1amlPKzJsdnpGOU5vU29Yd1RqUGZndXhRU3VFCnZlSS9pTXBGV1o0WnlCYWJKYkw5dXBldm53RTA2RXQrM2g2N3JKOU5mZ2N5MVhNSU0xeGo1QXpzRXgwVE5ETGkKWGlOQ0Zram9zWlA3U3dZdE5ncHNuZmhEandHRUJLbXV1S3BXR280ZWNac2lMQXgwOTNaeTdKM2dqVDF6dGlFUwpzQlE9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://172.16.4.12:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: admin
  name: kubernetes
current-context: kubernetes
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQzVENDQXNXZ0F3SUJBZ0lVVmlPdjZ6aFlHMzIzdWRZS2RFWEcvRVJENW8wd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1pURUxNQWtHQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbAphV3BwYm1jeEREQUtCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByCmRXSmxjbTVsZEdWek1CNFhEVEU1TURZeE1qQTJORGd3TUZvWERUSTVNRFl3T1RBMk5EZ3dNRm93YXpFTE1Ba0cKQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFVcHBibWN4RURBT0JnTlZCQWNUQjBKbGFVcHBibWN4RnpBVgpCZ05WQkFvVERuTjVjM1JsYlRwdFlYTjBaWEp6TVE4d0RRWURWUVFMRXdaVGVYTjBaVzB4RGpBTUJnTlZCQU1UCkJXRmtiV2x1TUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUFuL29MQVpCcENUdWUKci95eU15a1NYelBpWk9mVFdZQmEwNjR6c2Y1Y1Z0UEt2cnlCSjVHVlVSUlFUc2F3eWdFdnFBSXI3TUJrb21GOQpBeFVNaFNxdlFjNkFYemQzcjRMNW1CWGQxZ3FoWVNNR2lJL3hEMG5RaEF1azBFbVVONWY5ZENZRmNMMTVBVnZSCituN2wwaVcvVzlBRjRqbXRtYUtLVUdsUU9vNzQ3anNCYWRndU9SVHBMSkwxUGw3SlVLZnFBWktEbFVXZnpwZXcKOE1ETVMzN1FodmVQc24va2RwUVZ0bzlJZWcwSFhBcXlmZHNaZjZKeGdaS1FmUUNyYlJEMkd2L29OVVRlYnpWMwpWVm9ueEpUYmFrZFNuOHR0cCtLWFlzTUYvQy8wR29sL1JkS1Mrc0t4Z2hUUWdJMG5CZXJBM0x0dGp6WVpySWJBClo0RXBRNmc0ZFFJREFRQUJvMzh3ZlRBT0JnTlZIUThCQWY4RUJBTUNCYUF3SFFZRFZSMGxCQll3RkFZSUt3WUIKQlFVSEF3RUdDQ3NHQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd0hRWURWUjBPQkJZRUZCQThrdnFaVDhRRApaSnIvTUk2L2ZWalpLdVFkTUI4R0ExVWRJd1FZTUJhQUZNdVgxY05maXV1MXFKMDUzbDlQNEJDT1RONmlNQTBHCkNTcUdTSWIzRFFFQkN3VUFBNElCQVFDMnZzVDUwZVFjRGo3RVUwMmZQZU9DYmJ6cFZWazEzM3NteGI1OW83YUgKRDhONFgvc3dHVlYzU0V1bVNMelJYWDJSYUsyUU04OUg5ZDlpRkV2ZzIvbjY3VThZeVlYczN0TG9Ua29NbzlUZgpaM0FNN0NyM0V5cWx6OGZsM3p4cmtINnd1UFp6VWNXV29vMUJvR1VCbEM1Mi9EbFpQMkZCbHRTcWtVL21EQ3IxCnJJWkFYYjZDbXNNZG1SQzMrYWwxamVUak9MZEcwMUd6dlBZdEdsQ0p2dHRJNzBuVkR3Nkh3QUpkRVN0UUh0cWsKakpCK3NZU2NSWDg1YTlsUXVIU21DY0kyQWxZQXFkK0t2NnNKNUVFZnpwWHNUVXdya0tKbjJ0UTN2UVNLaEgyawpabUx2N0MvcWV6YnJvc3pGeHNZWEtRelZiODVIVkxBbXo2UVhYV1I2Q0ZzMAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBbi9vTEFaQnBDVHVlci95eU15a1NYelBpWk9mVFdZQmEwNjR6c2Y1Y1Z0UEt2cnlCCko1R1ZVUlJRVHNhd3lnRXZxQUlyN01Ca29tRjlBeFVNaFNxdlFjNkFYemQzcjRMNW1CWGQxZ3FoWVNNR2lJL3gKRDBuUWhBdWswRW1VTjVmOWRDWUZjTDE1QVZ2UituN2wwaVcvVzlBRjRqbXRtYUtLVUdsUU9vNzQ3anNCYWRndQpPUlRwTEpMMVBsN0pVS2ZxQVpLRGxVV2Z6cGV3OE1ETVMzN1FodmVQc24va2RwUVZ0bzlJZWcwSFhBcXlmZHNaCmY2SnhnWktRZlFDcmJSRDJHdi9vTlVUZWJ6VjNWVm9ueEpUYmFrZFNuOHR0cCtLWFlzTUYvQy8wR29sL1JkS1MKK3NLeGdoVFFnSTBuQmVyQTNMdHRqellackliQVo0RXBRNmc0ZFFJREFRQUJBb0lCQUE1cXFDZEI3bFZJckNwTAo2WHMyemxNS0IvTHorVlh0ZlVIcVJ2cFpZOVRuVFRRWEpNUitHQ2l3WGZSYmIzOGswRGloeVhlU2R2OHpMZUxqCk9MZWZleC9CRGt5R1lTRE4rdFE3MUR2L3hUOU51cjcveWNlSTdXT1k4UWRjT2lFd2IwVFNVRmN5bS84RldVenIKdHFaVGhJVXZuL2dkSG9uajNmY1ZKb2ZBYnFwNVBrLzVQd2hFSU5Pdm1FTFZFQWl6VnBWVmwxNzRCSGJBRHU1Sgp2Nm9xc0h3SUhwNC9ZbGo2NHhFVUZ1ZFA2Tkp0M1B5Uk14dW5RcWd3SWZ1bktuTklRQmZEVUswSklLK1luZmlJClgrM1lQam5sWFU3UnhYRHRFa3pVWTFSTTdVOHJndHhiNWRQWnhocGgyOFlFVnJBVW5RS2RSTWdCVVNad3hWRUYKeFZqWmVwa0NnWUVBeEtHdXExeElHNTZxL2RHeGxDODZTMlp3SkxGajdydTkrMkxEVlZsL2h1NzBIekJ6dFFyNwpMUGhUZnl2SkVqNTcwQTlDbk4ybndjVEQ2U1dqbkNDbW9ESk10Ti9iZlJaMThkZTU4b0JCRDZ5S0JGbmV1eWkwCk1oVWFmSzN5M091bGkxMjBKS3lQb2hvN1lyWUxNazc1UzVEeVRGMlEyV3JYY0VQaTlVRzNkNzhDZ1lFQTBFY3YKTUhDbE9XZ1hJUVNXNCtreFVEVXRiOFZPVnpwYjd3UWZCQ3RmSTlvTDBnVWdBd1M2U0lub2tET3ozdEl4aXdkQQpWZTVzMklHbVAzNS9qdm5FbThnaE1XbEZ3eHB5ZUxKK0hraTl1dFNPblJGWHYvMk9JdjBYbE01RlY5blBmZ01NCkMxQ09zZklKaVREaXJFOGQrR2cxV010dWxkVGo4Z0JKazRQRXZNc0NnWUJoNHA4aWZVa0VQdU9lZ1hJbWM3QlEKY3NsbTZzdjF2NDVmQTVaNytaYkxwRTd3Njl6ZUJuNXRyNTFaVklHL1RFMjBrTFEzaFB5TE1KbmFpYnM5OE44aQpKb2diRHNta0pyZEdVbjhsNG9VQStZS25rZG1ZVURZTUxJZElCQXcvd0N0a0NweXdHUnRUdGoxVDhZMzNXR3N3CkhCTVN3dzFsdnBOTE52Qlg2WVFjM3dLQmdHOHAvenJJZExjK0lsSWlJL01EREtuMXFBbW04cGhGOHJtUXBvbFEKS05oMjBhWkh5LzB3Y2NpenFxZ0VvSFZHRk9GU2Zua2U1NE5yTjNOZUxmRCt5SHdwQmVaY2ZMcVVqQkoxbWpESgp2RkpTanNld2NQaHMrWWNkTkkvY3hGQU9WZHU0L3Aydlltb0JlQ3Q4SncrMnJwVmQ4Vk15U1JTNWF1eElVUHpsCjhJU2ZBb0dBVituYjJ3UGtwOVJ0NFVpdmR0MEdtRjErQ052YzNzY3JYb3RaZkt0TkhoT0o2UTZtUkluc2tpRWgKVnFQRjZ6U1BnVmdrT1hmU0xVQ3Y2cGdWR2J5d0plRWo1SElQRHFuU25vNFErZFl2TXozcWN5d1hLbFEyUjZpcAo3VE0wWHNJaGFMRDFmWUNjaDhGVHNiZHNrQUNZUHpzeEdBa1l2TnRDcDI5WExCRmZWbkE9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==
# Distribute .kube/config to each node.
[root@k8s-master ~]# scp .kube/config root@172.16.4.13:/etc/kubernetes/ 
[root@k8s-master ~]# scp .kube/config root@172.16.4.14:/etc/kubernetes/
# For example, the config file now appears under /etc/kubernetes/ on node2.
[root@k8s-node2 ~]# ls /etc/kubernetes/
bin  bootstrap.kubeconfig  config  kubelet  kubelet.kubeconfig  kube-proxy.kubeconfig  ssl  token.csv

Configuring kube-proxy

Script-based configuration

(1) Create the kube-proxy.sh script on each node with the following content:

#!/bin/bash

NODE_ADDRESS=${1:-"172.16.4.13"}

cat <<EOF >/etc/kubernetes/kube-proxy

KUBE_PROXY_ARGS="--logtostderr=true \
--v=4 \
--hostname-override=${NODE_ADDRESS} \
--kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig"

EOF

cat <<EOF >/usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \$KUBE_PROXY_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable kube-proxy
systemctl restart kube-proxy && systemctl status kube-proxy
  • The --hostname-override value must match the one used by kubelet; otherwise kube-proxy will not find this Node after it starts and will not create any iptables rules.
  • kube-proxy uses --cluster-cidr to distinguish traffic inside the cluster from external traffic; kube-proxy only SNATs requests to Service IPs when --cluster-cidr or --masquerade-all is specified.
  • The configuration file given by --kubeconfig embeds the kube-apiserver address, user name, certificate, key, and other request/authentication information.
  • The predefined ClusterRoleBinding system:node-proxier binds User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs.

See kube-proxy.service for the complete unit file.
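As a quick sanity check after the script runs (an optional step; the KUBE-* chain names below are what kube-proxy creates in its default iptables mode):

# kube-proxy in iptables mode programs KUBE-SERVICES / KUBE-NODEPORTS chains.
iptables-save | grep -E 'KUBE-(SERVICES|NODEPORTS)' | head

# The kube-proxy logs should show the node being matched by its
# --hostname-override value instead of "node not found" errors.
journalctl -u kube-proxy --no-pager | tail -n 20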

(2) Run the script

# First copy the kube-proxy binary from the master to each node.
[root@k8s-master ~]# scp /usr/local/bin/kube-proxy root@172.16.4.13:/usr/local/bin/   
[root@k8s-master ~]# scp /usr/local/bin/kube-proxy root@172.16.4.14:/usr/local/bin/
# Then make the script executable on each node.
chmod +x kube-proxy.sh
[root@k8s-node2 ~]# ./kube-proxy.sh 172.16.4.14
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-proxy.service to /usr/lib/systemd/system/kube-proxy.service.
● kube-proxy.service - Kubernetes Proxy
   Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-06-14 16:01:47 CST; 39ms ago
 Main PID: 117068 (kube-proxy)
    Tasks: 10
   Memory: 8.8M
   CGroup: /system.slice/kube-proxy.service
           └─117068 /usr/local/bin/kube-proxy --logtostderr=true --v=4 --hostname-override=172.16.4.14 --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig

Jun 14 16:01:47 k8s-node2 systemd[1]: Started Kubernetes Proxy.

(3) The kubelet.kubeconfig file specified by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig does not exist before kubelet is started for the first time; as described below, it is generated automatically once the CSR request has been approved. If the ~/.kube/config file has already been generated on your node, you can copy it to this path and rename it kubelet.kubeconfig. All nodes can share the same kubelet.kubeconfig file, so newly added nodes join the Kubernetes cluster automatically without creating a CSR request. Likewise, on any host that can reach the Kubernetes cluster, kubectl --kubeconfig operations pass authentication as long as this ~/.kube/config file is used, because it already contains credentials identifying you as the admin user, which has full permissions on the cluster.

[root@k8s-master ~]# scp .kube/config root@172.16.4.13:/etc/kubernetes/
[root@k8s-node1 ~]# mv config kubelet.kubeconfig
[root@k8s-master ~]# scp .kube/config root@172.16.4.14:/etc/kubernetes/
[root@k8s-node2 ~]# mv config kubelet.kubeconfig

Verification
# Run the following commands on the master node.
[root@k8s-master ~]# kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
172.16.4.13   Ready    <none>   66s     v1.14.3
172.16.4.14   Ready    <none>   7m14s   v1.14.3

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"} 

# Use an nginx service to test cluster availability
[root@k8s-master ~]# kubectl run nginx --replicas=3 --labels="run=load-balancer-example" --image=nginx  --port=80
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
deployment.apps/nginx created
[root@k8s-master ~]# kubectl expose deployment nginx --type=NodePort --name=example-service
service/example-service exposed

[root@k8s-master ~]# kubectl describe svc example-service
Name:                     example-service
Namespace:                default
Labels:                   run=load-balancer-example
Annotations:              <none>
Selector:                 run=load-balancer-example
Type:                     NodePort
IP:                       10.10.10.222
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  40905/TCP
Endpoints:                172.17.0.2:80,172.17.0.2:80,172.17.0.3:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

# Access the service from a node
[root@k8s-node1 ~]# curl "10.10.10.222:80"
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

# Test access from outside the cluster
[root@k8s-master ~]# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
example-service   NodePort    10.10.10.222   <none>        80:40905/TCP   6m26s
kubernetes        ClusterIP   10.10.10.1     <none>        443/TCP        21h
# As shown above, the service is exposed externally on NodePort 40905; open 172.16.4.12:40905 to access it.
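A minimal check from any machine that can reach the node IPs (assuming NodePort 40905 from the output above): a NodePort service answers on every node's IP at the same port.

curl -s http://172.16.4.12:40905 | grep -i '<title>'
curl -s http://172.16.4.13:40905 | grep -i '<title>'
# Both should print: <title>Welcome to nginx!</title>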

Setting up and configuring the DNS service

Starting with Kubernetes v1.11, the cluster's DNS service is provided by CoreDNS. CoreDNS is a CNCF project: a high-performance, plugin-based, easily extensible DNS server written in Go. It fixes several KubeDNS problems, such as dnsmasq security vulnerabilities and externalName services not working with stubDomains settings.

Installing the CoreDNS add-on

Official YAML files: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/coredns

Before deploying CoreDNS you need to create at least three resource objects: a ConfigMap, a Deployment, and a Service. In a cluster with RBAC enabled, you can additionally create a ServiceAccount, ClusterRole, and ClusterRoleBinding to restrict the permissions of the CoreDNS container.

(1) To speed up image pulls, first switch Docker to the Aliyun registry mirror inside China

cat << EOF > /etc/docker/daemon.json
{
      "registry-mirrors":["https://registry.docker-cn.com","https://h23rao59.mirror.aliyuncs.com"]
}
EOF

# Reload the configuration and restart Docker
[root@k8s-master ~]# systemctl daemon-reload && systemctl restart docker

(2) The Service, ConfigMap, ServiceAccount, and the other objects are combined here in a single file; the content of coredns.yaml is shown below.

[root@k8s-master ~]# cat coredns.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           upstream
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-dns
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      containers:
      - args:
        - -conf
        - /etc/coredns/Corefile
        image:  docker.io/fengyunpan/coredns:1.2.6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: coredns
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          procMount: Default
          readOnlyRootFilesystem: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/coredns
          name: config-volume
          readOnly: true
      dnsPolicy: Default
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: coredns
      serviceAccountName: coredns
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: Corefile
            path: Corefile
          name: coredns
        name: config-volume

---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: KubeDNS
  name: kube-dns
  namespace: kube-system
spec:
  clusterIP: 10.10.10.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns

  • clusterIP: 10.10.10.2 is the DNS server IP for the nodes in my cluster; adjust it to match yours. In addition, add the following two flags to the kubelet startup parameters on each node (see the sketch after this list):
    • --cluster-dns=10.10.10.2: the ClusterIP of the DNS service.
    • --cluster-domain=cluster.local: the domain name configured for the DNS service.

Then restart the kubelet service.
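A minimal sketch of that change, assuming the kubelet flags live in an environment file such as /etc/kubernetes/kubelet as set up on the nodes earlier (the file path and variable name are illustrative; adapt them to your layout):

# /etc/kubernetes/kubelet (excerpt) -- append the two DNS flags to the
# existing kubelet argument string, e.g.:
#   KUBELET_ARGS="... --cluster-dns=10.10.10.2 --cluster-domain=cluster.local"

# Then, on every node, reload systemd and restart kubelet:
systemctl daemon-reload && systemctl restart kubelet

# Verify the flags took effect:
ps -ef | grep kubelet | grep -o -- '--cluster-[a-z]*=[^ ]*'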

(3) Create the CoreDNS service with kubectl create.

[root@k8s-master ~]# kubectl create -f coredns.yaml 
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.extensions/coredns created
service/kube-dns created
[root@k8s-master ~]# kubectl get all -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
pod/coredns-5fc7b65789-rqk6f   1/1     Running   0          20s

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   10.10.10.2   <none>        53/UDP,53/TCP   20s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   1/1     1            1           20s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-5fc7b65789   1         1         1       20s

(4) Verify the DNS service

Next, use a Pod that contains the nslookup tool to verify that the DNS service works properly:

  • Create busybox.yaml with the following content:

    [root@k8s-master ~]# cat busybox.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
      namespace: default
    spec:
      containers:
      - name: busybox
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/busybox
        command:
          - sleep
          - "3600"
        imagePullPolicy: IfNotPresent
      restartPolicy: Always
    
    
  • Create the Pod with kubectl apply

    [root@k8s-master ~]# kubectl apply -f busybox.yaml
    pod/busybox created
    # kubectl describe shows that busybox was created successfully
    [root@k8s-master ~]# kubectl describe po/busybox
    .......
    Events:
      Type    Reason     Age   From                  Message
      ----    ------     ----  ----                  -------
      Normal  Scheduled  4s    default-scheduler     Successfully assigned default/busybox to 172.16.4.13
      Normal  Pulling    4s    kubelet, 172.16.4.13  Pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/busybox"
      Normal  Pulled     1s    kubelet, 172.16.4.13  Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/google_containers/busybox"
      Normal  Created    1s    kubelet, 172.16.4.13  Created container busybox
      Normal  Started    1s    kubelet, 172.16.4.13  Started container busybox
    
    
  • After the container starts successfully, test DNS with kubectl exec <container_name> -- nslookup.

[root@k8s-master ~]# kubectl exec busybox -- nslookup kubernetes
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.10.10.1 kubernetes.default.svc.cluster.local

Note: if a Service belongs to a different namespace, its lookup must include the namespace name to form the full domain name. Taking the kube-dns service as an example, append its namespace "kube-system" to the service name, joined with ".", giving "kube-dns.kube-system", and the lookup succeeds:

# Failing case: namespace not specified
[root@k8s-master ~]# kubectl exec busybox -- nslookup kube-dns
nslookup: can't resolve 'kube-dns'
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

command terminated with exit code 1

# Successful case.
[root@k8s-master ~]# kubectl exec busybox -- nslookup kube-dns.kube-system
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kube-dns.kube-system
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local
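
The fully qualified name service.namespace.svc.cluster.local (using the cluster domain configured above) resolves as well, and is the form usually placed in application configuration:

kubectl exec busybox -- nslookup kube-dns.kube-system.svc.cluster.local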

Installing the dashboard add-on

kubernetes-dashboard, the Kubernetes web UI management tool, provides common cluster-management features such as application deployment, resource-object management, container log queries, and system monitoring. To show system resource usage on its pages, Metrics Server (or Heapster) must be deployed. Reference:

Official dashboard files:

https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dashboard

Because kube-apiserver has RBAC authorization enabled and the dashboard-controller.yaml in the official source tree does not define an authorized ServiceAccount, later calls to the API server's APIs would be rejected; from Kubernetes v1.8.3 onward, however, the official documentation provides a dashboard.rbac.yaml file.

(1) Create the deployment file kubernetes-dashboard.yaml with the following content:

# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# ------------------- Dashboard Secret ------------------- #

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kube-system
type: Opaque

---
# ------------------- Dashboard Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Role & Role Binding ------------------- #

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
rules:
  # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
  verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubernetes-dashboard-settings"]
  verbs: ["get", "update"]
  # Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
  resources: ["services"]
  resourceNames: ["heapster"]
  verbs: ["proxy"]
- apiGroups: [""]
  resources: ["services/proxy"]
  resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
  verbs: ["get"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Deployment ------------------- #

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
      - name: kubernetes-dashboard
        image: lizhenliang/kubernetes-dashboard-amd64:v1.10.1
        ports:
        - containerPort: 8443
          protocol: TCP
        args:
          - --auto-generate-certificates
          # Uncomment the following line to manually specify Kubernetes API server Host
          # If not specified, Dashboard will attempt to auto discover the API server and connect
          # to it. Uncomment only if the default does not work.
          # - --apiserver-host=http://my-address:port
        volumeMounts:
        - name: kubernetes-dashboard-certs
          mountPath: /certs
          # Create on-disk volume to store exec logs
        - mountPath: /tmp
          name: tmp-volume
        livenessProbe:
          httpGet:
            scheme: HTTPS
            path: /
            port: 8443
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: kubernetes-dashboard-certs
        secret:
          secretName: kubernetes-dashboard-certs
      - name: tmp-volume
        emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule

---
# ------------------- Dashboard Service ------------------- #

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard

(2) Check the creation status

[root@k8s-master ~]# kubectl get all -n kube-system | grep dashboard

pod/kubernetes-dashboard-7df98d85bd-jbwh2   1/1     Running   0          18m

service/kubernetes-dashboard   NodePort    10.10.10.91   <none>        443:41498/TCP   18m

deployment.apps/kubernetes-dashboard   1/1     1            1           18m
replicaset.apps/kubernetes-dashboard-7df98d85bd   1         1         1       18m

(3) The dashboard can now be reached through the NodePort on the nodes (41498 in the output above). Browse to https://172.16.4.13:41498 or https://172.16.4.14:41498.
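Because the dashboard serves HTTPS with a self-signed certificate, a quick reachability check from the command line needs -k (in the browser, accept the certificate warning):

curl -k -s -o /dev/null -w '%{http_code}\n' https://172.16.4.13:41498/
# 200 indicates the dashboard is answering on the NodePort.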
The service IP can also be resolved through the CoreDNS service deployed earlier:

[root@k8s-master ~]# kubectl get svc -n kube-system
NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.10.10.2    <none>        53/UDP,53/TCP   5h38m
kubernetes-dashboard   NodePort    10.10.10.91   <none>        443:41498/TCP   26m
[root@k8s-master ~]# kubectl exec busybox -- nslookup kubernetes-dashboard.kube-system
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes-dashboard.kube-system
Address 1: 10.10.10.91 kubernetes-dashboard.kube-system.svc.cluster.local

(4) Create a ServiceAccount and bind it to the cluster-admin cluster role

[root@k8s-master ~]# kubectl create serviceaccount dashboard-admin -n kube-system
serviceaccount/dashboard-admin created
[root@k8s-master ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
# List the secret generated for the new service account
[root@k8s-master ~]# kubectl get secret -n kube-system | grep admin
dashboard-admin-token-69zsx        kubernetes.io/service-account-token   3      65s
# Show the details of the generated token and copy its value into the browser to log in with the token.
[root@k8s-master ~]# kubectl describe secret dashboard-admin-token-69zsx -n kube-system
Name:         dashboard-admin-token-69zsx
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: dfe59297-8f46-11e9-b92b-e67418705759

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1359 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tNjl6c3giLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZGZlNTkyOTctOGY0Ni0xMWU5LWI5MmItZTY3NDE4NzA1NzU5Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.Wl6WiT6MZ-37ArWhPuhudac5S1Y8v2GxiUdNcy4hIwHQ1EdtzaAlvpx1mLZsQoDYJCeM6swVtNgJwhO5ESZAYQVi9xCrXsQcEDIeBkjyzpu6U4XHmab7SuS0_KEsGXhe57XKq86ogK9bAyNvNWE497V2giJJy5eR6CHKH3GR6mIwTQDSKEf-GfDfs9SHvQxRjchsrYLJLS3B_XfZyNHFXcieMZHy7V7Ehx2jMzwh6WNk6Mqk5N-IlZQRxmTBHTe3i9efN8r7CjvRhZdKc5iF6V4eG0QWkxR95WOzgV2QCCyLh4xEJw895FlHFJ1oTR2sUIRugnzyfqZaPQxdXcrc7Q
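
To avoid copying the secret name by hand, the token can also be extracted in one step (a convenience sketch; it assumes the standard service-account token secret layout, where .data.token is base64-encoded):

kubectl -n kube-system get secret \
  $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}') \
  -o jsonpath='{.data.token}' | base64 -d; echo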

(5) Choose token login in the browser to see the cluster status:

  • Note: there are actually three ways to access the dashboard; the steps above demonstrate only the first one:
    • The kubernetes-dashboard service exposes a NodePort, so the dashboard can be reached at https://NodeIP:NodePort.
    • Via the API server (HTTPS on port 6443, or HTTP on port 8080).
    • Via kubectl proxy.

Accessing the dashboard with kubectl proxy

(1) Start the proxy

[root@k8s-master ~]# kubectl proxy --address='172.16.4.12' --port=8086 --accept-hosts='^*$'
Starting to serve on 172.16.4.12:8086

(2) Access the dashboard

Open the URL http://172.16.4.12:8086/ui; it redirects automatically to http://172.16.4.12:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default

Installing the heapster add-on

Preparing the images

Download the latest heapster release from the heapster releases page.

wget https://github.com/kubernetes-retired/heapster/archive/v1.5.4.tar.gz
tar zxvf v1.5.4.tar.gz
[root@k8s-master ~]# cd heapster-1.5.4/deploy/kube-config/influxdb/ && ls
grafana.yaml  heapster.yaml  influxdb.yaml

(1) The modified heapster.yaml is as follows:

# ------------------- Heapster Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system

---
# ------------------- Heapster Role & Role Binding ------------------- #

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
subjects:
  - kind: ServiceAccount
    name: heapster
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
# ------------------- Heapster Deployment ------------------- #
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.3
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
---
# ------------------- Heapster Service ------------------- #

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster

(2) The modified influxdb.yaml is as follows:

[root@k8s-master influxdb]# cat influxdb.yaml 
# ------------------- Influxdb Deployment ------------------- #
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.3.3
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
---
# ------------------- Influxdb Service ------------------- #

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 8086
    targetPort: 8086
    name: http
  - port: 8083
    targetPort: 8083
    name: admin
  selector:
    k8s-app: influxdb
---
#-------------------Influxdb Cm-----------------#
apiVersion: v1
kind: ConfigMap
metadata:
  name: influxdb-config
  namespace: kube-system
data:
  config.toml: |
    reporting-disabled = true
    bind-address = ":8088"
    [meta]
      dir = "/data/meta"
      retention-autocreate = true
      logging-enabled = true
    [data]
      dir = "/data/data"
      wal-dir = "/data/wal"
      query-log-enabled = true
      cache-max-memory-size = 1073741824
      cache-snapshot-memory-size = 26214400
      cache-snapshot-write-cold-duration = "10m0s"
      compact-full-write-cold-duration = "4h0m0s"
      max-series-per-database = 1000000
      max-values-per-tag = 100000
      trace-logging-enabled = false
    [coordinator]
      write-timeout = "10s"
      max-concurrent-queries = 0
      query-timeout = "0s"
      log-queries-after = "0s"
      max-select-point = 0
      max-select-series = 0
      max-select-buckets = 0
    [retention]
      enabled = true
      check-interval = "30m0s"
    [admin]
      enabled = true
      bind-address = ":8083"
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
    [shard-precreation]
      enabled = true
      check-interval = "10m0s"
      advance-period = "30m0s"
    [monitor]
      store-enabled = true
      store-database = "_internal"
      store-interval = "10s"
    [subscriber]
      enabled = true
      http-timeout = "30s"
      insecure-skip-verify = false
      ca-certs = ""
      write-concurrency = 40
      write-buffer-size = 1000
    [http]
      enabled = true
      bind-address = ":8086"
      auth-enabled = false
      log-enabled = true
      write-tracing = false
      pprof-enabled = false
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
      https-private-key = ""
      max-row-limit = 10000
      max-connection-limit = 0
      shared-secret = ""
      realm = "InfluxDB"
      unix-socket-enabled = false
      bind-socket = "/var/run/influxdb.sock"
    [[graphite]]
      enabled = false
      bind-address = ":2003"
      database = "graphite"
      retention-policy = ""
      protocol = "tcp"
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "1s"
      consistency-level = "one"
      separator = "."
      udp-read-buffer = 0
    [[collectd]]
      enabled = false
      bind-address = ":25826"
      database = "collectd"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "10s"
      read-buffer = 0
      typesdb = "/usr/share/collectd/types.db"
    [[opentsdb]]
      enabled = false
      bind-address = ":4242"
      database = "opentsdb"
      retention-policy = ""
      consistency-level = "one"
      tls-enabled = false
      certificate = "/etc/ssl/influxdb.pem"
      batch-size = 1000
      batch-pending = 5
      batch-timeout = "1s"
      log-point-errors = true
    [[udp]]
      enabled = false
      bind-address = ":8089"
      database = "udp"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      read-buffer = 0
      batch-timeout = "1s"
      precision = ""
    [continuous_queries]
      log-enabled = true
      enabled = true
      run-interval = "1s"

(3) The modified grafana.yaml is as follows:

[root@k8s-master influxdb]# cat grafana.yaml 
#------------Grafana Deployment----------------#

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v4.4.3
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        #- mountPath: /etc/ssl/certs
        #  name: ca-certificates
        #  readOnly: true
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        #- name: GF_SERVER_HTTP_PORT
        - name: GRAFANA_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          # value: /
      volumes:
      # - name: ca-certificates
      #  hostPath:
      #    path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
---
#------------Grafana Service----------------#

apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  # type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    k8s-app: grafana

Apply all the definition files
[root@k8s-master influxdb]# pwd
/root/heapster-1.5.4/deploy/kube-config/influxdb

[root@k8s-master influxdb]# ls
grafana.yaml  heapster.yaml  influxdb.yaml

[root@k8s-master influxdb]# kubectl create -f .
deployment.extensions/monitoring-grafana created
service/monitoring-grafana created
serviceaccount/heapster created
clusterrolebinding.rbac.authorization.k8s.io/heapster created
service/heapster created
deployment.extensions/heapster created
deployment.extensions/monitoring-influxdb created
service/monitoring-influxdb created
configmap/influxdb-config created
Error from server (AlreadyExists): error when creating "heapster.yaml": serviceaccounts "heapster" already exists
Error from server (AlreadyExists): error when creating "heapster.yaml": clusterrolebindings.rbac.authorization.k8s.io "heapster" already exists
Error from server (AlreadyExists): error when creating "heapster.yaml": services "heapster" already exists
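
The AlreadyExists errors simply mean the heapster objects had already been created in an earlier run and are harmless. If you want this step to be repeatable, kubectl apply can be used instead of kubectl create:

[root@k8s-master influxdb]# kubectl apply -f .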

Check the results
# Check the Deployments
[root@k8s-master influxdb]# kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
heapster               1/1     1            1           10m
monitoring-grafana     1/1     1            1           10m
monitoring-influxdb    1/1     1            1           10m

# Check the Pods
[root@k8s-master influxdb]# kubectl get pods -n kube-system | grep -E 'heapster|monitoring'
heapster-75d646bf58-9x9tz               1/1     Running   0          10m
monitoring-grafana-77997bd67d-5khvp     1/1     Running   0          10m
monitoring-influxdb-7d6c5fb944-jmrv6    1/1     Running   0          10m

Accessing the dashboards

Error 1: the system:anonymous problem

When opening the dashboard page, the following error may appear:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "services \"heapster\" is forbidden: User \"system:anonymous\" cannot get resource \"services/proxy\" in API group \"\" in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "heapster",
    "kind": "services"
  },
  "code": 403
}

Problem analysis: the Kubernetes API server added the --anonymous-auth option, which allows anonymous requests on the secure port. Requests that are not rejected by any other authentication method are treated as anonymous requests; their username is system:anonymous and they belong to the group system:unauthenticated. The option is enabled by default. As a result, when you open the dashboard UI in Chrome the username/password dialog may never pop up and the subsequent authorization fails. To make sure the credentials dialog appears, set --anonymous-auth to false.
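
A minimal sketch of that change, assuming kube-apiserver reads its flags from an environment file such as /etc/kubernetes/kube-apiserver via systemd as configured earlier (the file path and the variable name are illustrative):

# /etc/kubernetes/kube-apiserver (excerpt) -- append the flag to the existing
# apiserver argument string:
#   KUBE_APISERVER_ARGS="... --anonymous-auth=false"

# Then restart the API server on the master:
systemctl daemon-reload && systemctl restart kube-apiserver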

  • Revisiting the dashboard now shows additional CPU-usage and memory-usage charts.

(2) Access the grafana page

  • Access via kube-apiserver:

Get the monitoring-grafana service URL

[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://172.16.4.12:6443
Heapster is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Open the URL in a browser: https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy

  • Access via kubectl proxy:

Start the proxy

[root@k8s-master ~]# kubectl proxy --address='172.16.4.12' --port=8084 --accept-hosts='^*$'
Starting to serve on 172.16.4.12:8084

Accessing the influxdb admin UI

Get the NodePort mapped to influxdb's HTTP port 8086

[root@k8s-master influxdb]# kubectl get svc -n kube-system|grep influxdb
monitoring-influxdb    NodePort    10.10.10.154   <none>        8086:43444/TCP,8083:49123/TCP   53m

Access the influxdb admin UI through the kube-apiserver insecure port: http://172.16.4.12:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/

In the page's "Connection Settings", enter a node IP in the Host field and the NodePort mapped to 8086 (43444 above) in the Port field, then click "Save" (the address in my cluster is 172.16.4.12:32299).

  • Error: the influxdb dashboard cannot be reached through kube-apiserver; instead, the Service definition below is returned.
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "monitoring-influxdb",
    "namespace": "kube-system",
    "selfLink": "/api/v1/namespaces/kube-system/services/monitoring-influxdb",
    "uid": "22c9ab6c-8f72-11e9-b92b-e67418705759",
    "resourceVersion": "215237",
    "creationTimestamp": "2019-06-15T13:33:18Z",
    "labels": {
      "kubernetes.io/cluster-service": "true",
      "kubernetes.io/name": "monitoring-influxdb",
      "task": "monitoring"
    }
  },
  "spec": {
    "ports": [
      {
        "name": "http",
        "protocol": "TCP",
        "port": 8086,
        "targetPort": 8086,
        "nodePort": 43444
      },
      {
        "name": "admin",
        "protocol": "TCP",
        "port": 8083,
        "targetPort": 8083,
        "nodePort": 49123
      }
    ],
    "selector": {
      "k8s-app": "influxdb"
    },
    "clusterIP": "10.10.10.154",
    "type": "NodePort",
    "sessionAffinity": "None",
    "externalTrafficPolicy": "Cluster"
  },
  "status": {
    "loadBalancer": {
      
    }
  }
}

Installing the EFK add-on

In a Kubernetes cluster, a complete application or service involves many components, and the Nodes and instance counts of those components change over time. Without centralized management, the logging subsystem makes operations and support very difficult, so it is necessary to collect and search logs in a unified way at the cluster level.

Logs that containers write to the console are saved under /var/lib/docker/containers/ with a "*-json.log" naming scheme, which provides the basis for log collection and later processing.
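
This can be confirmed on any node (an optional check; the first running container is picked just as an example):

# Show where Docker writes the json-file log of a container.
docker inspect --format '{{.LogPath}}' $(docker ps -q | head -n 1)
# e.g. /var/lib/docker/containers/<id>/<id>-json.log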

Kubernetes recommends Fluentd + Elasticsearch + Kibana for collecting, querying, and visualizing system and container logs.

Deploying a unified log-management system has the following two prerequisites:

  • The API Server is correctly configured with CA certificates.
  • The DNS service is started and running.

Deployment architecture

We collect logs from each node by running fluentd as a DaemonSet on every node. Fluentd mounts the Docker log directory /var/lib/docker/containers and the /var/log directory into its Pod. Kubelet creates per-pod directories under /var/log/pods on the node so that the log output of different containers can be told apart; each of those directories contains a log file that links back to the container log under /var/lib/docker/containers. Note: logs from both directories end up in the Elasticsearch cluster, and Kibana finally provides the user-facing interface.

There is one special requirement here: Fluentd must run on every Node. This requirement can be met by deploying Fluentd in any of the following ways:

  • Deploy Fluentd directly on the Node host.
  • Use kubelet's --config parameter to load a Fluentd Pod on every node.
  • Use a DaemonSet to run a Fluentd Pod on every Node.

Official files: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch

Configuring the EFK manifests

Create a directory to hold the files
[root@k8s-master ~]# mkdir EFK && cd EFK

Configure RBAC for EFK
[root@k8s-master EFK]# cat efk-rbac.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efk
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: efk
subjects:
  - kind: ServiceAccount
    name: efk
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
# Note that the ServiceAccount configured here is efk.

Configure the Elasticsearch service
# The three official manifests are merged here into a single elasticsearch.yaml:

[root@k8s-master EFK]# cat elasticsearch.yaml 
#------------ElasticSearch RBAC---------#

apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---

# -----------ElasticSearch Service--------------#
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
---

#-------------------ElasticSearch StatefulSet-------#
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v6.6.1
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v6.7.2
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v6.7.2
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: docker.elastic.co/elasticsearch/elasticsearch:6.6.1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: ES_JAVA_OPTS
          value: -Xms1024m -Xmx1024m
      volumes:
      - name: elasticsearch-logging
        emptyDir: {}
       # Elasticsearch requires vm.max_map_count to be at least 262144.
       # If your OS already sets up this number to a higher value, feel free
       # to remove this init container.
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true

Configure the ConfigMap for the Fluentd service; td-agent is used here
# td-agent has an official installation document, which I find cumbersome; you can simply install it with its script.
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

# The ConfigMap itself is configured as follows; you can create the file by hand.
[root@k8s-master fluentd-es-image]# cat td-agent.conf 
kind: ConfigMap
apiVersion: v1
metadata:
  name: td-agent-config
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  td-agent.conf: |
    <filter kubernetes.**>
      @type kubernetes_metadata
      tls-cert-file /etc/kubernetes/ssl/server.pem
      tls-private-key-file /etc/kubernetes/ssl/server-key.pem
      client-ca-file /etc/kubernetes/ssl/ca.pem
      service-account-key-file /etc/kubernetes/ssl/ca-key.pem
    </filter>

    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      type_name _doc
      include_tag_key true
      host 172.16.4.12
      port 9200
      logstash_format true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    
# Note: create the ConfigMap in the kube-system namespace.
kubectl create configmap td-agent-config --from-file=./td-agent.conf -n kube-system

# Create the fluentd DaemonSet
[root@k8s-master EFK]# cat fluentd.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: efk
      containers:
      - name: fluentd-es
        image: travix/fluentd-elasticsearch:1.22
        command:
          - '/bin/sh'
          - '-c'
          - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key : "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
# A public image from Docker Hub is used here, since pulling the official image requires a proxy.

Configure the Kibana service
[root@k8s-master EFK]# cat kibana.yaml 
#---------------Kibana Deployment-------------------#

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      serviceAccountName: efk
      containers:
      - name: kibana-logging
        image: docker.elastic.co/kibana/kibana-oss:6.6.1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: "ELASTICSEARCH_URL"
            value: "http://172.16.4.12:9200"
          # modified by gzr
          #  value: "http://elasticsearch-logging:9200"
          - name: "SERVER_BASEPATH"
            value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging/proxy"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
---

#------------------Kibana Service---------------------#

apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging


Label the Nodes

The DaemonSet fluentd-es-v1.22 sets the nodeSelector beta.kubernetes.io/fluentd-ds-ready=true, so that label must be set on every Node where fluentd is expected to run;

[root@k8s-master EFK]# kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
172.16.4.12   Ready    <none>   18h     v1.14.3
172.16.4.13   Ready    <none>   2d15h   v1.14.3
172.16.4.14   Ready    <none>   2d15h   v1.14.3

[root@k8s-master EFK]# kubectl label nodes 172.16.4.14 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.14" labeled

[root@k8s-master EFK]# kubectl label nodes 172.16.4.13 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.13" labeled

[root@k8s-master EFK]# kubectl label nodes 172.16.4.12 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.12" labeled

Apply the definition files
[root@k8s-master EFK]# kubectl create -f .
serviceaccount/efk created
clusterrolebinding.rbac.authorization.k8s.io/efk created
service/elasticsearch-logging created
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created
statefulset.apps/elasticsearch-logging created
daemonset.extensions/fluentd-es-v1.22 created
deployment.apps/kibana-logging created
service/kibana-logging created

Verify the results
[root@k8s-master EFK]# kubectl get po -n kube-system -o wide| grep -E 'elastic|fluentd|kibana'
elasticsearch-logging-0                 1/1     Running            0          115m    172.30.69.5   172.16.4.14   <none>           <none>
elasticsearch-logging-1                 1/1     Running            0          115m    172.30.20.8   172.16.4.13   <none>           <none>
fluentd-es-v1.22-4bmtm                  0/1     CrashLoopBackOff   16         58m     172.30.53.2   172.16.4.12   <none>           <none>
fluentd-es-v1.22-f9hml                  1/1     Running            0          58m     172.30.69.6   172.16.4.14   <none>           <none>
fluentd-es-v1.22-x9rf4                  1/1     Running            0          58m     172.30.20.9   172.16.4.13   <none>           <none>
kibana-logging-7db9f954ff-mkbhr         1/1     Running            0          25s     172.30.69.7   172.16.4.14   <none>           <none>

The kibana Pod takes a fairly long time (10-20 minutes) on its first start to optimize and cache the status pages; you can tail the Pod's logs to watch the progress.

[root@k8s-master EFK]# kubectl logs kibana-logging-7db9f954ff-mkbhr -n kube-system
{"type":"log","@timestamp":"2019-06-18T09:23:33Z","tags":["plugin","warning"],"pid":1,"path":"/usr/share/kibana/src/legacy/core_plugins/ems_util","message":"Skipping non-plugin directory at /usr/share/kibana/src/legacy/core_plugins/ems_util"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["warning","elasticsearch","config","deprecation"],"pid":1,"message":"Config key \"url\" is deprecated. It has been replaced with \"hosts\""}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:kibana@6.6.1","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:elasticsearch@6.6.1","info"],"pid":1,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:console@6.6.1","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:interpreter@6.6.1","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:metrics@6.6.1","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["status","plugin:timelion@6.6.1","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["status","plugin:elasticsearch@6.6.1","info"],"pid":1,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["listening","info"],"pid":1,"message":"Server running at http://0:5601"}
......

Accessing Kibana
  1. Via kube-apiserver:

Get the Kibana service URL

[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://172.16.4.12:6443
Elasticsearch is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/elasticsearch-logging/proxy
Heapster is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/heapster/proxy
Kibana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kibana-logging/proxy
KubeDNS is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

Open the URL in a browser: https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kibana-logging/proxy

  • Error 1: "Kibana did not load properly. Check the server output for more information."

Solution:

  • Error 2: accessing Kibana returns a 503 error with the following body:
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "no endpoints available for service \"kibana-logging\"",
  "reason": "ServiceUnavailable",
  "code": 503
}

