[K8S] Building a Kubernetes Cluster with TLS Authentication
Note: this article follows https://github.com/opsnull/follow-me-install-kubernetes-cluster (follow-me-install-kubernetes-cluster) step by step.
If there is any copyright concern, please leave a comment. Thanks!
A number of changes were made to fit my own environment.
The IPs used in the source map to the actual IPs as follows (in a few places the IPs may not have been replaced):
IP in the source | actual IP | hostname
10.64.3.7 | 192.168.1.206 | etcd-host0
10.64.3.8 | 192.168.1.207 | etcd-host1
10.64.3.86 | 192.168.1.208 | etcd-host2
01 - Component Versions and Cluster Environment
Cluster components and versions
- Kubernetes 1.6.2
- Docker 17.04.0-ce
- Etcd 3.1.6
- Flanneld 0.7.1 (vxlan network)
- TLS-secured communication (all components: etcd, the kubernetes master and nodes)
- RBAC authorization
- kubelet TLS bootstrapping
- kubedns, dashboard, heapster (influxdb, grafana) and EFK (elasticsearch, fluentd, kibana) add-ons
- private docker registry backed by ceph rgw storage, TLS + HTTP Basic authentication
Cluster machines
- 192.168.1.206: master, registry
- 192.168.1.207: node01
- 192.168.1.208: node02
For testing purposes, the etcd cluster, the kubernetes master cluster and the kubernetes nodes all share these three machines.
Initialize the systems and disable firewalld and SELinux.
Distribute the cluster environment variable script
Copy the global variable definition script to /root/local/bin on every machine:
$ cp environment.sh /root/local/bin
$ vi /etc/profile
source /root/local/bin/environment.sh
:wq
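The guide assumes environment.sh is present on every machine; a minimal distribution sketch (not part of the original guide), assuming password-less root SSH and that only NODE_NAME and NODE_IP differ per host:
#!/usr/bin/bash
# hypothetical helper: push environment.sh to every node and patch the per-node variables
declare -A NODES=([192.168.1.206]=etcd-host0 [192.168.1.207]=etcd-host1 [192.168.1.208]=etcd-host2)
for ip in "${!NODES[@]}"; do
  ssh root@${ip} "mkdir -p /root/local/bin"
  scp environment.sh root@${ip}:/root/local/bin/environment.sh
  ssh root@${ip} "sed -i 's/^export NODE_NAME=.*/export NODE_NAME=${NODES[$ip]}/; s/^export NODE_IP=.*/export NODE_IP=${ip}/' /root/local/bin/environment.sh"
  ssh root@${ip} "grep -q environment.sh /etc/profile || echo 'source /root/local/bin/environment.sh' >> /etc/profile"
done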
192.168.1.206 environment.sh
#!/usr/bin/bash
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
# use subnets that are unused on the hosts for the Service and Pod networks
# Service CIDR: unroutable before deployment; reachable inside the cluster via IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"
# Pod CIDR (Cluster CIDR): unroutable before deployment; routable **after** deployment (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"
# NodePort range
export NODE_PORT_RANGE="8400-9000"
# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"
# prefix of the flanneld network configuration in etcd
export FLANNEL_ETCD_PREFIX="/kubernetes/network"
# kubernetes service IP (normally the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"
# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."
# name of the machine being deployed (any value, as long as it distinguishes the machines)
export NODE_NAME=etcd-host0
# IP of the machine being deployed
export NODE_IP=192.168.1.206
# IPs of all etcd cluster machines
export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208"
# etcd peer (cluster-internal) IPs and ports
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
export PATH=/root/local/bin:$PATH
# replace with the IP of any kubernetes master machine
export MASTER_IP=192.168.1.206
export KUBE_APISERVER="https://${MASTER_IP}:6443"
192.168.1.207 environment.sh
#!/usr/bin/bash
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
# use subnets that are unused on the hosts for the Service and Pod networks
# Service CIDR: unroutable before deployment; reachable inside the cluster via IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"
# Pod CIDR (Cluster CIDR): unroutable before deployment; routable **after** deployment (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"
# NodePort range
export NODE_PORT_RANGE="8400-9000"
# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"
# prefix of the flanneld network configuration in etcd
export FLANNEL_ETCD_PREFIX="/kubernetes/network"
# kubernetes service IP (normally the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"
# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."
# name of the machine being deployed (any value, as long as it distinguishes the machines)
export NODE_NAME=etcd-host1
# IP of the machine being deployed
export NODE_IP=192.168.1.207
# IPs of all etcd cluster machines
export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208"
# etcd peer (cluster-internal) IPs and ports
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
export PATH=/root/local/bin:$PATH
# replace with the IP of any kubernetes master machine
export MASTER_IP=192.168.1.206
export KUBE_APISERVER="https://${MASTER_IP}:6443"
192.168.1.208 environment.sh
#!/usr/bin/bash
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
# use subnets that are unused on the hosts for the Service and Pod networks
# Service CIDR: unroutable before deployment; reachable inside the cluster via IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"
# Pod CIDR (Cluster CIDR): unroutable before deployment; routable **after** deployment (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"
# NodePort range
export NODE_PORT_RANGE="8400-9000"
# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"
# prefix of the flanneld network configuration in etcd
export FLANNEL_ETCD_PREFIX="/kubernetes/network"
# kubernetes service IP (normally the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"
# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."
# name of the machine being deployed (any value, as long as it distinguishes the machines)
export NODE_NAME=etcd-host2
# IP of the machine being deployed
export NODE_IP=192.168.1.208
# IPs of all etcd cluster machines
export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208"
# etcd peer (cluster-internal) IPs and ports
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
export PATH=/root/local/bin:$PATH
# replace with the IP of any kubernetes master machine
export MASTER_IP=192.168.1.206
export KUBE_APISERVER="https://${MASTER_IP}:6443"
02 - Creating the CA Certificate and Key
Create the CA certificate and key
The kubernetes components use TLS certificates to encrypt their communication. This document uses CloudFlare's PKI toolkit, cfssl, to generate the Certificate Authority (CA) certificate and key files. The CA certificate is self-signed and is used to sign all of the other TLS certificates created later.
Install CFSSL
$ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
$ chmod +x cfssl_linux-amd64
$ sudo mv cfssl_linux-amd64 /root/local/bin/cfssl
$ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
$ chmod +x cfssljson_linux-amd64
$ sudo mv cfssljson_linux-amd64 /root/local/bin/cfssljson
$ wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
$ chmod +x cfssl-certinfo_linux-amd64
$ sudo mv cfssl-certinfo_linux-amd64 /root/local/bin/cfssl-certinfo
$ export PATH=/root/local/bin:$PATH
$ mkdir ssl
$ cd ssl
$ cfssl print-defaults config >config.json
$ cfssl print-defaults csr > csr.json
The tools above must be installed on every node.
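Instead of repeating the wget steps, the binaries installed above can be pushed to the other nodes; a minimal sketch assuming password-less root SSH:
#!/usr/bin/bash
# hypothetical helper: copy the cfssl tools from this machine (192.168.1.206) to the other nodes
for ip in 192.168.1.207 192.168.1.208; do
  ssh root@${ip} "mkdir -p /root/local/bin"
  scp /root/local/bin/{cfssl,cfssljson,cfssl-certinfo} root@${ip}:/root/local/bin/
done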
Create the CA (Certificate Authority)
Create the CA config file:
$ cat ca-config.json
{
"signing": {
"default": {
"expiry":"8760h"
},
"profiles": {
"kubernetes": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry":"8760h"
}
}
}
}
- ca-config.json: multiple profiles can be defined, each with its own expiry, usage scenario and other parameters; a specific profile is chosen later when signing a certificate;
- signing: the certificate can be used to sign other certificates; the generated ca.pem carries CA=TRUE;
- server auth: clients may use this CA to verify certificates presented by servers;
- client auth: servers may use this CA to verify certificates presented by clients;
Create the CA certificate signing request:
$ cat ca-csr.json
{
"CN":"kubernetes",
"key": {
"algo":"rsa",
"size": 2048
},
"names": [
{
"C":"CN",
"ST":"BeiJing",
"L":"BeiJing",
"O":"k8s",
"OU":"System"
}
]
}
- "CN":Common Name,kube-apiserver 从证书中提取该字段作为请求的用户名 (User Name);浏览器使用该字段验证网站是否合法;
- "O":Organization,kube-apiserver 从证书中提取该字段作为请求用户所属的组 (Group);
生成 CA 证书和私钥:
$ cfssl gencert -initca ca-csr.json| cfssljson -bare ca
$ ls ca*
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
$
Distribute the certificates
Copy the generated CA certificate, key and config file to /etc/kubernetes/ssl on all machines:
$ sudo mkdir -p /etc/kubernetes/ssl
$ sudo cp ca* /etc/kubernetes/ssl
$
Verify a certificate (example)
Taking the kubernetes certificate (generated later when deploying the master node) as an example:
Using the openssl command
$ openssl x509 -noout -text -in kubernetes.pem
...
Signature Algorithm: sha256WithRSAEncryption
Issuer: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=Kubernetes
Validity
Not Before: Apr 5 05:36:00 2017 GMT
Not After : Apr 5 05:36:00 2018 GMT
Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
...
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Key Identifier:
DD:52:04:43:10:13:A9:29:24:17:3A:0E:D7:14:DB:36:F8:6C:E0:E0
X509v3 Authority Key Identifier:
keyid:44:04:3B:60:BD:69:78:14:68:AF:A0:41:13:F6:17:07:13:63:58:CD
X509v3 Subject Alternative Name:
DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:10.64.3.7, IP Address:10.254.0.1
...
- confirm that the Issuer field matches ca-csr.json;
- confirm that the Subject field matches kubernetes-csr.json;
- confirm that the X509v3 Subject Alternative Name field matches kubernetes-csr.json;
- confirm that the X509v3 Key Usage and Extended Key Usage fields match the kubernetes profile in ca-config.json (a scripted check is sketched below);
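A scripted version of these checks, assuming the file names used in this guide:
#!/usr/bin/bash
# print the certificate fields that must match the CSR and the CA profile
CERT=kubernetes.pem
openssl x509 -noout -issuer  -in ${CERT}   # compare with ca-csr.json
openssl x509 -noout -subject -in ${CERT}   # compare with kubernetes-csr.json
openssl x509 -noout -text -in ${CERT} | grep -A1 'Subject Alternative Name'   # compare with the hosts list
openssl x509 -noout -text -in ${CERT} | grep -A1 'Key Usage'                  # compare with the kubernetes profile in ca-config.json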
Using the cfssl-certinfo command
$ cfssl-certinfo -cert kubernetes.pem
...
{
"subject":{
"common_name":"kubernetes",
"country":"CN",
"organization":"k8s",
"organizational_unit":"System",
"locality":"BeiJing",
"province":"BeiJing",
"names":[
"CN",
"BeiJing",
"BeiJing",
"k8s",
"System",
"kubernetes"
]
},
"issuer":{
"common_name":"Kubernetes",
"country":"CN",
"organization":"k8s",
"organizational_unit":"System",
"locality":"BeiJing",
"province":"BeiJing",
"names":[
"CN",
"BeiJing",
"BeiJing",
"k8s",
"System",
"Kubernetes"
]
},
"serial_number":"174360492872423263473151971632292895707129022309",
"sans":[
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local",
"127.0.0.1",
"192.168.1.206",
"192.168.1.207",
"192.168.1.208",
"10.254.0.1"
],
"not_before":"2017-04-05T05:36:00Z",
"not_after":"2018-04-05T05:36:00Z",
"sigalg":"SHA256WithRSA",
...
03 - Deploying a Highly Available Etcd Cluster
Deploy a highly available etcd cluster
The kubernetes system stores all of its data in etcd. This document describes how to deploy a three-node highly available etcd cluster. The three nodes reuse the kubernetes master machines and are named etcd-host0, etcd-host1 and etcd-host2:
- etcd-host0: 192.168.1.206
- etcd-host1: 192.168.1.207
- etcd-host2: 192.168.1.208
Variables used
The variables used in this document are defined as follows (they were already added to environment.sh at the beginning):
$ export NODE_NAME=etcd-host0   # name of the machine being deployed (any value that distinguishes the machines)
$ export NODE_IP=192.168.1.206  # IP of the machine being deployed
$ export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208"  # IPs of all etcd cluster machines
$ # etcd peer (cluster-internal) IPs and ports
$ export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
$ # import the other global variables used: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
$ source /root/local/bin/environment.sh
$
Download the binaries
Download the latest release from https://github.com/coreos/etcd/releases :
$ wget https://github.com/coreos/etcd/releases/download/v3.1.6/etcd-v3.1.6-linux-amd64.tar.gz
$ tar -xvf etcd-v3.1.6-linux-amd64.tar.gz
$ sudo mv etcd-v3.1.6-linux-amd64/etcd* /root/local/bin
$
Create the TLS key and certificate
To secure communication, traffic between clients (such as etcdctl) and the etcd cluster, as well as traffic between etcd members, is encrypted with TLS. This section creates the certificate and private key that etcd needs.
Create the etcd certificate signing request:
$ cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"${NODE_IP}"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
- the hosts field lists the etcd node IPs that are authorized to use this certificate;
Generate the etcd certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes etcd-csr.json | cfssljson -bare etcd
$ ls etcd*
etcd.csr etcd-csr.json etcd-key.pem etcd.pem
$ sudo mkdir -p /etc/etcd/ssl
$ sudo mv etcd*.pem /etc/etcd/ssl
$ rm etcd.csr etcd-csr.json
Create the etcd systemd unit file
$ sudo mkdir -p /var/lib/etcd  # the working directory must be created first
$ cat > etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/root/local/bin/etcd \\
--name=${NODE_NAME} \\
--cert-file=/etc/etcd/ssl/etcd.pem \\
--key-file=/etc/etcd/ssl/etcd-key.pem \\
--peer-cert-file=/etc/etcd/ssl/etcd.pem \\
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \\
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
--initial-advertise-peer-urls=https://${NODE_IP}:2380 \\
--listen-peer-urls=https://${NODE_IP}:2380 \\
--listen-client-urls=https://${NODE_IP}:2379,http://127.0.0.1:2379 \\
--advertise-client-urls=https://${NODE_IP}:2379 \\
--initial-cluster-token=etcd-cluster-0 \\
--initial-cluster=${ETCD_NODES} \\
--initial-cluster-state=new \\
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
- etcd's working directory and data directory are /var/lib/etcd; the directory must be created before starting the service;
- to secure communication you must specify etcd's certificate and key (cert-file and key-file), the certificate, key and CA used for peer communication (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the CA used for clients (trusted-ca-file);
- when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
See etcd.service for the complete unit file.
Start the etcd service
$ sudo mv etcd.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable etcd
$ sudo systemctl start etcd
$ systemctl status etcd
$
The first etcd process started will appear to hang for a while; it is waiting for the etcd processes on the other nodes to join the cluster. This is normal.
Repeat the steps above on all etcd nodes until the etcd service is running on every machine.
Verify the service
After the etcd cluster has been deployed, run the following on any etcd node:
$ for ip in ${NODE_IPS}; do
ETCDCTL_API=3 /root/local/bin/etcdctl \
--endpoints=https://${ip}:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
endpoint health; done
Expected result:
2017-07-05 17:11:58.103401 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.206:2379 is healthy: successfully committed proposal: took = 81.247077ms
2017-07-05 17:11:58.356539 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.207:2379 is healthy: successfully committed proposal: took = 12.073555ms
2017-07-05 17:11:58.523829 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.208:2379 is healthy: successfully committed proposal: took = 5.413361ms
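Cluster membership can be inspected the same way; a small sketch using the same certificates (ETCD_ENDPOINTS comes from environment.sh):
$ ETCDCTL_API=3 /root/local/bin/etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
member list  # lists member IDs, names, peer URLs and client URLs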
04 - Deploying the kubectl Command-Line Tool
kubectl reads the kube-apiserver address, certificates, user name and so on from ~/.kube/config by default. Without that file, commands fail with:
$ kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?
This document describes how to download and configure the kubernetes command-line tool kubectl.
The downloaded kubectl binary and the generated ~/.kube/config file must be copied to every machine that will run kubectl commands.
Variables used
The variables used in this document are defined as follows (they were already added to environment.sh at the beginning):
$ export MASTER_IP=192.168.1.206  # run on the master node, 192.168.1.206
$ export KUBE_APISERVER="https://${MASTER_IP}:6443"
$
- the KUBE_APISERVER variable is the kube-apiserver address that clients access; it is later written into the ~/.kube/config file;
Download kubectl
$ wget https://dl.k8s.io/v1.6.2/kubernetes-client-linux-amd64.tar.gz
$ tar -xzvf kubernetes-client-linux-amd64.tar.gz
$ sudo cp kubernetes/client/bin/kube* /root/local/bin/
$ chmod a+x /root/local/bin/kube*
$ export PATH=/root/local/bin:$PATH
$
Create the admin certificate
kubectl talks to kube-apiserver's secure port, which requires a TLS certificate and key for the connection.
Create the admin certificate signing request
$ cat admin-csr.json
{
"CN":"admin",
"hosts": [],
"key": {
"algo":"rsa",
"size": 2048
},
"names": [
{
"C":"CN",
"ST":"BeiJing",
"L":"BeiJing",
"O":"system:masters",
"OU":"System"
}
]
}
- kube-apiserver later uses RBAC to authorize client requests (e.g. from kubelet, kube-proxy, Pods);
- kube-apiserver predefines several RoleBindings used by RBAC; for example cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call all kube-apiserver APIs;
- O sets the certificate's Group to system:masters; when kubectl uses this certificate to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the certificate's group is the pre-authorized system:masters, it is granted access to all APIs;
- the hosts attribute is an empty list;
Generate the admin certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes admin-csr.json | cfssljson -bare admin
$ ls admin*
admin.csr admin-csr.json admin-key.pem admin.pem
$ sudo mv admin*.pem /etc/kubernetes/ssl/
$ rm admin.csr admin-csr.json
$
Create the kubectl kubeconfig file
$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER}
$ # set client authentication parameters
$ kubectl config set-credentials admin \
--client-certificate=/etc/kubernetes/ssl/admin.pem \
--embed-certs=true \
--client-key=/etc/kubernetes/ssl/admin-key.pem
$ # set context parameters
$ kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=admin
$ # set the default context
$ kubectl config use-context kubernetes
- the O field of the admin.pem certificate is system:masters; kube-apiserver's predefined RoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call the kube-apiserver APIs;
- the generated kubeconfig is saved to ~/.kube/config;
Distribute the kubeconfig file
Copy ~/.kube/config to the ~/.kube/ directory of every machine that runs kubectl commands.
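A minimal distribution sketch, assuming password-less root SSH to the other machines:
#!/usr/bin/bash
# hypothetical helper: copy the admin kubeconfig to the machines that will run kubectl
for ip in 192.168.1.207 192.168.1.208; do
  ssh root@${ip} "mkdir -p /root/.kube"
  scp ~/.kube/config root@${ip}:/root/.kube/config
done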
05 - Deploying the Flannel Network
Deploy the Flannel network
kubernetes requires that all nodes in the cluster can reach each other over the Pod network. This document describes how to use Flannel to create an interconnected Pod network across all machines (Master and Node).
Variables used
The variables used in this document are defined as follows:
$ export NODE_IP=192.168.1.206  # IP of the node being deployed
$ # import the other global variables used: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
$ source /root/local/bin/environment.sh
$
Create the TLS key and certificate
The etcd cluster has mutual (two-way) TLS authentication enabled, so flanneld must be given the CA, certificate and key it needs to talk to etcd.
Create the flanneld certificate signing request:
$ cat > flanneld-csr.json <<EOF
{
"CN": "flanneld",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
- the hosts field is empty;
Generate the flanneld certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld
$ ls flanneld*
flanneld.csr flanneld-csr.json flanneld-key.pem flanneld.pem
$ sudo mkdir -p /etc/flanneld/ssl
$ sudo mv flanneld*.pem /etc/flanneld/ssl
$ rm flanneld.csr flanneld-csr.json
Write the cluster Pod network information into etcd
Note: this step only needs to be done the first time the Flannel network is deployed; it does not need to be repeated when deploying Flannel on the other nodes!
$ /root/local/bin/etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/flanneld/ssl/flanneld.pem \
--key-file=/etc/flanneld/ssl/flanneld-key.pem \
set ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
- the current flanneld version (v0.7.1) does not support the etcd v3 API, so the config key and subnet data are written with the etcd v2 API;
- the Pod network written here (${CLUSTER_CIDR}, 172.30.0.0/16) must match the --cluster-cidr option of kube-controller-manager;
Install and configure flanneld
Download flanneld
$ mkdir flannel
$ wget https://github.com/coreos/flannel/releases/download/v0.7.1/flannel-v0.7.1-linux-amd64.tar.gz
$ tar -xzvf flannel-v0.7.1-linux-amd64.tar.gz -C flannel
$ sudo cp flannel/{flanneld,mk-docker-opts.sh} /root/local/bin
$
Create the flanneld systemd unit file
$ cat > flanneld.service <<EOF
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service
[Service]
Type=notify
ExecStart=/root/local/bin/flanneld \\
-etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
-etcd-certfile=/etc/flanneld/ssl/flanneld.pem \\
-etcd-keyfile=/etc/flanneld/ssl/flanneld-key.pem \\
-etcd-endpoints=${ETCD_ENDPOINTS} \\
-etcd-prefix=${FLANNEL_ETCD_PREFIX}
ExecStartPost=/root/local/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure
[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
EOF
- the mk-docker-opts.sh script writes the Pod subnet allocated to flanneld into /run/flannel/docker; docker later uses the parameters in that file to configure the docker0 bridge;
- flanneld talks to the other nodes over the interface that carries the system's default route; on machines with multiple interfaces (e.g. internal and public), the -iface option can select the interface to use (the unit file above does not set it);
See flanneld.service for the complete unit.
Start flanneld
$ sudo cp flanneld.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable flanneld
$ sudo systemctl start flanneld
$ systemctl status flanneld
$
Check the flanneld service
$ journalctl -u flanneld | grep 'Lease acquired'
$ ifconfig flannel.1
$
Check the Pod subnets allocated to each flanneld
$ # view the cluster Pod network (/16)
$ /root/local/bin/etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/flanneld/ssl/flanneld.pem \
--key-file=/etc/flanneld/ssl/flanneld-key.pem \
get ${FLANNEL_ETCD_PREFIX}/config
{ "Network":"172.30.0.0/16","SubnetLen": 24,"Backend": {"Type":"vxlan" } }
$ # 查看已分配的 Pod 子网段列表(/24)
$ /root/local/bin/etcdctl \
--endpoints=${ETCD_ENDPOINTS}\
--ca-file=/etc/kubernetes/ssl/ca.pem\
--cert-file=/etc/flanneld/ssl/flanneld.pem\
--key-file=/etc/flanneld/ssl/flanneld-key.pem \
ls ${FLANNEL_ETCD_PREFIX}/subnets
2017-07-0517:27:46.007743 I | warning: ignoring ServerName for user-provided CA forbackwards compatibility is deprecated
/kubernetes/network/subnets/172.30.43.0-24
/kubernetes/network/subnets/172.30.44.0-24
/kubernetes/network/subnets/172.30.45.0-24
$ # view the IP and network parameters of the flanneld process that owns a given Pod subnet
$ /root/local/bin/etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/flanneld/ssl/flanneld.pem \
--key-file=/etc/flanneld/ssl/flanneld-key.pem \
get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.43.0-24
2017-07-05 17:28:34.116874 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
{"PublicIP":"192.168.1.207","BackendType":"vxlan","BackendData":{"VtepMAC":"52:73:8c:2f:ae:3c"}}
Make sure the Pod subnets on all nodes can reach each other
After Flannel has been deployed on every node, list the allocated Pod subnets (/24):
$ /root/local/bin/etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/flanneld/ssl/flanneld.pem \
--key-file=/etc/flanneld/ssl/flanneld-key.pem \
ls ${FLANNEL_ETCD_PREFIX}/subnets
/kubernetes/network/subnets/172.30.43.0-24
/kubernetes/network/subnets/172.30.44.0-24
/kubernetes/network/subnets/172.30.45.0-24
The Pod subnets currently allocated to the three nodes are 172.30.43.0-24, 172.30.44.0-24 and 172.30.45.0-24. From each node, ping the other subnets to confirm connectivity, as sketched below.
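A minimal connectivity sketch, assuming the subnets listed above; with the vxlan backend the flannel.1 interface on each node usually carries the .0 address of its /24 subnet:
#!/usr/bin/bash
# ping the flannel.1 address of every allocated subnet from the current node
for gw in 172.30.43.0 172.30.44.0 172.30.45.0; do
  ping -c 2 -W 1 ${gw} > /dev/null && echo "OK: ${gw}" || echo "FAILED: ${gw}"
done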
06 - Deploying the Master Node
Deploy the master node
The kubernetes master node consists of the following components:
- kube-apiserver
- kube-scheduler
- kube-controller-manager
For now these three components are deployed on the same machine:
- kube-scheduler, kube-controller-manager and kube-apiserver are tightly coupled;
- only one kube-scheduler and one kube-controller-manager process may be active at a time; when running multiple copies, a leader must be elected;
This document describes deploying a single kubernetes master node; it does not build a highly available master cluster.
A later document is planned to cover deploying an LB so that clients (kubectl, kubelet, kube-proxy) access kube-apiserver through the LB's VIP, giving a highly available master cluster.
The master node communicates with the Pods on the node machines over the Pod network, so the Flannel network must also be deployed on the master node.
Variables used
The variables used in this document are defined as follows:
$ export MASTER_IP=192.168.1.206  # replace with the IP of the master machine being deployed
$ # import the other global variables used: SERVICE_CIDR, CLUSTER_CIDR, NODE_PORT_RANGE, ETCD_ENDPOINTS, BOOTSTRAP_TOKEN
$ source /root/local/bin/environment.sh
$
Download the latest binaries
There are two ways to download them:
- download the release tarball from the github release page, unpack it, then run the bundled download script
$ wget https://github.com/kubernetes/kubernetes/releases/download/v1.6.2/kubernetes.tar.gz
$ tar -xzvf kubernetes.tar.gz
...
$ cd kubernetes
$ ./cluster/get-kube-binaries.sh
...
- download the client or server tarball from the CHANGELOG page
The server tarball kubernetes-server-linux-amd64.tar.gz already contains the client (kubectl) binary, so there is no need to download kubernetes-client-linux-amd64.tar.gz separately;
$ # wget https://dl.k8s.io/v1.6.2/kubernetes-client-linux-amd64.tar.gz
$ wget https://dl.k8s.io/v1.6.2/kubernetes-server-linux-amd64.tar.gz
$ tar -xzvf kubernetes-server-linux-amd64.tar.gz
...
$ cd kubernetes
$ tar -xzvf kubernetes-src.tar.gz
Copy the binaries into place:
$ sudo cp -r server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /root/local/bin/
$
Install and configure flanneld (see 05 - Deploying the Flannel Network)
Create the kubernetes certificate
Create the kubernetes certificate signing request
$ cat > kubernetes-csr.json <<EOF
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"${MASTER_IP}",
"${CLUSTER_KUBERNETES_SVC_IP}",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
- if the hosts field is not empty it must list the IPs or domain names authorized to use the certificate, so the IP of the master node being deployed is listed above;
- the Service Cluster IP of the service named kubernetes that kube-apiserver registers must also be added; it is normally the first IP of the network given by the kube-apiserver --service-cluster-ip-range option, e.g. "10.254.0.1";
$ kubectl get svc kubernetes
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 10.254.0.1 <none> 443/TCP 1d
Generate the kubernetes certificate and private key
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
$ ls kubernetes*
kubernetes.csr kubernetes-csr.json kubernetes-key.pem kubernetes.pem
$ sudo mkdir -p /etc/kubernetes/ssl/
$ sudo mv kubernetes*.pem /etc/kubernetes/ssl/
$ rm kubernetes.csr kubernetes-csr.json
Configure and start kube-apiserver
Create the client token file used by kube-apiserver
When kubelet starts for the first time it sends a TLS bootstrapping request to kube-apiserver. kube-apiserver checks whether the token in the request matches its configured token.csv; if it does, it automatically generates a certificate and key for the kubelet. (This token file only needs to be created once, on the master.)
$ # the sourced environment.sh file defines the BOOTSTRAP_TOKEN variable
$ cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
$ mv token.csv /etc/kubernetes/
$
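The BOOTSTRAP_TOKEN value in environment.sh can be any random hex string; one common way to generate a fresh one (run once, then copy the output into environment.sh on every machine before creating token.csv):
$ head -c 16 /dev/urandom | od -An -t x | tr -d ' '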
Create the kube-apiserver systemd unit file
$ cat > kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
ExecStart=/root/local/bin/kube-apiserver \\
--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
--advertise-address=${MASTER_IP} \\
--bind-address=${MASTER_IP} \\
--insecure-bind-address=${MASTER_IP} \\
--authorization-mode=RBAC \\
--runtime-config=rbac.authorization.k8s.io/v1alpha1 \\
--kubelet-https=true \\
--experimental-bootstrap-token-auth \\
--token-auth-file=/etc/kubernetes/token.csv \\
--service-cluster-ip-range=${SERVICE_CIDR} \\
--service-node-port-range=${NODE_PORT_RANGE} \\
--tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\
--tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\
--client-ca-file=/etc/kubernetes/ssl/ca.pem \\
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \\
--etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
--etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem \\
--etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem \\
--etcd-servers=${ETCD_ENDPOINTS} \\
--enable-swagger-ui=true \\
--allow-privileged=true \\
--apiserver-count=3 \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/var/lib/audit.log \\
--event-ttl=1h \\
--v=2
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
- kube-apiserver 1.6 and later uses the etcd v3 API and storage format;
- --authorization-mode=RBAC enables RBAC authorization on the secure port and rejects unauthorized requests;
- kube-scheduler and kube-controller-manager are usually deployed on the same machine as kube-apiserver and talk to it over the insecure port;
- kubelet, kube-proxy and kubectl run on other Node machines; when they access kube-apiserver over the secure port they must first authenticate with a TLS certificate and then pass RBAC authorization;
- kube-proxy and kubectl pass RBAC authorization by way of the User and Group embedded in the certificates they use;
- if kubelet TLS bootstrapping is used, the --kubelet-certificate-authority, --kubelet-client-certificate and --kubelet-client-key options must not be set, otherwise kube-apiserver later fails to verify the kubelet certificate with an "x509: certificate signed by unknown authority" error;
- the --admission-control value must include ServiceAccount, otherwise deploying the cluster add-ons fails;
- --bind-address must not be 127.0.0.1;
- --service-cluster-ip-range specifies the Service Cluster IP range, which must not be routable;
- --service-node-port-range=${NODE_PORT_RANGE} specifies the NodePort port range;
- by default kubernetes objects are stored under the /registry path in etcd; this can be changed with the --etcd-prefix option;
See kube-apiserver.service for the complete unit.
Start kube-apiserver
$ sudo cp kube-apiserver.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-apiserver
$ sudo systemctl start kube-apiserver
$ sudo systemctl status kube-apiserver
$
Configure and start kube-controller-manager
Create the kube-controller-manager systemd unit file
$ cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
ExecStart=/root/local/bin/kube-controller-manager \\
--address=127.0.0.1 \\
--master=http://${MASTER_IP}:8080 \\
--allocate-node-cidrs=true \\
--service-cluster-ip-range=${SERVICE_CIDR} \\
--cluster-cidr=${CLUSTER_CIDR} \\
--cluster-name=kubernetes \\
--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \\
--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \\
--service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \\
--root-ca-file=/etc/kubernetes/ssl/ca.pem \\
--leader-elect=true \\
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
- --address must be 127.0.0.1, because the current kube-apiserver expects the scheduler and controller-manager to run on the same machine; otherwise:
$ kubectl get componentstatuses
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: getsockopt: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused
Reference: https://github.com/kubernetes-incubator/bootkube/issues/64
- --master=http://${MASTER_IP}:8080: talk to kube-apiserver over the insecure 8080 port;
- --cluster-cidr specifies the Pod CIDR of the cluster; this network must be routable between the Nodes (guaranteed by flanneld);
- --service-cluster-ip-range specifies the Service CIDR of the cluster; this network must not be routable between the Nodes, and the value must match the kube-apiserver parameter;
- the certificate and key given by --cluster-signing-* are used to sign the certificates and keys created for TLS bootstrapping;
- --root-ca-file is used to verify the kube-apiserver certificate; only when it is set is the CA certificate placed in the ServiceAccount of Pod containers;
- --leader-elect=true elects the single active kube-controller-manager process when the master consists of multiple machines;
Start kube-controller-manager
$ sudo cp kube-controller-manager.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-controller-manager
$ sudo systemctl start kube-controller-manager
$
Configure and start kube-scheduler
Create the kube-scheduler systemd unit file
$ cat > kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
ExecStart=/root/local/bin/kube-scheduler \\
--address=127.0.0.1 \\
--master=http://${MASTER_IP}:8080 \\
--leader-elect=true \\
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
- --address must be 127.0.0.1, because the current kube-apiserver expects the scheduler and controller-manager to run on the same machine;
- --master=http://${MASTER_IP}:8080: talk to kube-apiserver over the insecure 8080 port;
- --leader-elect=true elects the single active kube-scheduler process when the master consists of multiple machines;
See kube-scheduler.service for the complete unit.
Start kube-scheduler
$ sudo cp kube-scheduler.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-scheduler
$ sudo systemctl start kube-scheduler
$
Verify master node functionality
$ kubectl get componentstatuses
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
07 - Deploying the Nodes
Deploy the Nodes
A kubernetes Node runs the following components:
- flanneld
- docker
- kubelet
- kube-proxy
Variables used
The variables used in this document are defined as follows:
$ # replace with the IP of any kubernetes master machine
$ export MASTER_IP=192.168.1.206
$ export KUBE_APISERVER="https://${MASTER_IP}:6443"
$ # IP of the node being deployed
$ export NODE_IP=192.168.1.206
$ # import the other global variables used: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR, CLUSTER_DNS_SVC_IP, CLUSTER_DNS_DOMAIN, SERVICE_CIDR
$ source /root/local/bin/environment.sh
$
Install and configure flanneld (see 05 - Deploying the Flannel Network)
Install and configure docker
Download the latest docker binaries
$ wget https://get.docker.com/builds/Linux/x86_64/docker-17.04.0-ce.tgz
$ tar -xvf docker-17.04.0-ce.tgz
$ cp docker/docker* /root/local/bin
$ cp docker/completion/bash/docker /etc/bash_completion.d/
$
Create the docker systemd unit file
$ cat docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io
[Service]
Environment="PATH=/root/local/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/root/local/bin/dockerd --log-level=error $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
- dockerd invokes other docker binaries (such as docker-proxy) at runtime, so the directory containing the docker commands must be added to the PATH environment variable;
- when flanneld starts it writes the network configuration into the DOCKER_NETWORK_OPTIONS variable in /run/flannel/docker; passing that variable on the dockerd command line configures the docker0 bridge;
- if several EnvironmentFile options are specified, /run/flannel/docker must come last (so that docker0 uses the bip parameter generated by flanneld);
- the --iptables and --ip-masq options, which are enabled by default, must not be turned off;
- on newer kernels the overlay storage driver is recommended;
- starting with docker 1.13 the default policy of the iptables FORWARD chain may be set to DROP, which makes pinging Pod IPs on other Nodes fail; if this happens, set the policy back to ACCEPT manually:
$ sudo iptables -P FORWARD ACCEPT
$
- to speed up image pulls, a registry mirror inside China can be used and the download concurrency increased (if dockerd is already running it must be restarted for the change to take effect):
$ cat /etc/docker/daemon.json
{
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","hub-mirror.c.163.com"],
"max-concurrent-downloads": 10
}
See docker.service for the complete unit.
Start dockerd
$ sudo cp docker.service /etc/systemd/system/docker.service
$ sudo systemctl daemon-reload
$ sudo systemctl stop firewalld
$ sudo systemctl disable firewalld
$ sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
$ sudo systemctl enable docker
$ sudo systemctl start docker
$
- firewalld must be stopped, otherwise it may duplicate the iptables rules being created;
- it is best to clean out old iptables rules and chains;
Check the docker service
$ docker version
$
Install and configure kubelet
When kubelet starts it sends a TLS bootstrapping request to kube-apiserver, so the kubelet-bootstrap user from the bootstrap token file must first be bound to the system:node-bootstrapper role; only then may kubelet create certificate signing requests (CSRs). (This only needs to be run once, on the first node.)
$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
$
- --user=kubelet-bootstrap is the user name specified in /etc/kubernetes/token.csv, and it is also written into /etc/kubernetes/bootstrap.kubeconfig;
Download the latest kubelet and kube-proxy binaries
$ wget https://dl.k8s.io/v1.6.2/kubernetes-server-linux-amd64.tar.gz
$ tar -xzvf kubernetes-server-linux-amd64.tar.gz
$ cd kubernetes
$ tar -xzvf kubernetes-src.tar.gz
$ sudo cp -r ./server/bin/{kube-proxy,kubelet} /root/local/bin/
$
Create the kubelet bootstrapping kubeconfig file
$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=bootstrap.kubeconfig
$ # set client authentication parameters
$ kubectl config set-credentials kubelet-bootstrap \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=bootstrap.kubeconfig
$ # set context parameters
$ kubectl config set-context default \
--cluster=kubernetes \
--user=kubelet-bootstrap \
--kubeconfig=bootstrap.kubeconfig
$ # set the default context
$ kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
$ mv bootstrap.kubeconfig /etc/kubernetes/
- with --embed-certs=true the certificate-authority certificate is embedded into the generated bootstrap.kubeconfig file;
- no key or certificate is given when setting the kubelet client authentication parameters; they are generated later by kube-apiserver;
Create the kubelet systemd unit file
$ sudo mkdir /var/lib/kubelet  # the working directory must be created first
$ cat > kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service
[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/root/local/bin/kubelet \\
--address=${NODE_IP} \\
--hostname-override=${NODE_IP} \\
--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest \\
--experimental-bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
--require-kubeconfig \\
--cert-dir=/etc/kubernetes/ssl \\
--cluster-dns=${CLUSTER_DNS_SVC_IP} \\
--cluster-domain=${CLUSTER_DNS_DOMAIN} \\
--hairpin-mode promiscuous-bridge \\
--allow-privileged=true \\
--serialize-image-pulls=false \\
--logtostderr=true \\
--v=2
ExecStopPost=/sbin/iptables -A INPUT -s 10.0.0.0/8 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -s 172.16.0.0/12 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -s 192.168.0.0/16 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -p tcp --dport 4194 -j DROP
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
- --address must not be 127.0.0.1, otherwise Pods that later call the kubelet API fail, because 127.0.0.1 inside a Pod points to the Pod itself rather than to the kubelet;
- if --hostname-override is set, kube-proxy must be given the same value, otherwise the Node will not be found;
- --experimental-bootstrap-kubeconfig points to the bootstrap kubeconfig; kubelet uses the user name and token in that file to send the TLS bootstrapping request to kube-apiserver;
- after the administrator approves the CSR, kubelet automatically creates the certificate and private key (kubelet-client.crt and kubelet-client.key) in the --cert-dir directory and then writes the file given by --kubeconfig (creating it automatically);
- it is recommended to keep the kube-apiserver address in the --kubeconfig file; if --api-servers is not given, --require-kubeconfig must be set so the kube-apiserver address is read from that file, otherwise kubelet starts but cannot find kube-apiserver (the log reports no API Server found) and kubectl get nodes will not show the Node;
- --cluster-dns sets the kubedns Service IP (it can be allocated in advance and assigned to the kubedns service later), and --cluster-domain sets the domain suffix; both parameters must be set together to take effect;
- kubelet's cAdvisor listens on port 4194 on all interfaces by default, which is unsafe on machines with a public interface; the iptables rules in ExecStopPost only allow internal machines to reach port 4194;
See kubelet.service for the complete unit.
Start kubelet
$ sudo cp kubelet.service /etc/systemd/system/kubelet.service
$ sudo systemctl daemon-reload
$ sudo systemctl enable kubelet
$ sudo systemctl start kubelet
$ systemctl status kubelet
$
Approve the kubelet TLS certificate request
When kubelet starts for the first time it sends a certificate signing request to kube-apiserver; the request must be approved before kubernetes adds the Node to the cluster.
View pending CSR requests:
$ kubectl get csr
NAME AGE REQUESTOR CONDITION
csr-2b308 4m kubelet-bootstrap Pending
$ kubectl get nodes
No resources found.
Approve the CSR request:
$ kubectl certificate approve csr-2b308
certificatesigningrequest "csr-2b308" approved
$ kubectl get nodes
NAME STATUS AGE VERSION
10.64.3.7 Ready 49m v1.6.2
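When several nodes bootstrap at the same time there will be one pending CSR per node; a small helper sketch (not from the original guide, and it blindly trusts every pending request) that approves them all:
#!/usr/bin/bash
# approve every CSR that is still Pending
for csr in $(kubectl get csr | awk '/Pending/ {print $1}'); do
  kubectl certificate approve ${csr}
done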
The kubelet kubeconfig file and key pair are generated automatically:
$ ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2284 Apr 7 02:07 /etc/kubernetes/kubelet.kubeconfig
$ ls -l /etc/kubernetes/ssl/kubelet*
-rw-r--r-- 1 root root 1046 Apr 7 02:07 /etc/kubernetes/ssl/kubelet-client.crt
-rw------- 1 root root 227 Apr 7 02:04 /etc/kubernetes/ssl/kubelet-client.key
-rw-r--r-- 1 root root 1103 Apr 7 02:07 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1675 Apr 7 02:07 /etc/kubernetes/ssl/kubelet.key
Configure kube-proxy
Create the kube-proxy certificate
Create the kube-proxy certificate signing request:
$ cat kube-proxy-csr.json
{
"CN":"system:kube-proxy",
"hosts": [],
"key": {
"algo":"rsa",
"size": 2048
},
"names": [
{
"C":"CN",
"ST":"BeiJing",
"L":"BeiJing",
"O":"k8s",
"OU":"System"
}
]
}
- CN sets the certificate's User to system:kube-proxy;
- the kube-apiserver predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs;
- the hosts attribute is an empty list;
Generate the kube-proxy client certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
$ ls kube-proxy*
kube-proxy.csr kube-proxy-csr.json kube-proxy-key.pem kube-proxy.pem
$ sudo mv kube-proxy*.pem /etc/kubernetes/ssl/
$ rm kube-proxy.csr kube-proxy-csr.json
$
Create the kube-proxy kubeconfig file
$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kube-proxy.kubeconfig
$ # set client authentication parameters
$ kubectl config set-credentials kube-proxy \
--client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
--client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
$ # set context parameters
$ kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
$ # set the default context
$ kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
$ mv kube-proxy.kubeconfig /etc/kubernetes/
- --embed-certs is set to true for both the cluster and the client authentication parameters, which embeds the contents of the certificate-authority, client-certificate and client-key files into the generated kube-proxy.kubeconfig;
- the CN of the kube-proxy.pem certificate is system:kube-proxy; the kube-apiserver predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs;
Create the kube-proxy systemd unit file
$ sudo mkdir -p /var/lib/kube-proxy  # the working directory must be created first
$ cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/root/local/bin/kube-proxy \\
--bind-address=${NODE_IP} \\
--hostname-override=${NODE_IP} \\
--cluster-cidr=${SERVICE_CIDR} \\
--kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig \\
--logtostderr=true \\
--v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
- --hostname-override must have the same value as kubelet's, otherwise kube-proxy cannot find the Node after starting and will not create any iptables rules;
- --cluster-cidr must match the kube-apiserver --service-cluster-ip-range option;
- kube-proxy uses --cluster-cidr to tell cluster-internal traffic from external traffic; only when --cluster-cidr or --masquerade-all is set does kube-proxy SNAT requests to Service IPs;
- the config file given by --kubeconfig embeds the kube-apiserver address, user name, certificate and key used for requests and authentication;
- the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs;
See kube-proxy.service for the complete unit.
Start kube-proxy
$ sudo cp kube-proxy.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-proxy
$ sudo systemctl start kube-proxy
$ systemctl status kube-proxy
$
Verify cluster functionality
Definition file:
$ cat nginx-ds.yml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ds
  labels:
    app: nginx-ds
spec:
  type: NodePort
  selector:
    app: nginx-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nginx-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        app: nginx-ds
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Create the Pod and Service:
$ kubectl create -f nginx-ds.yml
service "nginx-ds" created
daemonset "nginx-ds" created
Check node status
$ kubectl get nodes
NAME STATUS AGE VERSION
192.168.1.206 Ready 1d v1.6.2
192.168.1.207 Ready 1d v1.6.2
192.168.1.208 Ready 1d v1.6.2
Everything is normal when all nodes are Ready.
Check Pod IP connectivity across Nodes
$ kubectl get pods -o wide | grep nginx-ds
nginx-ds-6ktz8 1/1 Running 0 5m 172.30.43.19 192.168.1.206
nginx-ds-6ktz9 1/1 Running 0 5m 172.30.44.20 192.168.1.207
The nginx-ds Pod IPs are 172.30.43.19 and 172.30.44.20; ping both IPs from every Node to confirm connectivity.
Check Service IP and port reachability
$ kubectl get svc | grep nginx-ds
nginx-ds 10.254.136.178 <nodes> 80:8744/TCP 11m
This shows:
- Service IP: 10.254.136.178
- Service port: 80
- NodePort: 8744
Run on every Node:
$ curl 10.254.136.178  # the Service IP from the `kubectl get svc | grep nginx-ds` output
$
The nginx welcome page is expected.
Check the Service's NodePort reachability
Run on every Node:
$ export NODE_IP=192.168.1.207  # IP of the current Node
$ export NODE_PORT=8744  # the NodePort mapped to port 80 in the `kubectl get svc | grep nginx-ds` output
$ curl ${NODE_IP}:${NODE_PORT}
$
The nginx welcome page is expected. A small loop for checking all nodes at once is sketched below.
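A minimal sketch for checking all nodes in one go, assuming the node IPs of this cluster and the NodePort shown above:
#!/usr/bin/bash
# curl the nginx-ds NodePort on every node and print the HTTP status code
NODE_PORT=8744
for ip in 192.168.1.206 192.168.1.207 192.168.1.208; do
  code=$(curl -s -o /dev/null -w '%{http_code}' http://${ip}:${NODE_PORT})
  echo "${ip}:${NODE_PORT} -> HTTP ${code}"
done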
08 - Deploying the DNS Add-on
Deploy the kubedns add-on
Official file directory: kubernetes/cluster/addons/dns
Files used:
$ ls *.yaml *.base
kubedns-cm.yaml kubedns-sa.yaml kubedns-controller.yaml.base kubedns-svc.yaml.base
The already-modified yaml files are under: dns.
The system's predefined RoleBinding
The predefined RoleBinding system:kube-dns binds the kube-dns ServiceAccount in the kube-system namespace to the system:kube-dns Role, which grants access to the kube-apiserver DNS-related APIs:
$ kubectl get clusterrolebindings system:kube-dns -o yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-06T17:40:47Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-dns
  resourceVersion: "56"
  selfLink: /apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings/system%3Akube-dns
  uid: 2b55cdbe-1af0-11e7-af35-8cdcd4b3be48
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-dns
subjects:
- kind: ServiceAccount
  name: kube-dns
  namespace: kube-system
The Pods defined in kubedns-controller.yaml use the kube-dns ServiceAccount defined in kubedns-sa.yaml, so they have access to the kube-apiserver DNS-related APIs.
Configure the kube-dns ServiceAccount
No changes needed.
Configure the kube-dns Service
$ diff kubedns-svc.yaml.base kubedns-svc.yaml
30c30
< clusterIP: __PILLAR__DNS__SERVER__
---
> clusterIP: 10.254.0.2
- spec.clusterIP must be set to the value of the cluster environment variable CLUSTER_DNS_SVC_IP, and this IP must match the kubelet --cluster-dns parameter;
Configure the kube-dns Deployment
$ diff kubedns-controller.yaml.base kubedns-controller.yaml
58c58
<       image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
---
>       image: xuejipeng/k8s-dns-kube-dns-amd64:v1.14.1
88c88
<       - --domain=__PILLAR__DNS__DOMAIN__.
---
>       - --domain=cluster.local.
92c92
<       __PILLAR__FEDERATIONS__DOMAIN__MAP__
---
>       #__PILLAR__FEDERATIONS__DOMAIN__MAP__
110c110
<       image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
---
>       image: xuejipeng/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
129c129
<       - --server=/__PILLAR__DNS__DOMAIN__/127.0.0.1#10053
---
>       - --server=/cluster.local./127.0.0.1#10053
148c148
<       image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1
---
>       image: xuejipeng/k8s-dns-sidecar-amd64:v1.14.1
161,162c161,162
<       - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,A
<       - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,A
---
>       - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
>       - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A
- --domain is the value of the cluster environment variable CLUSTER_DNS_DOMAIN;
- the kube-dns ServiceAccount, which the system has already bound with a RoleBinding, is used; it has access to the kube-apiserver DNS-related APIs;
Apply all definition files
$ pwd
/root/kubernetes-git/cluster/addons/dns
$ ls *.yaml
kubedns-cm.yaml kubedns-controller.yaml kubedns-sa.yaml kubedns-svc.yaml
$ kubectl create -f .
$
Check kubedns functionality
Create a new Deployment
$ cat my-nginx.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
$ kubectl create -f my-nginx.yaml
$
Expose the Deployment to create the my-nginx Service
$ kubectl expose deploy my-nginx
$ kubectl get services --all-namespaces | grep my-nginx
default my-nginx 10.254.86.48 <none> 80/TCP 1d
Create another Pod and check whether its /etc/resolv.conf contains the --cluster-dns and --cluster-domain configured on the kubelet, and whether the service name my-nginx resolves to the ClusterIP 10.254.86.48 shown above.
$ cat pod-nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
$ kubectl create -f pod-nginx.yaml
$ kubectl exec nginx -i -t -- /bin/bash
root@nginx:/# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local tjwq01.ksyun.com
options ndots:5
root@nginx:/# ping my-nginx
PING my-nginx.default.svc.cluster.local (10.254.86.48): 48 data bytes
^C--- my-nginx.default.svc.cluster.local ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
root@nginx:/# ping kubernetes
PING kubernetes.default.svc.cluster.local (10.254.0.1): 48 data bytes
^C--- kubernetes.default.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
root@nginx:/# ping kube-dns.kube-system.svc.cluster.local
PING kube-dns.kube-system.svc.cluster.local (10.254.0.2): 48 data bytes
^C--- kube-dns.kube-system.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
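The pings above already show that the names resolve to the expected ClusterIPs; the missing replies are normal, because Service IPs are virtual and generally do not answer ICMP. A resolution-only check inside the same Pod, assuming the image provides getent (glibc-based images such as nginx:1.7.9 do):
root@nginx:/# getent hosts my-nginx
root@nginx:/# getent hosts kubernetes.default.svc.cluster.local
root@nginx:/# getent hosts kube-dns.kube-system.svc.cluster.local
Each command should print the corresponding ClusterIP (10.254.86.48, 10.254.0.1 and 10.254.0.2).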
Attachments:
kubedns-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
kubedns-controller.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      containers:
      - name: kubedns
        image: xuejipeng/k8s-dns-kube-dns-amd64:v1.14.1
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        #__PILLAR__FEDERATIONS__DOMAIN__MAP__
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
      - name: dnsmasq
        image: xuejipeng/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --log-facility=-
        - --server=/cluster.local./127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
      - name: sidecar
        image: xuejipeng/k8s-dns-sidecar-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
      dnsPolicy: Default  # Don't use cluster DNS.
      serviceAccountName: kube-dns
kubedns-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
kubedns-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.254.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
09 - Deploying the Dashboard Add-on
Deploy the dashboard add-on
Official file directory: kubernetes/cluster/addons/dashboard
Files used:
$ ls *.yaml
dashboard-controller.yaml dashboard-rbac.yaml dashboard-service.yaml
- dashboard-rbac.yaml is a new file defining the RoleBinding used by the dashboard.
Because kube-apiserver has RBAC authorization enabled and the official dashboard-controller.yaml does not define an authorized ServiceAccount, later calls to the kube-apiserver APIs are rejected and the web UI shows an error.
The fix: define a ServiceAccount named dashboard and bind it to a ClusterRole (the attached dashboard-rbac.yaml binds it to cluster-admin); see the dashboard-rbac.yaml file for details.
The already-modified yaml files are under: dashboard.
Configure dashboard-service
$ diff dashboard-service.yaml.orig dashboard-service.yaml
10a11
>   type: NodePort
- the service type is set to NodePort so that the dashboard can be reached from outside the cluster at nodeIP:nodePort;
Configure dashboard-controller
20a21
> serviceAccountName: dashboard
23c24
<       image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
---
>       image: cokabug/kubernetes-dashboard-amd64:v1.6.0
- the custom ServiceAccount named dashboard is used;
Apply all definition files
$ pwd
/root/kubernetes/cluster/addons/dashboard
$ ls *.yaml
dashboard-controller.yaml dashboard-rbac.yaml dashboard-service.yaml
$ kubectl create -f .
$
Check the results
View the assigned NodePort
$ kubectl get services kubernetes-dashboard -n kube-system
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard 10.254.224.130 <nodes> 80:30312/TCP 25s
- NodePort 30312 is mapped to port 80 of the dashboard pod; a quick check is sketched below;
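A minimal reachability sketch, assuming the node IPs of this cluster and the NodePort shown above:
#!/usr/bin/bash
# curl the dashboard NodePort on every node; HTTP 200 means the UI is served
for ip in 192.168.1.206 192.168.1.207 192.168.1.208; do
  curl -s -o /dev/null -w "${ip}: %{http_code}\n" http://${ip}:30312/
done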
Check the controller
$ kubectl get deployment kubernetes-dashboard -n kube-system
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1 1 1 1 3m
$ kubectl get pods -n kube-system | grep dashboard
kubernetes-dashboard-1339745653-pmn6z 1/1 Running 0 4m
Access the dashboard
- the kubernetes-dashboard service exposes a NodePort, so the dashboard can be accessed at http://NodeIP:nodePort;
- through kube-apiserver;
- through kubectl proxy:
Accessing the dashboard through kubectl proxy
Start the proxy
$ kubectl proxy --address='192.168.1.206' --port=8086 --accept-hosts='^*$'
Starting to serve on 192.168.1.206:8086
- the --accept-hosts option is required, otherwise the browser shows "Unauthorized" when opening the dashboard page;
Browse to http://192.168.1.206:8086/ui , which automatically redirects to http://192.168.1.206:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default
Accessing the dashboard through kube-apiserver
Get the list of cluster service URLs
$ kubectl cluster-info
Kubernetes master is running at https://192.168.1.206:6443
KubeDNS is running at https://192.168.1.206:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://192.168.1.206:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
Because kube-apiserver has RBAC authorization enabled and the browser presents an anonymous certificate, access through the secure port fails authorization. Use the insecure port of kube-apiserver instead:
Browse to http://192.168.1.206:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
Because the Heapster add-on is not installed yet, the dashboard cannot show CPU and memory metric graphs for Pods and Nodes.
Attachments:
dashboard-controller.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: dashboard
      containers:
      - name: kubernetes-dashboard
        image: cokabug/kubernetes-dashboard-amd64:v1.6.0
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
dashboard-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: dashboard
subjects:
- kind: ServiceAccount
  name: dashboard
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
dashboard-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 80
    targetPort: 9090
10 - Deploying the Heapster Add-on
Download the latest heapster release from the heapster release page:
$ wget https://github.com/kubernetes/heapster/archive/v1.3.0.zip
$ unzip v1.3.0.zip
$ mv v1.3.0.zip heapster-1.3.0
$
Official file directory: heapster-1.3.0/deploy/kube-config/influxdb
$ cd heapster-1.3.0/deploy/kube-config/influxdb
$ ls *.yaml
grafana-deployment.yaml heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml
grafana-service.yaml heapster-rbac.yaml influxdb-cm.yaml influxdb-service.yaml
- heapster-rbac.yaml and influxdb-cm.yaml are new files that define the RoleBinding and the influxdb configuration respectively;
The already-modified yaml files are under: heapster.
Configure grafana-deployment
$ diff grafana-deployment.yaml.orig grafana-deployment.yaml
16c16
<         image: gcr.io/google_containers/heapster-grafana-amd64:v4.0.2
---
>         image: lvanneo/heapster-grafana-amd64:v4.0.2
40,41c40,41
<           # value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
<           value: /
---
>           value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
>           #value: /
- if the grafana dashboard will later be accessed through kube-apiserver or kubectl proxy, GF_SERVER_ROOT_URL must be set to /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/, otherwise grafana later reports that the page http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/api/dashboards/home cannot be found;
Configure heapster-deployment
$ diff heapster-deployment.yaml.orig heapster-deployment.yaml
13a14
>       serviceAccountName: heapster
16c17
<         image: gcr.io/google_containers/heapster-amd64:v1.3.0-beta.1
---
>         image: lvanneo/heapster-amd64:v1.3.0-beta.1
- the custom ServiceAccount named heapster is used;
Configure influxdb-deployment
influxdb officially recommends using the command line or the HTTP API to query the database; the admin UI is disabled by default since v1.1.0 and will be removed in a later release.
To enable the admin UI in the image: export the influxdb config file from the image, enable the admin plugin, write the modified config into a ConfigMap, and mount it into the container to override the original config. The steps are:
Note: there is no need to export, modify and create the ConfigMap yourself; the ConfigMap file under the manifests directory can be used directly.
$ # export the influxdb config file from the image
$ docker run --rm --entrypoint 'cat' -ti lvanneo/heapster-influxdb-amd64:v1.1.1 /etc/config.toml > config.toml.orig
$ cp config.toml.orig config.toml
$ # edit: enable the admin interface
$ vim config.toml
$ diff config.toml.orig config.toml
35c35
< enabled = false
---
> enabled = true
$ # write the modified config into a ConfigMap object
$ kubectl create configmap influxdb-config --from-file=config.toml -n kube-system
configmap "influxdb-config" created
$ # mount the config file from the ConfigMap into the Pod to override the original config
$ diff influxdb-deployment.yaml.orig influxdb-deployment.yaml
16c16
<         image: gcr.io/google_containers/heapster-influxdb-amd64:v1.1.1
---
>         image: lvanneo/heapster-influxdb-amd64:v1.1.1
19a20,21
> - mountPath: /etc/
>   name: influxdb-config
22a25,27
> - name: influxdb-config
> configMap:
> name: influxdb-config
Configure the monitoring-influxdb Service
$ diff influxdb-service.yaml.orig influxdb-service.yaml
12a13
>   type: NodePort
15a17,20
>     name: http
>   - port: 8083
>     targetPort: 8083
>     name: admin
- the service type is set to NodePort, and an extra mapping for the admin port is added so that the influxdb admin UI can be opened in a browser later;
Apply all definition files
$ pwd
/root/heapster-1.3.0/deploy/kube-config/influxdb
$ ls *.yaml
grafana-deployment.yaml heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml
grafana-service.yaml heapster-rbac.yaml influxdb-cm.yaml influxdb-service.yaml
$ kubectl create -f .
$
Check the results
Check the Deployments
$ kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
heapster 1 1 1 1 1m
monitoring-grafana 1 1 1 1 1m
monitoring-influxdb 1 1 1 1 1m
Check the Pods
$ kubectl get pods -n kube-system | grep -E 'heapster|monitoring'
heapster-3273315324-tmxbg 1/1 Running 0 11m
monitoring-grafana-2255110352-94lpn 1/1 Running 0 11m
monitoring-influxdb-884893134-3vb6n 1/1 Running 0 11m
Open the kubernetes dashboard and check whether CPU, memory and load utilization graphs are now shown for the Nodes and Pods;
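Heapster also feeds the metrics used by kubectl top, so a quick command-line check (assuming the heapster pod above is Running) is:
$ kubectl top node
$ kubectl top pod -n kube-system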
Access grafana
- Via kube-apiserver:
Get the monitoring-grafana service URL
$ kubectl cluster-info
Kubernetes master is running at https://10.64.3.7:6443
Heapster is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
monitoring-grafana is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
monitoring-influxdb is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
$
Because kube-apiserver has RBAC authorization enabled and the browser presents an anonymous certificate, requests to the secure port fail authorization; use the insecure port of kube-apiserver instead:
Browser URL: http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
- Via kubectl proxy:
Create the proxy
$ kubectl proxy --address='10.64.3.7' --port=8086 --accept-hosts='^*$'
Starting to serve on 10.64.3.7:8086
Browser URL: http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
Access the influxdb admin UI
Get the NodePort mapped to influxdb's http port 8086
$ kubectl get svc -n kube-system | grep influxdb
monitoring-influxdb 10.254.255.183 <nodes> 8086:8670/TCP,8083:8595/TCP 21m
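The same information can be pulled out with a jsonpath query instead of grep, which is handier in scripts; this assumes the port names http and admin defined in the service above:
$ kubectl get svc monitoring-influxdb -n kube-system -o jsonpath='{range .spec.ports[*]}{.name}={.nodePort}{"\n"}{end}'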
Open the influxdb admin UI through kube-apiserver's insecure port: http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/
In the page's "Connection Settings", enter a node IP as Host and the NodePort mapped to 8086 (8670 in the listing above) as Port, then click "Save":
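To confirm influxdb itself answers on that NodePort outside of the admin UI, you can query its HTTP API directly; 8670 is the example nodePort shown above and will differ in your cluster:
$ curl -G 'http://10.64.3.7:8670/query' --data-urlencode 'q=SHOW DATABASES'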
Attachments:
grafana-deployment.yaml
apiVersion: extensions/v1beta1 kind: Deployment metadata: name: monitoring-grafana namespace: kube-system spec: replicas: 1 template: metadata: labels: task: monitoring k8s-app: grafana spec: containers: - name: grafana image: lvanneo/heapster-grafana-amd64:v4.0.2 ports: - containerPort: 3000 protocol: TCP volumeMounts: - mountPath: /var name: grafana-storage env: - name: INFLUXDB_HOST value: monitoring-influxdb - name: GRAFANA_PORT value: "3000" # The following env variables are required to make Grafana accessible via # the kubernetes api-server proxy. On production clusters, we recommend # removing these env variables, setup auth for grafana, and expose the grafana # service using a LoadBalancer or a public IP. - name: GF_AUTH_BASIC_ENABLED value: "false" - name: GF_AUTH_ANONYMOUS_ENABLED value: "true" - name: GF_AUTH_ANONYMOUS_ORG_ROLE value: Admin - name: GF_SERVER_ROOT_URL # If you're only using the API Server proxy, set this value instead: value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/ #value: / volumes: - name: grafana-storage emptyDir: {} |
grafana-service.yaml
apiVersion: v1 kind: Service metadata: labels: # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons) # If you are NOT using this as an addon, you should comment out this line. kubernetes.io/cluster-service: 'true' kubernetes.io/name: monitoring-grafana name: monitoring-grafana namespace: kube-system spec: # In a production setup, we recommend accessing Grafana through an external Loadbalancer # or through a public IP. # type: LoadBalancer # You could also use NodePort to expose the service at a randomly-generated port ports: - port : 80 targetPort: 3000 selector: k8s-app: grafana |
heapster-deployment.yaml
apiVersion: extensions/v1beta1 kind: Deployment metadata: name: heapster namespace: kube-system spec: replicas: 1 template: metadata: labels: task: monitoring k8s-app: heapster spec: serviceAccountName: heapster containers: - name: heapster image: lvanneo/heapster-amd64:v1.3.0-beta.1 imagePullPolicy: IfNotPresent command: - /heapster - --source=kubernetes:https://kubernetes.default - --sink=influxdb:http://monitoring-influxdb:8086 |
heapster-rbac.yaml
apiVersion: v1 kind: ServiceAccount metadata: name: heapster namespace: kube-system
---
kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1alpha1 metadata: name: heapster subjects: - kind: ServiceAccount name: heapster namespace: kube-system roleRef: kind: ClusterRole name: system:heapster apiGroup: rbac.authorization.k8s.io |
heapster-service.yaml
apiVersion: v1 kind: Service metadata: labels: task: monitoring # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons) # If you are NOT using this as an addon, you should comment out this line. kubernetes.io/cluster-service: 'true' kubernetes.io/name: Heapster name: heapster namespace: kube-system spec: ports: - port: 80 targetPort: 8082 selector: k8s-app: heapster |
influxdb-cm.yaml
apiVersion: v1 kind: ConfigMap metadata: name: influxdb-config namespace: kube-system data: config.toml: | reporting-disabled = true bind-address = ":8088" [meta] dir = "/data/meta" retention-autocreate = true logging-enabled = true [data] dir = "/data/data" wal-dir = "/data/wal" query-log-enabled = true cache-max-memory-size = 1073741824 cache-snapshot-memory-size = 26214400 cache-snapshot-write-cold-duration = "10m0s" compact-full-write-cold-duration = "4h0m0s" max-series-per-database = 1000000 max-values-per-tag = 100000 trace-logging-enabled = false [coordinator] write-timeout = "10s" max-concurrent-queries = 0 query-timeout = "0s" log-queries-after = "0s" max-select-point = 0 max-select-series = 0 max-select-buckets = 0 [retention] enabled = true check-interval = "30m0s" [admin] enabled = true bind-address = ":8083" https-enabled = false https-certificate = "/etc/ssl/influxdb.pem" [shard-precreation] enabled = true check-interval = "10m0s" advance-period = "30m0s" [monitor] store-enabled = true store-database = "_internal" store-interval = "10s" [subscriber] enabled = true http-timeout = "30s" insecure-skip-verify = false ca-certs = "" write-concurrency = 40 write-buffer-size = 1000 [http] enabled = true bind-address = ":8086" auth-enabled = false log-enabled = true write-tracing = false pprof-enabled = false https-enabled = false https-certificate = "/etc/ssl/influxdb.pem" https-private-key = "" max-row-limit = 10000 max-connection-limit = 0 shared-secret = "" realm = "InfluxDB" unix-socket-enabled = false bind-socket = "/var/run/influxdb.sock" [[graphite]] enabled = false bind-address = ":2003" database = "graphite" retention-policy = "" protocol = "tcp" batch-size = 5000 batch-pending = 10 batch-timeout = "1s" consistency-level = "one" separator = "." udp-read-buffer = 0 [[collectd]] enabled = false bind-address = ":25826" database = "collectd" retention-policy = "" batch-size = 5000 batch-pending = 10 batch-timeout = "10s" read-buffer = 0 typesdb = "/usr/share/collectd/types.db" [[opentsdb]] enabled = false bind-address = ":4242" database = "opentsdb" retention-policy = "" consistency-level = "one" tls-enabled = false certificate = "/etc/ssl/influxdb.pem" batch-size = 1000 batch-pending = 5 batch-timeout = "1s" log-point-errors = true [[udp]] enabled = false bind-address = ":8089" database = "udp" retention-policy = "" batch-size = 5000 batch-pending = 10 read-buffer = 0 batch-timeout = "1s" precision = "" [continuous_queries] log-enabled = true enabled = true run-interval = "1s" |
influxdb-deployment.yaml
apiVersion: extensions/v1beta1 kind: Deployment metadata: name: monitoring-influxdb namespace: kube-system spec: replicas: 1 template: metadata: labels: task: monitoring k8s-app: influxdb spec: containers: - name: influxdb image: lvanneo/heapster-influxdb-amd64:v1.1.1 volumeMounts: - mountPath: /data name: influxdb-storage - mountPath: /etc/ name: influxdb-config volumes: - name: influxdb-storage emptyDir: {} - name: influxdb-config configMap: name: influxdb-config |
influxdb-service.yaml
apiVersion: v1 kind: Service metadata: labels: task: monitoring # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons) # If you are NOT using this as an addon, you should comment out this line. kubernetes.io/cluster-service: 'true' kubernetes.io/name: monitoring-influxdb name: monitoring-influxdb namespace: kube-system spec: type: NodePort ports: - port: 8086 targetPort: 8086 name: http - port: 8083 targetPort: 8083 name: admin selector: k8s-app: influxdb |
11-Deploy the EFK add-on
Deploy the EFK add-on
Official manifest directory: kubernetes/cluster/addons/fluentd-elasticsearch
$ ls *.yaml
es-controller.yaml es-rbac.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml fluentd-es-rbac.yaml
- es-rbac.yaml and fluentd-es-rbac.yaml are new files that define the Role and RoleBinding used by elasticsearch and fluentd;
The already-modified yaml files can be found under: EFK.
Configure es-controller.yaml
$ diff es-controller.yaml.orig es-controller.yaml
22a23
> serviceAccountName: elasticsearch
24c25
< - image: gcr.io/google_containers/elasticsearch:v2.4.1-2
---
> - image: onlyerich/elasticsearch:v2.4.1-2
Configure es-service.yaml
No changes needed;
Configure fluentd-es-ds.yaml
$ diff fluentd-es-ds.yaml.orig fluentd-es-ds.yaml
23a24
> serviceAccountName: fluentd
26c27
< image: gcr.io/google_containers/fluentd-elasticsearch:1.22
---
> image: onlyerich/fluentd-elasticsearch:1.22
Configure kibana-controller.yaml
$ diff kibana-controller.yaml.orig kibana-controller.yaml
22c22
< image: gcr.io/google_containers/kibana:v4.6.1-1
---
> image: onlyerich/kibana:v4.6.1-1
Label the Nodes
The DaemonSet fluentd-es-v1.22 is only scheduled onto Nodes carrying the label beta.kubernetes.io/fluentd-ds-ready=true, so the label has to be set on every Node that should run fluentd;
$ kubectl get nodes
NAME STATUS AGE VERSION
10.64.3.7 Ready 1d v1.6.2
$ kubectl label nodes 10.64.3.7 beta.kubernetes.io/fluentd-ds-ready=true
node "10.64.3.7" labeled
Apply the definition files
$ pwd
/root/kubernetes/cluster/addons/fluentd-elasticsearch
$ ls *.yaml
es-controller.yaml es-rbac.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml fluentd-es-rbac.yaml
$ kubectl create -f .
$
Check the results
$ kubectl get deployment -n kube-system | grep kibana
kibana-logging 1 1 1 1 2m
$ kubectl get pods -n kube-system | grep -E 'elasticsearch|fluentd|kibana'
elasticsearch-logging-v1-kwc9w 1/1 Running 0 4m
elasticsearch-logging-v1-ws9mk 1/1 Running 0 4m
fluentd-es-v1.22-g76x0 1/1 Running 0 4m
kibana-logging-324921636-ph7sn 1/1 Running 0 4m
$ kubectl get service -n kube-system | grep -E 'elasticsearch|kibana'
elasticsearch-logging 10.254.128.156 <none> 9200/TCP 3m
kibana-logging 10.254.88.109 <none> 5601/TCP 3m
On its first start the kibana Pod takes **quite a long time (10-20 minutes)** to optimize and cache the status pages; you can tail the Pod's log to watch the progress:
$ kubectl logs kibana-logging-324921636-ph7sn -n kube-system -f
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath:/api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-04-08T09:30:30Z","tags":["info","optimize"],"pid":7,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
{"type":"log","@timestamp":"2017-04-08T09:44:01Z","tags":["info","optimize"],"pid":7,"message":"Optimization of bundles for kibana and statusPage complete in 811.00 seconds"}
{"type":"log","@timestamp":"2017-04-08T09:44:02Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
Access kibana
- Via kube-apiserver:
Get the kibana-logging service URL
$ kubectl cluster-info
Kubernetes master is running at https://10.64.3.7:6443
Elasticsearch is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging
Heapster is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/heapster
Kibana is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging
KubeDNS is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
monitoring-grafana is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
monitoring-influxdb is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
Because kube-apiserver has RBAC authorization enabled and the browser presents an anonymous certificate, requests to the secure port fail authorization; use the insecure port of kube-apiserver instead:
Browser URL: http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/kibana-logging
- Via kubectl proxy:
Create the proxy
$ kubectl proxy --address='10.64.3.7' --port=8086 --accept-hosts='^*$'
Starting to serve on 10.64.3.7:8086
Browser URL: http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/kibana-logging
On the Settings -> Indices page create an index (roughly the equivalent of a database in mysql), tick "Index contains time-based events", keep the default logstash-* pattern and click Create;
A few minutes after the index has been created, the logs aggregated in ElasticSearch logging appear under the Discover menu;
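If Discover stays empty, it is worth checking whether fluentd has actually written anything into elasticsearch; the logstash-* indices can be listed through the same apiserver insecure-port proxy used above:
$ curl 'http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging/_cat/indices?v'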
Attachments:
es-controller.yaml
apiVersion: v1 kind: ReplicationController metadata: name: elasticsearch-logging-v1 namespace: kube-system labels: k8s-app: elasticsearch-logging version: v1 kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile spec: replicas: 2 selector: k8s-app: elasticsearch-logging version: v1 template: metadata: labels: k8s-app: elasticsearch-logging version: v1 kubernetes.io/cluster-service: "true" spec: serviceAccountName: elasticsearch containers: - image: onlyerich/elasticsearch:v2.4.1-2 name: elasticsearch-logging resources: # need more cpu upon initialization, therefore burstable class limits: cpu: 1000m requests: cpu: 100m ports: - containerPort: 9200 name: db protocol: TCP - containerPort: 9300 name: transport protocol: TCP volumeMounts: - name: es-persistent-storage mountPath: /data env: - name: "NAMESPACE" valueFrom: fieldRef: fieldPath: metadata.namespace volumes: - name: es-persistent-storage emptyDir: {} |
es-rbac.yaml
apiVersion: v1 kind: ServiceAccount metadata: name: elasticsearch namespace: kube-system
---
kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1alpha1 metadata: name: elasticsearch subjects: - kind: ServiceAccount name: elasticsearch namespace: kube-system roleRef: kind: ClusterRole name: view apiGroup: rbac.authorization.k8s.io |
es-service.yaml
apiVersion: v1 kind: Service metadata: name: elasticsearch-logging namespace: kube-system labels: k8s-app: elasticsearch-logging kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile kubernetes.io/name: "Elasticsearch" spec: ports: - port: 9200 protocol: TCP targetPort: db selector: k8s-app: elasticsearch-logging |
fluentd-es-ds.yaml
apiVersion: extensions/v1beta1 kind: DaemonSet metadata: name: fluentd-es-v1.22 namespace: kube-system labels: k8s-app: fluentd-es kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile version: v1.22 spec: template: metadata: labels: k8s-app: fluentd-es kubernetes.io/cluster-service: "true" version: v1.22 # This annotation ensures that fluentd does not get evicted if the node # supports critical pod annotation based priority scheme. # Note that this does not guarantee admission on the nodes (#40573). annotations: scheduler.alpha.kubernetes.io/critical-pod: '' spec: serviceAccountName: fluentd containers: - name: fluentd-es image: onlyerich/fluentd-elasticsearch:1.22 command: - '/bin/sh' - '-c' - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log' resources: limits: memory: 200Mi requests: cpu: 100m memory: 200Mi volumeMounts: - name: varlog mountPath: /var/log - name: varlibdockercontainers mountPath: /var/lib/docker/containers readOnly: true nodeSelector: beta.kubernetes.io/fluentd-ds-ready: "true" tolerations: - key : "node.alpha.kubernetes.io/ismaster" effect: "NoSchedule" terminationGracePeriodSeconds: 30 volumes: - name: varlog hostPath: path: /var/log - name: varlibdockercontainers hostPath: path: /var/lib/docker/containers |
fluentd-es-rbac.yaml
apiVersion: v1 kind: ServiceAccount metadata: name: fluentd namespace: kube-system
---
kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1alpha1 metadata: name: fluentd subjects: - kind: ServiceAccount name: fluentd namespace: kube-system roleRef: kind: ClusterRole name: view apiGroup: rbac.authorization.k8s.io |
kibana-controller.yaml
apiVersion: extensions/v1beta1 kind: Deployment metadata: name: kibana-logging namespace: kube-system labels: k8s-app: kibana-logging kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile spec: replicas: 1 selector: matchLabels: k8s-app: kibana-logging template: metadata: labels: k8s-app: kibana-logging spec: containers: - name: kibana-logging image: onlyerich/kibana:v4.6.1-1 resources: # keep request = limit to keep this container in guaranteed class limits: cpu: 100m requests: cpu: 100m env: - name: "ELASTICSEARCH_URL" value: "http://elasticsearch-logging:9200" - name: "KIBANA_BASE_URL" value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging" ports: - containerPort: 5601 name: ui protocol: TCP |
kibana-service.yaml
apiVersion: v1 kind: Service metadata: name: kibana-logging namespace: kube-system labels: k8s-app: kibana-logging kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile kubernetes.io/name: "Kibana" spec: ports: - port: 5601 protocol: TCP targetPort: ui selector: k8s-app: kibana-logging |
12-Deploy Docker Registry
Deploy a private docker registry
Note: this document describes deploying a private registry with the official docker registry v2 image; you can also deploy a Harbor private registry (see "Deploy the Harbor private registry").
This document walks through deploying a private docker registry with TLS encryption, HTTP Basic authentication and ceph rgw as the backend storage; if you use another type of backend storage you can start from the "Create the docker registry" section;
The two example machine IPs are:
- ceph rgw: 10.64.3.9
- docker registry: 10.64.3.7
Deploy the ceph RGW node
$ ceph-deploy rgw create 10.64.3.9 # rgw listens on port 7480 by default
$
Create the test account demo
$ radosgw-admin user create --uid=demo --display-name="cephrgw demo user"
$
Create the swift subuser of the demo account
The registry currently only supports accessing ceph rgw storage via the swift protocol; the s3 protocol is not supported yet;
$ radosgw-admin subuser create --uid demo --subuser=demo:swift --access=full --secret=secretkey --key-type=swift
$
Create the secret key for the demo:swift subuser
$ radosgw-admin key create --subuser=demo:swift --key-type=swift --gen-secret
{
"user_id":"demo",
"display_name":"cephrgw demo user",
"email":"",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [
{
"id":"demo:swift",
"permissions":"full-control"
}
],
"keys": [
{
"user":"demo",
"access_key":"5Y1B1SIJ2YHKEHO5U36B",
"secret_key":"nrIvtPqUj7pUlccLYPuR3ntVzIa50DToIpe7xFjT"
}
],
"swift_keys": [
{
"user":"demo:swift",
"secret_key":"aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"
}
],
"caps": [],
"op_mask":"read,write, delete",
"default_placement":"",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"temp_url_keys": []
}
- aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh is the secret key of the demo:swift subuser (verified with the swift client below);
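Before pointing the registry at rgw, the swift credentials can be checked with the python-swiftclient command-line tool; this is an optional sketch that assumes the client is installed (for example via pip install python-swiftclient) and uses the same auth URL as the registry configuration below:
$ swift -A http://10.64.3.9:7480/auth/v1 -U demo:swift -K aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh stat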
Create the docker registry
Create the TLS certificate used by the registry
$ mkdir -p registry/{auth,certs}
$ cat registry-csr.json
{
"CN":"registry",
"hosts": [
"127.0.0.1",
"10.64.3.7"
],
"key": {
"algo":"rsa",
"size": 2048
},
"names": [
{
"C":"CN",
"ST":"BeiJing",
"L":"BeiJing",
"O":"k8s",
"OU":"System"
}
]
}
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes registry-csr.json | cfssljson -bare registry
$ cp registry.pem registry-key.pem registry/certs
$
- The previously created CA certificate and key are reused here;
- The hosts field specifies the registry's NodeIP (it can be checked in the generated certificate, see below);
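You can confirm the NodeIP actually landed in the certificate's SAN list before handing it to the registry:
$ openssl x509 -in registry/certs/registry.pem -noout -text | grep -A1 'Subject Alternative Name'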
Create the HTTP Basic auth file
$ docker run --entrypoint htpasswd registry:2 -Bbn foo foo123 > auth/htpasswd
$ cat auth/htpasswd
foo:$2y$05$I60z69MdluAQ8i1Ka3x3Neb332yz1ioow2C4oroZSOE0fqPogAmZm
Configure the registry parameters
$ export RGW_AUTH_URL="http://10.64.3.9:7480/auth/v1"
$ export RGW_USER="demo:swift"
$ export RGW_SECRET_KEY="aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"
$ cat > config.yml << EOF
# https://docs.docker.com/registry/configuration/#list-of-configuration-options
version: 0.1
log:
level: info
formatter: text
fields:
service: registry
storage:
cache:
blobdescriptor: inmemory
delete:
enabled: true
swift:
authurl: ${RGW_AUTH_URL}
username: ${RGW_USER}
password: ${RGW_SECRET_KEY}
container: registry
auth:
htpasswd:
realm: basic-realm
path: /auth/htpasswd
http:
addr: 0.0.0.0:8000
headers:
X-Content-Type-Options:[nosniff]
tls:
certificate:/certs/registry.pem
key: /certs/registry-key.pem
health:
storagedriver:
enabled: true
interval: 10s
threshold: 3
EOF
- storage.swift configures a backend that speaks the swift protocol; here it carries the ceph rgw storage parameters;
- auth.htpasswd points to the HTTP Basic auth token file;
- http.tls points to the certificate and key files of the registry's http server;
Start the docker registry container
$ docker run -d -p 8000:8000 \
-v $(pwd)/registry/auth/:/auth \
-v $(pwd)/registry/certs:/certs \
-v $(pwd)/config.yml:/etc/docker/registry/config.yml \
--name registry registry:2
- The machine running this docker run command has IP 10.64.3.7 (a quick health check is shown below);
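Once the container is up, hitting the registry v2 API base endpoint with the CA created above and the foo account is a simple way to confirm that both TLS and Basic auth work; an empty JSON body {} means success:
$ curl --cacert /etc/kubernetes/ssl/ca.pem -u foo:foo123 https://10.64.3.7:8000/v2/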
Push an image to the registry
Copy the CA certificate that signed the registry certificate into the /etc/docker/certs.d/10.64.3.7:8000 directory
$ sudo mkdir -p /etc/docker/certs.d/10.64.3.7:8000
$ sudo cp /etc/kubernetes/ssl/ca.pem /etc/docker/certs.d/10.64.3.7:8000/ca.crt
$
Log in to the private registry
$ docker login 10.64.3.7:8000
Username: foo
Password:
Login Succeeded
The login credentials are written to the ~/.docker/config.json file (the auth value is decoded below)
$ cat ~/.docker/config.json
{
"auths": {
"10.64.3.7:8000": {
"auth":"Zm9vOmZvbzEyMw=="
}
}
}
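The auth value is simply the base64 encoding of username:password, which is easy to verify:
$ echo Zm9vOmZvbzEyMw== | base64 -d
foo:foo123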
Tag a local image with the private registry's address
$ docker tag docker.io/kubernetes/pause 10.64.3.7:8000/zhangjun3/pause
$ docker images | grep pause
docker.io/kubernetes/pause latest f9d5de079539 2 years ago 239.8kB
10.64.3.7:8000/zhangjun3/pause latest f9d5de079539 2 years ago 239.8 kB
Push the image to the private registry
$ docker push 10.64.3.7:8000/zhangjun3/pause
The push refers to a repository [10.64.3.7:8000/zhangjun3/pause]
5f70bf18a086: Pushed
e16a89738269: Pushed
latest: digest:sha256:9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359 size:916
Check whether the pushed pause image files have landed on ceph
$ rados lspools
rbd
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
default.rgw.users.uid
default.rgw.users.keys
default.rgw.users.swift
default.rgw.buckets.index
default.rgw.buckets.data
$ rados --pool default.rgw.buckets.data ls | grep pause
9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/f9d5de0795395db6c50cb1ac82ebed1bd8eb3eefcebb1aa724e01239594e937b/link
9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/f72a00a23f01987b42cb26f259582bb33502bdb0fcf5011e03c60577c4284845/link
9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4/link
9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/tags/latest/current/link
9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/tags/latest/index/sha256/9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359/link
9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/revisions/sha256/9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359/link
Operating the private registry
List the images in the private registry
$ curl --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/_catalog
{"repositories":["library/redis","zhangjun3/busybox","zhangjun3/pause","zhangjun3/pause2"]}
List the tags of an image
$ curl --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/tags/list
{"name":"zhangjun3/busybox","tags":["latest"]}
Get the digest of an image or layer
Send a GET request to v2/<repoName>/manifests/<tagName>; the image digest comes from the Docker-Content-Digest response header, and the layer digests come from fsLayers.blobSum in the response body (layers[].digest for a schema 2 manifest, as in the response below);
Note that the request must include the header Accept: application/vnd.docker.distribution.manifest.v2+json:
$ curl -v -H "Accept: application/vnd.docker.distribution.manifest.v2+json" --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/latest
> GET /v2/zhangjun3/busybox/manifests/latest HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.64.3.7:8000
> Accept: application/vnd.docker.distribution.manifest.v2+json
>
< HTTP/1.1 200 OK
< Content-Length: 527
< Content-Type: application/vnd.docker.distribution.manifest.v2+json
< Docker-Content-Digest: sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5
< Docker-Distribution-Api-Version: registry/2.0
< Etag: "sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5"
< X-Content-Type-Options: nosniff
< Date: Tue, 21 Mar 2017 15:19:42 GMT
<
{
"schemaVersion": 2,
"mediaType":"application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType":"application/vnd.docker.container.image.v1+json",
"size": 1465,
"digest":"sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff"
},
"layers": [
{
"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 701102,
"digest":"sha256:04176c8b224aa0eb9942af765f66dae866f436e75acef028fe44b8a98e045515"
}
]
}
Delete an image
Send a DELETE request to /v2/<name>/manifests/<reference>, where reference is the Docker-Content-Digest value returned in the previous step:
$ curl -X DELETE --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5
$
Delete a layer
Send a DELETE request to /v2/<name>/blobs/<digest>, where digest is the fsLayers.blobSum value returned in the previous step:
$ curl -X DELETE --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
$ curl -X DELETE --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/blobs/sha256:04176c8b224aa0eb9942af765f66dae866f436e75acef028fe44b8a98e045515
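Note that these DELETE calls only remove references inside the registry; the underlying blob data is not reclaimed until the registry's garbage collector runs. A hedged sketch (the garbage-collect subcommand ships with registry:2 images since 2.4, the binary path inside the image may vary, and it is safest to pause pushes while it runs):
$ docker exec registry /bin/registry garbage-collect /etc/docker/registry/config.yml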
Attachments:
config.yml (the version I mount from a local path, using filesystem storage)
version: 0.1
log:
  level: info
  formatter: text
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  delete:
    enabled: true
  filesystem:
    rootdirectory: /var/lib/registry
auth:
  htpasswd:
    realm: basic-realm
    path: /auth/htpasswd
http:
  addr: 0.0.0.0:8000
  headers:
    X-Content-Type-Options: [nosniff]
  tls:
    certificate: /certs/registry.pem
    key: /certs/registry-key.pem
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
13-Deploy the Harbor private registry
Deploy the harbor private registry
This document describes deploying a harbor private registry with docker-compose; you can also deploy a private registry with the official docker registry image (see "Deploy Docker Registry").
Variables used
The variables used in this document are defined as follows:
$ export NODE_IP=10.64.3.7 # IP of the node where harbor is being deployed
$
Download the files
Download the latest docker-compose binary from the docker compose release page
$ wget https://github.com/docker/compose/releases/download/1.12.0/docker-compose-Linux-x86_64
$ mv ~/docker-compose-Linux-x86_64 /root/local/bin/docker-compose
$ chmod a+x /root/local/bin/docker-compose
$ export PATH=/root/local/bin:$PATH
$
Download the latest harbor offline installer from the harbor release page
$ wget --continue https://github.com/vmware/harbor/releases/download/v1.1.0/harbor-offline-installer-v1.1.0.tgz
$ tar -xzvf harbor-offline-installer-v1.1.0.tgz
$ cd harbor
$
Import the docker images
Import the harbor docker images contained in the offline installer:
$ docker load -i harbor.v1.1.0.tar.gz
$
Create the TLS certificate used by the harbor nginx server
Create the harbor certificate signing request:
$ cat > harbor-csr.json << EOF
{
"CN": "harbor",
"hosts": [
"127.0.0.1",
"$NODE_IP"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
- The hosts field authorizes the IP of the current deployment node to use the certificate; if harbor will later be accessed by domain name, add the domain name here as well;
Generate the harbor certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
-ca-key=/etc/kubernetes/ssl/ca-key.pem \
-config=/etc/kubernetes/ssl/ca-config.json \
-profile=kubernetes harbor-csr.json | cfssljson -bare harbor
$ ls harbor*
harbor.csr harbor-csr.json harbor-key.pem harbor.pem
$ sudo mkdir -p /etc/harbor/ssl
$ sudo mv harbor*.pem /etc/harbor/ssl
$ rm harbor.csr harbor-csr.json
Modify the harbor.cfg file
$ diff harbor.cfg.orig harbor.cfg
5c5
< hostname = reg.mydomain.com
---
> hostname = 10.64.3.7
9c9
< ui_url_protocol = http
---
> ui_url_protocol = https
24,25c24,25
< ssl_cert = /data/cert/server.crt
< ssl_cert_key = /data/cert/server.key
---
> ssl_cert = /etc/harbor/ssl/harbor.pem
> ssl_cert_key = /etc/harbor/ssl/harbor-key.pem
Load and start the harbor images
$ mkdir -p /data
$ ./install.sh
[Step 0]: checking installation environment ...
Note: docker version: 17.04.0
Note: docker-compose version: 1.12.0
[Step 1]: loading Harbor images ...
Loaded image: vmware/harbor-adminserver:v1.1.0
Loaded image: vmware/harbor-ui:v1.1.0
Loaded image: vmware/harbor-log:v1.1.0
Loaded image: vmware/harbor-jobservice:v1.1.0
Loaded image: vmware/registry:photon-2.6.0
Loaded image: vmware/harbor-notary-db:mariadb-10.1.10
Loaded image: vmware/harbor-db:v1.1.0
Loaded image: vmware/nginx:1.11.5-patched
Loaded image: photon:1.0
Loaded image: vmware/notary-photon:server-0.5.0
Loaded image: vmware/notary-photon:signer-0.5.0
[Step 2]: preparing environment ...
Generated and saved secret to file: /data/secretkey
Generated configuration file: ./common/config/nginx/nginx.conf
Generated configuration file: ./common/config/adminserver/env
Generated configuration file: ./common/config/ui/env
Generated configuration file: ./common/config/registry/config.yml
Generated configuration file: ./common/config/db/env
Generated configuration file: ./common/config/jobservice/env
Generated configuration file: ./common/config/jobservice/app.conf
Generated configuration file: ./common/config/ui/app.conf
Generated certificate, key file: ./common/config/ui/private_key.pem, cert file: ./common/config/registry/root.crt
The configuration files are ready, please use docker-compose to start the service.
[Step 3]: checking existing instance of Harbor ...
[Step 4]: starting Harbor...
Creating network "harbor_harbor" with the default driver
Creating harbor-log
Creating registry
Creating harbor-adminserver
Creating harbor-db
Creating harbor-ui
Creating harbor-jobservice
Creating nginx
✔ ----Harbor has been installed and started successfully.----
Now you should be able to visit the admin portal at https://10.64.3.7.
For more details, please visit https://github.com/vmware/harbor.
Access the management UI
Open https://${NODE_IP} in a browser, in this example https://10.64.3.7
Log in with the account admin and the default password Harbor12345 from the harbor.cfg configuration file:
Files and directories created by harbor at runtime
$ # log directory
$ ls /var/log/harbor/2017-04-19/
adminserver.log jobservice.log mysql.log proxy.log registry.log ui.log
$ # data directory, containing the database and the image repository
$ ls /data/
ca_download config database job_logs registry secretkey
Log in with the docker client
Copy the CA certificate that signed the harbor certificate into the /etc/docker/certs.d/10.64.3.7 directory
$ sudo mkdir -p /etc/docker/certs.d/10.64.3.7
$ sudo cp /etc/kubernetes/ssl/ca.pem /etc/docker/certs.d/10.64.3.7/ca.crt
$
Log in to harbor
$ docker login 10.64.3.7
Username: admin
Password:
The credentials are saved to the ~/.docker/config.json file automatically.
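After logging in, pushing works just like with the plain registry: tag the image with the harbor address plus a project name and push it. The example below uses library, the public project harbor creates by default; any other project has to be created in the UI first:
$ docker tag docker.io/kubernetes/pause 10.64.3.7/library/pause
$ docker push 10.64.3.7/library/pause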
Other operations
The working directory for all of the commands below is the harbor directory created by unpacking the offline installer.
$ # stop harbor
$ docker-compose down -v
$ # edit the configuration
$ vim harbor.cfg
$ # propagate the modified configuration to the files used by docker-compose
[root@tjwq01-sys-bs003007 harbor]# ./prepare
Clearing the configuration file: ./common/config/ui/app.conf
Clearing the configuration file: ./common/config/ui/env
Clearing the configuration file: ./common/config/ui/private_key.pem
Clearing the configuration file: ./common/config/db/env
Clearing the configuration file: ./common/config/registry/root.crt
Clearing the configuration file: ./common/config/registry/config.yml
Clearing the configuration file: ./common/config/jobservice/app.conf
Clearing the configuration file: ./common/config/jobservice/env
Clearing the configuration file: ./common/config/nginx/cert/admin.pem
Clearing the configuration file: ./common/config/nginx/cert/admin-key.pem
Clearing the configuration file: ./common/config/nginx/nginx.conf
Clearing the configuration file: ./common/config/adminserver/env
loaded secret from file: /data/secretkey
Generated configuration file: ./common/config/nginx/nginx.conf
Generated configuration file: ./common/config/adminserver/env
Generated configuration file: ./common/config/ui/env
Generated configuration file: ./common/config/registry/config.yml
Generated configuration file: ./common/config/db/env
Generated configuration file: ./common/config/jobservice/env
Generated configuration file: ./common/config/jobservice/app.conf
Generated configuration file: ./common/config/ui/app.conf
Generated certificate, key file: ./common/config/ui/private_key.pem, cert file: ./common/config/registry/root.crt
The configuration files are ready, please use docker-compose to start the service.
$ # start harbor
[root@tjwq01-sys-bs003007 harbor]# docker-compose up -d
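docker-compose can also report the state of harbor's containers, which is a quick way to verify the restart (run it from the same harbor directory):
$ docker-compose ps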
Attachments:
harbor.cfg
hostname = 192.168.1.206
ui_url_protocol = https
db_password = root123
max_job_workers = 3
customize_crt = on
ssl_cert = /etc/harbor/ssl/harbor.pem
ssl_cert_key = /etc/harbor/ssl/harbor-key.pem
secretkey_path = /data
admiral_url = NA
email_identity =
email_server = smtp.mydomain.com
email_server_port = 25
email_username = sample_admin@mydomain.com
email_password = abc
email_from = admin <sample_admin@mydomain.com>
email_ssl = false
harbor_admin_password = Harbor12345
auth_mode = db_auth
ldap_url = ldaps://ldap.mydomain.com
ldap_basedn = ou=people,dc=mydomain,dc=com
ldap_uid = uid
ldap_scope = 3
ldap_timeout = 5
self_registration = on
token_expiration = 30
project_creation_restriction = everyone
verify_remote_cert = on
14-Clean up the cluster
Clean up the cluster
Clean up the Nodes
Stop the related processes:
$ sudo systemctl stop kubelet kube-proxy flanneld docker
$
Clean up files:
$ # umount the directories mounted by kubelet
$ mount | grep '/var/lib/kubelet' | awk '{print $3}' | xargs sudo umount
$ # delete the kubelet working directory
$ sudo rm -rf /var/lib/kubelet
$ # delete the docker working directory
$ sudo rm -rf /var/lib/docker
$ # delete the network config file written by flanneld
$ sudo rm -rf /var/run/flannel/
$ # delete docker runtime files
$ sudo rm -rf /var/run/docker/
$ # delete the systemd unit files
$ sudo rm -rf /etc/systemd/system/{kubelet,docker,flanneld}.service
$ # delete the binaries
$ sudo rm -rf /root/local/bin/{kubelet,docker,flanneld}
$ # delete the certificate files
$ sudo rm -rf /etc/flanneld/ssl /etc/kubernetes/ssl
$
Clean up the iptables rules created by kube-proxy and docker:
$ sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
$
Delete the bridges created by flanneld and docker:
$ ip link del flannel.1
$ ip link del docker0
$
Clean up the Master node
Stop the related processes:
$ sudo systemctl stop kube-apiserver kube-controller-manager kube-scheduler
$
Clean up files:
$ # delete the kube-apiserver working directory
$ sudo rm -rf /var/run/kubernetes
$ # delete the systemd unit files
$ sudo rm -rf /etc/systemd/system/{kube-apiserver,kube-controller-manager,kube-scheduler}.service
$ # delete the binaries
$ sudo rm -rf /root/local/bin/{kube-apiserver,kube-controller-manager,kube-scheduler}
$ # delete the certificate files
$ sudo rm -rf /etc/flanneld/ssl /etc/kubernetes/ssl
$
Clean up the etcd cluster
Stop the related processes:
$ sudo systemctl stop etcd
$
Clean up files:
$ # delete the etcd working and data directories
$ sudo rm -rf /var/lib/etcd
$ # delete the systemd unit file
$ sudo rm -rf /etc/systemd/system/etcd.service
$ # delete the binary
$ sudo rm -rf /root/local/bin/etcd
$ # delete the TLS certificate files
$ sudo rm -rf /etc/etcd/ssl/*