K8S Cluster Setup Notes
1. Creating and Configuring the Master
1.1 Environment Preparation
Switch to the root user and prepare the following environment.
- Install Docker
# First remove any old versions
sudo apt-get remove docker docker-engine docker.io containerd runc
# Update the package index
sudo apt-get update
# Allow apt to install packages from a repository over HTTPS
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
# Add Docker's official GPG key (via the Aliyun mirror)
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# Verify the key fingerprint
sudo apt-key fingerprint 0EBFCD88
# Add the stable repository
sudo add-apt-repository \
"deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
# Install a specific docker-ce version (list available versions with apt-cache madison docker-ce; 17.03.0~ce-0~ubuntu-xenial is used here)
sudo apt-get install docker-ce=17.03.0~ce-0~ubuntu-xenial
# Add your non-root user to the docker group so docker can run without sudo
sudo gpasswd -a <username> docker
# Restart the service and refresh the docker group membership
sudo service docker restart
newgrp - docker
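# Optional sanity check that Docker is installed and can run containers
docker version
docker run --rm hello-world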
- Install kubeadm
- Step 1: disable the firewall and turn off swap
ufw disable
swapoff -a
# Comment out the swap line in /etc/fstab so swap is not re-enabled at boot
vi /etc/fstab
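# Or comment it out non-interactively (a rough sketch; this comments every fstab line containing "swap")
sed -i '/swap/ s/^/#/' /etc/fstab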
- Step 2: configure the package source (note: the Aliyun repository below is the Ubuntu 16.04 xenial one)
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat << EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
- Step 3: install kubeadm, kubelet, and kubectl
apt-get update
apt-get install -y kubelet kubeadm kubectl
- Step 4: enable kubelet to start on boot and start it now
systemctl enable kubelet && systemctl start kubelet
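# Optional: hold the packages so a later apt upgrade does not unexpectedly change cluster component versions
apt-mark hold kubelet kubeadm kubectl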
1.2 Initializing the Master Node
1.2.1 Modify the master node configuration
- (1) Export the default configuration file
kubeadm config print init-defaults > kubeadm.yml
- (2) Edit the following fields in kubeadm.yml
apiVersion: kubeadm.k8s.io/v1beta1
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # Change to the master node's IP
  advertiseAddress: 192.168.141.130
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: kubernetes-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: ""
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
# Google's registry is not reachable from mainland China; use the Aliyun mirror instead
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
# Set the Kubernetes version to install
kubernetesVersion: v1.14.1
networking:
  dnsDomain: cluster.local
  # Use Calico's default pod network CIDR
  podSubnet: "192.168.0.0/16"
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
# Enable IPVS mode
# (for Kubernetes versions before 1.19)
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs
# (for Kubernetes 1.20 and later, the SupportIPVSProxyMode feature gate no longer exists; use the same block without featureGates)
...
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
...
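IPVS mode also requires the IPVS kernel modules and tools on every node; if they are missing, kube-proxy falls back to iptables mode. A minimal check/load sketch (module names assume a stock Ubuntu kernel; on newer kernels nf_conntrack replaces nf_conntrack_ipv4):
apt-get install -y ipset ipvsadm
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do modprobe $m; done
lsmod | grep -e ip_vs -e nf_conntrack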
Note: how Kubernetes enables IPVS mode differs across versions (https://www.cnblogs.com/zhangsi-lzq/p/14279997.html)
- Before 1.19, enabling IPVS mode in a kubeadm deployment requires adding the following to the init configuration file:
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs
- In 1.20, a kubeadm cluster initialization still completes, but when checking the pods you will see that kube-proxy fails to run; part of the error output looks like this:
# Check the kube-proxy logs
kubectl logs kube-proxy-l9twb -n kube-system
F0114 12:58:34.042769 1 server.go:488] failed complete: unrecognized feature gate: SupportIPVSProxyMode
goroutine 1 [running]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0xc00000e001, 0xc0004b6000, 0x6e, 0xc0)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1026 +0xb9
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x29b65c0, 0xc000000003, 0x0, 0x0, 0xc0003d8230, 0x28edbc9, 0x9, 0x1e8, 0x0)
Delete the offending kube-proxy settings from the ConfigMap
kube-proxy's configuration file is mounted into the container from a ConfigMap, so editing the corresponding ConfigMap entry is enough to remove the invalid field:
kubectl get cm -n kube-system
NAME DATA AGE
coredns 1 5h18m
extension-apiserver-authentication 6 5h18m
kube-proxy 2 5h18m
kube-root-ca.crt 1 5h18m
kubeadm-config 2 5h18m
kubelet-config-1.20 1 5h18m
kubectl edit cm kube-proxy -n kube-system
# In the editor find the following fields, delete them, then save and exit
featureGates:
  SupportIPVSProxyMode: true
Then delete all the kube-proxy pods so they are recreated with the new configuration, and check that they come back up.
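One way to do that, assuming the kube-proxy pods carry the default k8s-app=kube-proxy label:
kubectl delete pod -n kube-system -l k8s-app=kube-proxy
# The DaemonSet recreates them immediately; confirm they reach Running
kubectl get pod -n kube-system -o wide | grep kube-proxy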
- (3) Pull the images
kubeadm config images pull --config kubeadm.yml
1.2.2 Initialize the master node with kubeadm init
This command initializes the cluster from the configuration file prepared above. The --upload-certs flag (called --experimental-upload-certs in older kubeadm releases) uploads the control-plane certificates so they can be distributed automatically when additional nodes join, and the trailing tee kubeadm-init.log saves the output to a log file:
kubeadm init --config=kubeadm.yml --upload-certs | tee kubeadm-init.log
If initialization fails midway, or you want to change the configuration, run kubeadm reset to reset the node and then run kubeadm init again; each slave node must also run kubeadm reset and re-join afterwards.
When re-initializing, delete the $HOME/.kube directory first, otherwise you will hit the error: Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
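A rough re-initialization sequence on the master would therefore look like this (a sketch; run as root and adjust to your setup):
kubeadm reset -f
rm -rf $HOME/.kube
kubeadm init --config=kubeadm.yml --upload-certs | tee kubeadm-init.log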
On success the output looks like this:
[init] Using Kubernetes version: v1.14.1
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.141.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kubernetes-master localhost] and IPs [192.168.141.130 127.0.0.1 ::1]
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kubernetes-master localhost] and IPs [192.168.141.130 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 20.003326 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in ConfigMap "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
2cd5b86c4905c54d68cc7dfecc2bf87195e9d5d90b4fff9832d9b22fc5e73f96
[mark-control-plane] Marking the node kubernetes-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kubernetes-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
# Worker nodes will later join the cluster with the following command
kubeadm join 192.168.141.130:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:cab7c86212535adde6b8d1c7415e81847715cfc8629bb1d270b601744d662515
1.2.3 Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# Run as a non-root user
chown $(id -u):$(id -g) $HOME/.kube/config
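A quick way to confirm that kubectl can now reach the API server:
kubectl cluster-info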
1.2.4 Verify the master configuration
kubectl get node
# If node information like the following is printed, the master was initialized successfully (it stays NotReady until a network plugin is installed in section 3)
NAME STATUS ROLES AGE VERSION
kubernetes-master NotReady master 8m40s v1.14.1
1.2.5 The kubeadm-init.log file
This log file is produced by kubeadm init; it contains the token information needed later when adding slave nodes.
1.2.6 View the master's token information
kubeadm token list
TOKEN                     TTL   EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
8ewj1p.9r9hcjoqgajrj4gi   23h   2018-06-12T02:51:28Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.  system:bootstrappers:kubeadm:default-node-token
1.2.7 Recreate a token
kubeadm token create
5didvk.d09sbcov8ph2amjw
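If you only need the full join command for new nodes, kubeadm can generate a token and print the command in one step:
kubeadm token create --print-join-command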
1.2.8 Obtain the --discovery-token-ca-cert-hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
openssl dgst -sha256 -hex | sed 's/^.* //'
8cb2de97839780a412b93877f8507ad6c94f73add17d5d7058e91741c9d5ec78
2. Configuring Slave Nodes and Joining the Cluster
2.1 Slave node environment preparation
Follow the master node's environment preparation: install Docker, kubeadm, and the related components.
2.2 Join a slave node with kubeadm join
First obtain the token and the discovery-token-ca-cert-hash on the master node, then on the slave node switch to the root user and run kubeadm join in the form below, replacing the parameters with your own values:
kubeadm join 192.168.141.130:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:cab7c86212535adde6b8d1c7415e81847715cfc8629bb1d270b601744d662515
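After the join completes, confirm on the master that the new node appears in the node list:
kubectl get nodes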
3. Configuring the Cluster Network
Container networking is the mechanism by which a container connects to other containers, the host, and external networks. Container runtimes offer various network modes, and CNI (Container Network Interface) defines a standard, general-purpose interface between platforms and network solutions: container platforms such as Docker, Kubernetes, and Mesos can all use network solutions such as Flannel, Calico, and Weave, because any solution that implements the standard interface can provide networking to any platform that speaks the same protocol. CNI is exactly that standard interface specification.
In Kubernetes, the kubelet invokes the CNI plugins it finds at the appropriate time to configure networking automatically for the pods it starts.
Common CNI plugins for Kubernetes include:
- Flannel
- Calico
- Canal
- Weave
3.1 Check pod status
kubectl get pod -n kube-system -o wide
3.2 Install the Calico network plugin
Calico provides a secure networking solution for containers and virtual machines. It has been validated at production scale (in public clouds and across clusters with thousands of nodes) and integrates with Kubernetes, OpenShift, Docker, Mesos, DC/OS, and OpenStack.
Calico also enforces network security policy dynamically: using Calico's simple policy language, you can apply fine-grained control over communication between containers, virtual machine workloads, and bare-metal host endpoints.
3.2.1 Install Calico in the cluster
- Install the network plugin
# The "master" manifest installs the latest Calico build; you can also pin a specific version, but check version compatibility
kubectl apply -f https://docs.projectcalico.org/master/manifests/calico.yaml
# Watch until the network plugin pods are running
watch kubectl get pods --all-namespaces
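If you would rather pin a Calico release than track the master manifest, the versioned manifest URL follows the same pattern; v3.18 below is only an illustrative assumption, so check the Calico release notes for the version matching your Kubernetes version:
kubectl apply -f https://docs.projectcalico.org/v3.18/manifests/calico.yaml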
3.2.2 Verify that Calico is running
kubectl get pods --all-namespaces
4. Checking k8s Pod Status
4.1 List the current pods
kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-bb49cbdfb-j4dt2 1/1 Running 0 41s 192.168.189.1 computer-11 <none> <none>
calico-node-6qvzc 1/1 Running 0 41s 192.168.1.9 computer-9 <none> <none>
calico-node-hwnhr 1/1 Running 0 41s 192.168.1.11 computer-11 <none> <none>
calico-node-vtwzm 1/1 Running 0 41s 192.168.1.10 computer-10 <none> <none>
coredns-7f89b7bc75-tkmms 1/1 Running 0 45m 192.168.198.2 computer-9 <none> <none>
coredns-7f89b7bc75-vv96v 1/1 Running 0 45m 192.168.198.1 computer-9 <none> <none>
etcd-computer-9 1/1 Running 0 45m 192.168.1.9 computer-9 <none> <none>
kube-apiserver-computer-9 1/1 Running 0 45m 192.168.1.9 computer-9 <none> <none>
kube-controller-manager-computer-9 1/1 Running 0 45m 192.168.1.9 computer-9 <none> <none>
kube-proxy-d872f 1/1 Running 0 27m 192.168.1.10 computer-10 <none> <none>
kube-proxy-ft7kl 1/1 Running 0 45m 192.168.1.9 computer-9 <none> <none>
kube-proxy-n9fnj 1/1 Running 0 28m 192.168.1.11 computer-11 <none> <none>
kube-scheduler-computer-9 1/1 Running 0 45m 192.168.1.9 computer-9 <none> <none>
4.2 Inspect a pod's status
kubectl describe pod calico-node-fkfgd -n kube-system
Name: calico-node-fkfgd
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: computer-9/192.168.1.9
Start Time: Fri, 29 Jan 2021 16:48:52 +0800
Labels: controller-revision-hash=74dc975d6d
k8s-app=calico-node
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 192.168.1.9
IPs:
IP: 192.168.1.9
Controlled By: DaemonSet/calico-node
Init Containers:
upgrade-ipam:
Container ID: docker://c8f04bef35ce1fa59d7d0feb98fe9908e838fec01dac72e5d92d2661f8f865f9
Image: docker.io/calico/cni:master
Image ID: docker-
........................
4.3 View a pod's logs
kubectl logs calico-node-fkfgd -n kube-system
2021-02-01 02:24:57.882 [INFO][9] startup/startup.go 383: Early log level set to info
2021-02-01 02:24:57.882 [INFO][9] startup/startup.go 399: Using NODENAME environment for node name computer-9
2021-02-01 02:24:57.882 [INFO][9] startup/startup.go 411: Determined node name: computer-9
2021-02-01 02:24:57.882 [INFO][9] startup/startup.go 103: Starting node computer-9 with version v3.18.0-0.dev-102-ge0f7235846ba
2021-02-01 02:24:57.884 [INFO][9] startup/startup.go 443: Checking datastore connection
2021-02-01 02:25:27.885 [INFO][9] startup/startup.go 458: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
2021-02-01 02:25:58.886 [INFO][9] startup/startup.go 458: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
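The timeouts above mean calico-node cannot reach the API server through the cluster Service IP (10.96.0.1). Two quick checks (a sketch; any HTTP response from curl proves reachability, while a timeout points to firewall or kube-proxy problems):
kubectl get svc kubernetes -n default
curl -k https://10.96.0.1:443/version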
References:
https://blog.csdn.net/csdn_welearn/article/details/91419124
https://www.cnblogs.com/zhangsi-lzq/p/14279997.html