AlertManager DingTalk alerting; alert grouping and inhibition; viewing Kubernetes cluster resource usage and logs; maintaining the cluster CA certificates; upgrading a Kubernetes cluster (with hands-on practice)
I. Configuring DingTalk Alerts with AlertManager
1. Create a DingTalk robot
Group --> Settings --> Group Assistant --> Add Robot
Security settings: choose "Sign" (加签)
Copy the webhook URL and the signing secret; they will be used in the config file later.
2. Deploy prometheus-webhook-dingtalk
prometheus-webhook-dingtalk is a plugin that forwards alerts to DingTalk.
GitHub: https://github.com/timonwong/prometheus-webhook-dingtalk
Note: the steps below deploy it as a binary on the host, not inside Kubernetes.
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.0.0/prometheus-webhook-dingtalk-2.0.0.linux-amd64.tar.gz
tar zxf prometheus-webhook-dingtalk-2.0.0.linux-amd64.tar.gz -C /opt
ln -s /opt/prometheus-webhook-dingtalk-2.0.0.linux-amd64 /opt/prometheus-webhook-dingtalk
Define a systemd unit to manage the service
vi /lib/systemd/system/prometheus-webhook.service
[Unit]
Description=Prometheus Dingding Webhook
[Service]
ExecStart=/opt/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk --config.file=/opt/prometheus-webhook-dingtalk/config.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
vi /opt/prometheus-webhook-dingtalk/config.yml
## Request timeout
# timeout: 5s
## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true
## Customizable templates path
templates:
- /opt/prometheus-webhook-dingtalk/ding.tmpl
## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
# title: '{{ template "legacy.title" . }}'
# text: '{{ template "legacy.content" . }}'
## Targets, previously was known as "profiles"
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=2ecc0a32597e069cbd7835762f512937fd4ba666e031e6d63003a80942b1d333
    # secret for signature
    secret: SEC02cbb7113c24fa87aeaac20687d843345883af1b5138e70e1c19d2b6a54b4588
    message:
      title: '{{ template "ops.title" . }}'   # apply the template title to this webhook (ops.title is the title defined in the template file shown below)
      text: '{{ template "ops.content" . }}'  # apply the template content to this webhook (ops.content is the content defined in the template file shown below)
Define the template file
vi /opt/prometheus-webhook-dingtalk/ding.tmpl
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}
{{ define "__alert_list" }}{{ range . }}
---
**Alert**: {{ .Labels.alertname }}
**Severity**: {{ .Labels.severity }}
**Host**: {{ .Labels.instance }}
**Details**: {{ .Annotations.description }}
**Started at**: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
{{ end }}{{ end }}
{{ define "__resolved_list" }}{{ range . }}
---
**Alert**: {{ .Labels.alertname }}
**Severity**: {{ .Labels.severity }}
**Host**: {{ .Labels.instance }}
**Started at**: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
**Resolved at**: {{ (.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
{{ end }}{{ end }}
{{ define "ops.title" }}
{{ template "__subject" . }}
{{ end }}
{{ define "ops.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**==== {{ .Alerts.Firing | len }} firing alert(s) detected ====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
**==== {{ .Alerts.Resolved | len }} alert(s) resolved ====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
{{ define "ops.link.title" }}{{ template "ops.title" . }}{{ end }}
{{ define "ops.link.content" }}{{ template "ops.content" . }}{{ end }}
{{ template "ops.title" . }}
{{ template "ops.content" . }}
Start the service
systemctl daemon-reload
systemctl enable prometheus-webhook.service
systemctl start prometheus-webhook.service
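Before wiring it into Alertmanager, you can sanity-check the plugin by posting a minimal Alertmanager-style payload by hand (a sketch: the target name webhook1 comes from config.yml above, while the alert content here is made up for the test):
curl -s -H 'Content-Type: application/json' \
  -d '{"status":"firing","alerts":[{"status":"firing","labels":{"alertname":"TestAlert","severity":"Critical","instance":"test-host"},"annotations":{"description":"manual test"},"startsAt":"2024-08-09T00:00:00Z"}]}' \
  http://127.0.0.1:8060/dingtalk/webhook1/send
# If everything is wired up correctly, the DingTalk group should receive a test message.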
3. Create an Endpoints object
Since prometheus-webhook-dingtalk runs outside Kubernetes, the easiest way for Pods inside the cluster to reach it is to create an Endpoints object with a matching Service:
vi prometheus-webhook-dingtalk.yaml   # content as follows
apiVersion: v1
kind: Endpoints
metadata:
  name: dingtalk
subsets:
- addresses:
  - ip: 192.168.222.101
  ports:
  - port: 8060
---
apiVersion: v1
kind: Service   ## Note: this Service does not need a selector; it only needs to have the same name as the Endpoints object
metadata:
  name: dingtalk
spec:
  ports:
  - port: 8060
Apply it
kubectl apply -f prometheus-webhook-dingtalk.yaml
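To verify the wiring, check that the Endpoints object carries the external address and that the Service resolves from inside the cluster (the test Pod below is a throwaway busybox example; even an HTTP error response proves DNS resolution and connectivity work):
kubectl get svc,endpoints dingtalk   # the Endpoints object should list 192.168.222.101:8060
kubectl run conn-test --rm -it --image=busybox --restart=Never -- \
  wget -S -O- http://dingtalk.default.svc.cluster.local:8060/dingtalk/webhook1/send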
4. Configure Alertmanager
vi alertmanager_config.yaml   # change it to the following
apiVersion: v1
data:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
    templates:
      - '/bitnami/alertmanager/data/template/ding.tmpl'
    receivers:
      - name: 'dingtalk_webhook'
        webhook_configs:
          - url: 'http://dingtalk.default.svc.cluster.local:8060/dingtalk/webhook1/send'
            send_resolved: true
    route:
      group_wait: 10s
      group_interval: 5m
      repeat_interval: 3h
      receiver: 'dingtalk_webhook'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus
  labels:
    app.kubernetes.io/component: alertmanager
    app.kubernetes.io/instance: prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: prometheus
    helm.sh/chart: prometheus-0.1.3
  name: prometheus-alertmanager
Since Alertmanager's data directory is mounted from NFS, /bitnami/alertmanager/data/ maps to a directory on the NFS server.
Run the following on the NFS server:
cd /data/nfs2/default-data-bitnami-prometheus-alertmanager-0-pvc-47ca7949-a84b-4d72-bdff-f9380f4f2fa1/template
Copy the template file /opt/prometheus-webhook-dingtalk/ding.tmpl from the machine running prometheus-webhook-dingtalk into this directory.
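One way to do the copy and have Alertmanager pick it up (a sketch, assuming the NFS server can reach the webhook host over SSH; the Pod name matches the one shown later in this lesson):
scp root@192.168.222.101:/opt/prometheus-webhook-dingtalk/ding.tmpl .
# then, on the master:
kubectl apply -f alertmanager_config.yaml      # load the updated ConfigMap
kubectl delete pod prometheus-alertmanager-0   # recreate the Pod so it reads the new config and template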
II. AlertManager Alert Grouping and Inhibition
1. Alert grouping
To avoid being flooded with notifications, alert rules of the same kind are put into one group. For example, everything hardware-related (load, CPU usage, memory usage, disk, and so on) can be grouped as "hardware". When such alerts fire, those arriving within one group_wait window are aggregated into a single notification, e.g. a single WeChat message or a single email.
Case 1: the simplest grouping, by alertname
Key configuration:
1) alertmanager_config.yml
group_by: ['alertname']
Here alertname refers to the value of the alert field defined in the alerting rules.
2) Prometheus alerting rules file
vi prometheus_config.yaml   # locate rules.yaml and edit the relevant part
groups:
- name: hardware
  rules:
  - alert: hardware
    expr: node_load1 > 4
    for: 1m
    labels:
      severity: Critical
    annotations:
      summary: "{{ $labels.instance }} load is high: {{ $value }}"
      description: "1-minute load average on the host is above 4"
      value: "{{ $value }}"
  - alert: hardware
    expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
    for: 1m
    labels:
      severity: Critical
    annotations:
      summary: "{{ $labels.instance }} CPU usage is too high: {{ $value }}"
      description: "{{ $labels.instance }} CPU usage is above 80%"
      value: "{{ $value }}%"
  - alert: hardware
    expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 85
    for: 1m
    labels:
      severity: Critical
    annotations:
      summary: "{{ $labels.instance }} memory usage is too high: {{ $value }}%"
      description: "{{ $labels.instance }} memory usage is above 85%"
      value: "{{ $value }}%"
Case 2: grouping by alertname plus an extra label
Key configuration:
1) alertmanager_config.yml
group_by: ['alertname','team']
Here alertname refers to the value of the alert field defined in the alerting rules.
2) Prometheus alerting rules file
vi prometheus_config.yaml   # locate rules.yaml and edit the relevant part
groups:
- name: hardware
  rules:
  - alert: hardware
    expr: node_load1 > 4
    for: 1m
    labels:
      severity: Critical
      team: cpu
    annotations:
      summary: "{{ $labels.instance }} load is high: {{ $value }}"
      description: "1-minute load average on the host is above 4"
      value: "{{ $value }}"
  - alert: hardware
    expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
    for: 1m
    labels:
      severity: Critical
      team: cpu
    annotations:
      summary: "{{ $labels.instance }} CPU usage is too high: {{ $value }}"
      description: "{{ $labels.instance }} CPU usage is above 80%"
      value: "{{ $value }}%"
  - alert: hardware
    expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 85
    for: 1m
    labels:
      severity: Critical
      team: mem
    annotations:
      summary: "{{ $labels.instance }} memory usage is too high: {{ $value }}%"
      description: "{{ $labels.instance }} memory usage is above 85%"
      value: "{{ $value }}%"
2. Inhibition
Inhibition suppresses the notifications of some alerts while another alert is firing, which effectively prevents alert storms.
1) Example 1:
Add the following to alertmanager_config.yaml:
inhibit_rules:
- source_match:        ## source matcher: when an alert with alertname=NodeDown and severity=Critical is being sent, notifications matched by target_match are inhibited
    alertname: NodeDown
    severity: Critical
  target_match:        ## target matcher: inhibit alerts whose severity is Critical
    severity: Critical
  equal:
  - node               ## the source alert and the inhibited alerts must have the same value for the node label
Explanation: when a node in the cluster goes down, the NodeDown alert fires, and the alerting rule defines its severity as Critical. Because the host is down, every service and middleware deployed on it becomes unavailable and triggers its own alerts. According to this inhibition rule, any new alert with severity=Critical whose node label has the same value as the NodeDown alert is considered a consequence of NodeDown, so the inhibition kicks in and those notifications are no longer sent to the receiver.
2) Example 2:
Add the following to alertmanager_config.yaml:
inhibit_rules:
- source_match:
    alertname: NodeMemoryUsage
    severity: Critical
  target_match:
    severity: Normal
  equal:
  - instance
Explanation: when an alert named NodeMemoryUsage with severity Critical fires, alerts with severity Normal whose instance label matches that of NodeMemoryUsage are inhibited and no notifications are sent for them.
3) Hands-on
Alerting rules:
vi prometheus_config.yaml   # locate rules.yaml and edit the relevant part
groups:
- name: hardware
  rules:
  - alert: systemLoad
    expr: node_load1 > 4
    for: 1m
    labels:
      severity: Critical
      team: cpu
    annotations:
      summary: "{{ $labels.instance }} load is high: {{ $value }}"
      description: "1-minute load average on the host is above 4"
      value: "{{ $value }}"
  - alert: cpuUsage
    expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
    for: 1m
    labels:
      severity: Critical
      team: cpu
    annotations:
      summary: "{{ $labels.instance }} CPU usage is too high: {{ $value }}"
      description: "{{ $labels.instance }} CPU usage is above 80%"
      value: "{{ $value }}%"
  - alert: MemUsage
    expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 85
    for: 1m
    labels:
      severity: Critical
      team: mem
    annotations:
      summary: "{{ $labels.instance }} memory usage is too high: {{ $value }}%"
      description: "{{ $labels.instance }} memory usage is above 85%"
      value: "{{ $value }}%"
Inhibition rule
alertmanager_config.yml
inhibit_rules:
- source_match:
    alertname: MemUsage
    severity: Critical
  target_match:
    team: cpu
  equal:
  - instance
Simulate an alert
Edit prometheus_config.yaml and change > to < in the rule expressions so that they always fire.
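After flipping the operators, apply the configs and watch the effect of the inhibit rule: once MemUsage fires for an instance, the cpuUsage/systemLoad notifications for that instance should stop. A rough check, reusing the port-forward from the earlier sketch (depending on your setup, Prometheus and Alertmanager may also need a reload or restart to pick up the ConfigMap changes):
kubectl apply -f prometheus_config.yaml      # reload the alerting rules
kubectl apply -f alertmanager_config.yaml    # reload the inhibit_rules
# wait for the `for: 1m` delay plus the evaluation interval, then:
curl -s http://127.0.0.1:9093/api/v2/alerts | python3 -m json.tool | grep -B2 -A2 inhibitedBy
# inhibited alerts carry a non-empty "inhibitedBy" list in their status and are not sent to DingTalk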
III. Viewing Kubernetes Cluster Resource Usage and Logs
1. View resource usage
1) kubectl top: CPU and memory usage of Nodes
kubectl top node         # all nodes
kubectl top node k8s01   # a specific node
[root@aminglinux01 ~]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
aminglinux01 156m 3% 1843Mi 52%
aminglinux02 3954m 98% 2143Mi 60%
aminglinux03 3951m 98% 1806Mi 51%
[root@aminglinux01 ~]# kubectl top node aminglinux02
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
aminglinux02 3959m 98% 2143Mi 60%
[root@aminglinux01 ~]#
2) kubectl top: CPU and memory usage of Pods
kubectl top pod                               # all Pods
kubectl top pod php-apache-64b6b9d449-t9h4z   # a specific Pod
[root@aminglinux01 ~]# kubectl top pod
NAME CPU(cores) MEMORY(bytes)
grafana-784469b9b9-4htvz 1m 117Mi
kube-state-metrics-75778cdfff-2vdkl 1m 32Mi
lucky-6cdcf8b9d4-t5r66 1m 74Mi
myharbor-core-b9d48ccdd-v9jdz 1m 66Mi
myharbor-jobservice-6f5dbfcc4f-q852z 1m 28Mi
myharbor-nginx-65b8c5764d-vz4vn 1m 10Mi
myharbor-portal-ff7fd4949-lj6jw 1m 6Mi
myharbor-postgresql-0 7m 66Mi
myharbor-redis-master-0 16m 20Mi
myharbor-registry-5b59458d9-4j79b 1m 29Mi
myharbor-trivy-0 1m 11Mi
nginx 0m 18Mi
node-exporter-9cn2c 1m 17Mi
node-exporter-h4ntw 1m 18Mi
node-exporter-wvp2h 1m 17Mi
pod-demo 1m 72Mi
pod-demo1 0m 4Mi
prometheus-alertmanager-0 0m 0Mi
prometheus-consul-0 20m 56Mi
prometheus-consul-1 27m 51Mi
prometheus-consul-2 27m 53Mi
prometheus-nginx-exporter-bbf5d8b8b-s8hvl 1m 13Mi
prometheus-server-755b857b5-fbw6m 0m 0Mi
redis-sts-0 1m 7Mi
redis-sts-1 1m 8Mi
[root@aminglinux01 ~]# kubectl top pod nginx
NAME CPU(cores) MEMORY(bytes)
nginx 0m 18Mi
[root@aminglinux01 ~]#
Note: kubectl top requires metrics-server to be installed first; see the HPA lesson for the installation steps.
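A few extra kubectl top options that are handy in practice:
kubectl top pod -A                    # Pods in all namespaces
kubectl top pod --sort-by=memory      # sort by memory (or --sort-by=cpu)
kubectl top pod nginx --containers    # per-container breakdown inside a Pod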
2. View logs
1) Kubernetes-related logs
Logs recorded in the Linux system journal:
journalctl -u kubelet
[root@aminglinux01 ~]# journalctl -u kubelet
-- Logs begin at Tue 2024-07-30 21:49:20 CST, end at Fri 2024-08-09 00:46:26 CST. --
Jul 30 21:49:23 aminglinux01 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jul 30 21:49:23 aminglinux01 kubelet[896]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collecto>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.545132 896 server.go:198] "--pod-infra-container-image will not be pruned by the i>
Jul 30 21:49:23 aminglinux01 kubelet[896]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collecto>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.556332 896 server.go:412] "Kubelet version" kubeletVersion="v1.26.2"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.556354 896 server.go:414] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.556515 896 server.go:836] "Client rotation is on, will bootstrap in background"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.559518 896 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.570107 896 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bu>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.575271 896 server.go:659] "--cgroups-per-qos enabled, but --cgroup-root was not sp>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.576724 896 container_manager_linux.go:267] "Container manager verified user specif>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.576943 896 container_manager_linux.go:272] "Creating Container Manager object base>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.577418 896 topology_manager.go:134] "Creating topology manager with policy per sco>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.577445 896 container_manager_linux.go:308] "Creating device plugin manager"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.577965 896 state_mem.go:36] "Initialized new in-memory state store"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.598045 896 kubelet.go:398] "Attempting to sync node with API server"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.598103 896 kubelet.go:286] "Adding static pod path" path="/etc/kubernetes/manifest>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.598188 896 kubelet.go:297] "Adding apiserver pod source"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.598270 896 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Jul 30 21:49:23 aminglinux01 kubelet[896]: W0730 21:49:23.604343 896 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: fai>
Jul 30 21:49:23 aminglinux01 kubelet[896]: E0730 21:49:23.604532 896 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Fai>
Jul 30 21:49:23 aminglinux01 kubelet[896]: W0730 21:49:23.604342 896 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: fai>
Jul 30 21:49:23 aminglinux01 kubelet[896]: E0730 21:49:23.604871 896 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Fai>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.607913 896 kuberuntime_manager.go:244] "Container runtime initialized" containerRu>
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.612371 896 server.go:1186] "Started kubelet"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.614374 896 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Jul 30 21:49:23 aminglinux01 kubelet[896]: I0730 21:49:23.614542 896 server.go:161] "Starting to listen" address="0.0.0.0" port=10250
Logs of the individual K8s components
First, look up the Pod names:
kubectl get po -n kube-system   # calico-kube-controllers-xxx, calico-node-xxx, coredns-xxx, etcd-xxx, kube-apiserver-xxx, kube-controller-manager-xxx, kube-proxy-xxx, kube-scheduler-xxx, metrics-server-xxx
[root@aminglinux01 ~]# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-57b57c56f-h2znw 1/1 Running 6 (9d ago) 31d
calico-node-4wnj6 1/1 Running 0 9d
calico-node-9ltgn 1/1 Running 0 9d
calico-node-m9rq9 1/1 Running 0 9d
coredns-567c556887-pqv8h 1/1 Running 10 (9d ago) 34d
coredns-567c556887-vgsth 1/1 Running 10 (9d ago) 34d
etcd-aminglinux01 1/1 Running 10 (9d ago) 34d
kube-apiserver-aminglinux01 1/1 Running 10 (9d ago) 34d
kube-controller-manager-aminglinux01 1/1 Running 14 (8d ago) 34d
kube-proxy-24g6p 1/1 Running 1 (9d ago) 9d
kube-proxy-bklnw 1/1 Running 1 (9d ago) 9d
kube-proxy-h8vx5 1/1 Running 1 (9d ago) 9d
kube-scheduler-aminglinux01 1/1 Running 13 (8d ago) 34d
metrics-server-76467d945-9jtxx 1/1 Running 0 3d7h
metrics-server-76467d945-vnjxl 1/1 Running 0 3d7h
nfs-client-provisioner-d79cfd7f6-q2n4z 1/1 Running 3 (8d ago) 30d
[root@aminglinux01 ~]#
View the logs of a specific Pod:
kubectl logs -n kube-system calico-kube-controllers-798cc86c47-44525
kubectl logs -n kube-system kube-scheduler-k8s01
[root@aminglinux01 ~]# kubectl logs -n kube-system calico-kube-controllers-57b57c56f-h2znw
2024-07-30 13:49:34.846 [INFO][1] main.go 107: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0730 13:49:34.847864 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2024-07-30 13:49:34.848 [INFO][1] main.go 131: Ensuring Calico datastore is initialized
2024-07-30 13:49:34.854 [INFO][1] main.go 157: Calico datastore is initialized
2024-07-30 13:49:34.855 [INFO][1] main.go 194: Getting initial config snapshot from datastore
2024-07-30 13:49:34.898 [INFO][1] main.go 197: Got initial config snapshot
2024-07-30 13:49:34.898 [INFO][1] watchersyncer.go 89: Start called
2024-07-30 13:49:34.898 [INFO][1] main.go 211: Starting status report routine
2024-07-30 13:49:34.898 [INFO][1] main.go 220: Starting Prometheus metrics server on port 9094
2024-07-30 13:49:34.898 [INFO][1] main.go 503: Starting informer informer=&cache.sharedIndexInformer{indexer:(*cache.cache)(0xc00000e4e0), controller:cache.Controller(nil), processor:(*cache.sharedProcessor)(0xc0004202a0), cacheMutationDetector:cache.dummyMutationDetector{}, listerWatcher:(*cache.ListWatch)(0xc00000e4c8), objectType:(*v1.Pod)(0xc0002a8800), resyncCheckPeriod:0, defaultEventHandlerResyncPeriod:0, clock:(*clock.RealClock)(0x3029630), started:false, stopped:false, startedLock:sync.Mutex{state:0, sema:0x0}, blockDeltas:sync.Mutex{state:0, sema:0x0}, watchErrorHandler:(cache.WatchErrorHandler)(nil), transform:(cache.TransformFunc)(nil)}
2024-07-30 13:49:34.898 [INFO][1] main.go 503: Starting informer informer=&cache.sharedIndexInformer{indexer:(*cache.cache)(0xc00000e570), controller:cache.Controller(nil), processor:(*cache.sharedProcessor)(0xc000420310), cacheMutationDetector:cache.dummyMutationDetector{}, listerWatcher:(*cache.ListWatch)(0xc00000e558), objectType:(*v1.Node)(0xc0001f6300), resyncCheckPeriod:0, defaultEventHandlerResyncPeriod:0, clock:(*clock.RealClock)(0x3029630), started:false, stopped:false, startedLock:sync.Mutex{state:0, sema:0x0}, blockDeltas:sync.Mutex{state:0, sema:0x0}, watchErrorHandler:(cache.WatchErrorHandler)(nil), transform:(cache.TransformFunc)(nil)}
[root@aminglinux01 ~]# kubectl logs -n kube-system kube-controller-manager-aminglinux01 | more
I0808 00:06:26.230450 1 event.go:294] "Event occurred" object="default/storage-prometheus-alertmanager-0" fieldPath="" kind="PersistentVolu
meClaim" apiVersion="v1" type="Normal" reason="FailedBinding" message="no persistent volumes available for this claim and no storage class is set
"
I0808 00:06:41.231560 1 event.go:294] "Event occurred" object="default/storage-prometheus-alertmanager-0" fieldPath="" kind="PersistentVolu
meClaim" apiVersion="v1" type="Normal" reason="FailedBinding" message="no persistent volumes available for this claim and no storage class is set
"
I0808 00:06:56.231777 1 event.go:294] "Event occurred" object="default/storage-prometheus-alertmanager-0" fieldPath="" kind="PersistentVolu
meClaim" apiVersion="v1" type="Normal" reason="FailedBinding" message="no persistent volumes available for this claim and no storage class is set
"
I0808 00:07:11.233084 1 event.go:294] "Event occurred" object="default/storage-prometheus-alertmanager-0" fieldPath="" kind="PersistentVolu
meClaim" apiVersion="v1" type="Normal" reason="FailedBinding" message="no persistent volumes available for this claim and no storage class is set
"
I0808 00:07:26.234166 1 event.go:294] "Event occurred" object="default/storage-prometheus-alertmanager-0" fieldPath="" kind="PersistentVolu
meClaim" apiVersion="v1" type="Normal" reason="FailedBinding" message="no persistent volumes available for this claim and no storage class is set
"
I0808 00:07:41.234955 1 event.go:294] "Event occurred" object="default/storage-prometheus-alertmanager-0" fieldPath="" kind="PersistentVolu
meClaim" apiVersion="v1" type="Normal" reason="FailedBinding" message="no persistent volumes available for this claim and no storage class is set
"
You can also add the -f option to follow a Pod's logs, similar to tail -f.
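Other frequently useful kubectl logs options (the label selector below is just an example):
kubectl logs --tail=100 nginx                 # only the last 100 lines
kubectl logs --since=1h nginx                 # only the last hour
kubectl logs --previous nginx                 # the previous (restarted) container instance
kubectl logs -l app=lucky --all-containers    # by label selector, all containers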
2) Application logs
Same as for the K8s component logs above; just use the name of the Pod you care about:
kubectl logs php-apache-64b6b9d449-t9h4z
[root@aminglinux01 ~]# kubectl logs nginx | more
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/08/02 20:26:48 [notice] 1#1: using the "epoll" event method
2024/08/02 20:26:48 [notice] 1#1: nginx/1.27.0
2024/08/02 20:26:48 [notice] 1#1: built by gcc 12.2.0 (Debian 12.2.0-14)
2024/08/02 20:26:48 [notice] 1#1: OS: Linux 4.18.0-553.el8_10.x86_64
2024/08/02 20:26:48 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/08/02 20:26:48 [notice] 1#1: start worker processes
2024/08/02 20:26:48 [notice] 1#1: start worker process 29
2024/08/02 20:26:48 [notice] 1#1: start worker process 30
2024/08/02 20:26:48 [notice] 1#1: start worker process 31
2024/08/02 20:26:48 [notice] 1#1: start worker process 32
2024/08/02 20:31:33 [notice] 1#1: signal 1 (SIGHUP) received from 273, reconfiguring
You can also exec into the Pod and inspect the application logs from inside:
kubectl exec -it pod-name -n namespace-name -- bash   ## once inside, look at the specific log files
[root@aminglinux01 ~]# kubectl exec -it nginx -n default -- bash
root@nginx:/#
Sometimes applications also map their log directories to the Node or to shared storage, which makes the logs much easier to reach.
IV. Maintaining Kubernetes Cluster CA Certificates
1. CA certificates in a Kubernetes cluster
If the cluster was deployed with kubeadm, the CA certificates are generated automatically; with a binary deployment they have to be generated manually.
Where do the CA certificates live on the server?
[root@aminglinux01 ~]# tree /etc/kubernetes/pki/
/etc/kubernetes/pki/
├── apiserver.crt
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
├── apiserver.key
├── apiserver-kubelet-client.crt
├── apiserver-kubelet-client.key
├── ca.crt
├── ca.key
├── etcd
│ ├── ca.crt
│ ├── ca.key
│ ├── healthcheck-client.crt
│ ├── healthcheck-client.key
│ ├── peer.crt
│ ├── peer.key
│ ├── server.crt
│ └── server.key
├── front-proxy-ca.crt
├── front-proxy-ca.key
├── front-proxy-client.crt
├── front-proxy-client.key
├── sa.key
└── sa.pub

1 directory, 22 files
[root@aminglinux01 ~]#
For security, Kubernetes uses mutual TLS: the client verifies the server's certificate, and the server also verifies the client's identity through the client certificate.
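You can see which CA issued a certificate and what identity it carries with openssl, for example:
openssl x509 -noout -subject -issuer -dates -in /etc/kubernetes/pki/apiserver.crt
# subject = the component's identity (CN), issuer = the CA that signed it (kubernetes / etcd-ca / front-proxy-ca)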
1.1 CA certificates
A kubeadm-installed cluster uses three CAs to manage and sign the other certificates: one CA for etcd, one for the internal Kubernetes components, and one for the aggregation layer. If managing three CAs feels cumbersome, a single CA can be used for everything.
1) Etcd certificates
The etcd certificates are under /etc/kubernetes/pki/etcd. You can check the etcd process and its flags with ps:
[root@aminglinux01 ~]# ps aux| grep etcd | grep -v 'kube-apiserver'
root 1875 1.9 3.0 11284548 115176 ? Ssl Jul30 258:11 etcd --advertise-client-urls=https://192.168.100.151:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --experimental-initial-corrupt-check=true --experimental-watch-progress-notify-interval=5s --initial-advertise-peer-urls=https://192.168.100.151:2380 --initial-cluster=aminglinux01=https://192.168.100.151:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://192.168.100.151:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://192.168.100.151:2380 --name=aminglinux01 --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
root 1759389 0.0 0.0 222012 1128 pts/1 S+ 00:59 0:00 grep --color=auto etcd
[root@aminglinux01 ~]#
Certificates and what they are for:
├── etcd
│   ├── ca.crt                  ## CA certificate used for mutual authentication between etcd cluster members
│   ├── ca.key                  ## its private key
│   ├── healthcheck-client.crt  ## client certificate used when etcd is accessed as a client, e.g. for health checks
│   ├── healthcheck-client.key  ## its private key
│   ├── peer.crt                ## peer certificate for mutual authentication between etcd members (public part)
│   ├── peer.key                ## its private key (private part)
│   ├── server.crt              ## server certificate etcd presents to clients such as the apiserver (public part)
│   └── server.key              ## its private key (private part)
2) kube-apiserver certificates
The apiserver certificates are in /etc/kubernetes/pki; check the process with ps:
ps aux |grep apiserver
[root@aminglinux01 ~]# ps aux |grep apiserver
root 1873 3.7 13.0 1335780 484700 ? Ssl Jul30 490:00 kube-apiserver --advertise-address=192.168.100.151 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.15.0.0/16 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
Certificates and what they are for:
tree /etc/kubernetes/pki/
/etc/kubernetes/pki/
├── apiserver.crt                 ## server certificate used by the apiserver
├── apiserver.key                 ## its private key
├── apiserver-etcd-client.crt     ## client certificate the apiserver uses when accessing etcd
├── apiserver-etcd-client.key     ## its private key
├── apiserver-kubelet-client.crt  ## client certificate the apiserver uses when accessing kubelets
├── apiserver-kubelet-client.key  ## its private key
├── ca.crt                        ## root CA used to sign the other certificates in the cluster
├── ca.key                        ## its private key
├── front-proxy-ca.crt            ## CA certificate for the aggregation layer (apiserver extensions)
├── front-proxy-ca.key            ## its private key
├── front-proxy-client.crt        ## client certificate for the aggregation layer (apiserver extensions)
├── front-proxy-client.key        ## its private key
├── sa.key                        ## private key used to sign service account tokens
└── sa.pub                        ## public key used to verify service account tokens
3) Certificates used by kube-controller-manager
View the process:
ps aux |grep controller
[root@aminglinux01 ~]# ps aux |grep controller
systemd+ 3182 0.0 1.3 1273816 49200 ? Ssl Jul30 3:28 /usr/bin/kube-controllers
root 196750 1.2 2.4 834392 91748 ? Ssl Jul31 162:00 kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/kubernetes/pki/ca.crt --cluster-cidr=10.18.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --leader-elect=true --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.15.0.0/16 --use-service-account-credentials=true
Notes:
The CA files visible in the process arguments are:
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/ca.key
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/sa.key
These are really apiserver-related certificates; the credentials kube-controller-manager itself uses are in the kubeconfig file /etc/kubernetes/controller-manager.conf:
[root@aminglinux01 ~]# cat /etc/kubernetes/controller-manager.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJME1EY3dOREU0TkRnME5sb1hEVE0wTURjd01qRTRORGcwTmxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTUhnCjBFOHRLVlcvdXJ4QW5vc2VNZ2xKeGlQOGNsMnR6RjdQaTZjVldwdWdkUSt2OWh3Tk9tRWhvRTY0NXJMNG1LaFcKd09lOGF5RjFsUWFPN0pZcE5qQ3FVeURZbjhncW82M3JIbU5NNDh5OFpuQkYwNmVhandSbFZLL0toMCtzVk9UdgpsWGtYclljSkxGMmdwc2YwNGxxQUkzN1BNd1RPZVlUTC9lTHMva0dqWGZ0L2EzenJ0R0xJY09lZmkzWHRhSU84CmFhNnpKMnd4eEwrb0VtdXhyZHE3Qi91SEVMS2U0NGdvNUJmZ3FudEFLeTBtK1pVTlB5VE9OSEtIZnAzN3pEdXoKOWFNeVQwWUJxSFFxRjRZMzlOUFRrRnE4L0hVOGFxcFl0SGFtMWQ0czdTTDVhTDVHUVcvbmlEYS9nMGQrM3dseQpRaUlDbGpWN2k3eUNDaTM2c3ZrQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZIYnFiaXZWd2xPYng5eFFka2JqRXl3citlcitNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQk9xZ0VSU3AwMmxYYVRia1U3VgpITHp4SWdyUXBXdm1oMWJFMGRpTTdhaDgzMW9YRm9PclBqWkZDdnU5eW5KVUxQVkQwa0ZoaCtBanlIT09JR0FECjE2MlJWZFRmaXpZOXZjcWZsdGg3ZHVGcCtSYnc1UFU2QU5Jc0lOWEVnQzdPNGxuRTQrRWY3bmlBMkt5SGlpUk4KSzhjR3lPc3crVmlRY2JPMVdwRmlKdlkxNmhmTkdueHBvdjhIRjNFRFYrS3VqMm9vc2hyTDR2bGVFM3BxRlYxUQp1enlnZFBVdTdWd00zZ3Q1czNTbmVrTkJ5NFA5T1M3cGo0d0RobkdJb3dLT01maXR6RUdpSEcxNm5BbytwZUdCCnA5T3ZHVWJvK1BJc1NzL1JFeUcyU2grTUFCaHo0dmhDSEJSY2hUdXZjeXJIdHVxdVF3U1VQeDZhWU96Q2FXalMKeVJ3PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://192.168.100.151:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: system:kube-controller-manager
name: system:kube-controller-manager@kubernetes
current-context: system:kube-controller-manager@kubernetes
kind: Config
preferences: {}
users:
- name: system:kube-controller-manager
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURGakNDQWY2Z0F3SUJBZ0lJVlA0OVpxSHNyQWd3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TkRBM01EUXhPRFE0TkRaYUZ3MHlOVEEzTURReE9EUTRORGxhTUNreApKekFsQmdOVkJBTVRIbk41YzNSbGJUcHJkV0psTFdOdmJuUnliMnhzWlhJdGJXRnVZV2RsY2pDQ0FTSXdEUVlKCktvWklodmNOQVFFQkJRQURnZ0VQQURDQ0FRb0NnZ0VCQUxIZWtrMWxTUlhWaDBsdG1mZkR3RHhwK0JtdHBQdlkKbHBObDFBUktldWg5TmxxY2xSVGxFNXBObXRwRXhIa1lrRGRFYnJYUjZFREFrcldtSkxKdjgwNGZLb1dSeHpmeQpacTc3K25yTTFHTEI5U0xjVWdTQjBrYzRWRWNRNjhsMVBBQVlTcGltYXJMNmM0d3ZuYnV1UzNzSTRyUzlxTERmClNWNFhYQTVtd2F6QnpZSGx0M2lOVWY1bGE1OW1rdW82ajk5Y3ppNzhQb0h3VUFDTTFITW1FYlRteldzSW1OSG0KQWpkWmlOSUxPNVE5N2dWWUNEVHdGeGRIbFRaWVpKT0dId1d2bDQ2UGRaNnh5d3h6cDlqL2VhYU03T1ducmZ1egpmUStGVFlVVHZOWlBxdGNpUG9xbjZsTDZiSlNJSitYRUJGYkxKRlloQVJ0RS9hZ1drK3ZIRkpFQ0F3RUFBYU5XCk1GUXdEZ1lEVlIwUEFRSC9CQVFEQWdXZ01CTUdBMVVkSlFRTU1Bb0dDQ3NHQVFVRkJ3TUNNQXdHQTFVZEV3RUIKL3dRQ01BQXdId1lEVlIwakJCZ3dGb0FVZHVwdUs5WENVNXZIM0ZCMlJ1TVRMQ3Y1NnY0d0RRWUpLb1pJaHZjTgpBUUVMQlFBRGdnRUJBR0dyeGlIWmRvQXF6SkZkZ3NVT010OEgvOW1HTWV2N2JtSHhzbzRYSC9Qc2Z5V3kxNXcvCk5aRjByQ2IrVDJ1YjZ1S1RXc1RFRFJlVGhtaFBWemRpTGRUeHY5Qkd0MGxKRm9CcUljMC9pODNJSEgwU3RhWDcKVUVDTldjZUc1b0hEZGZNbndLSFBRWGM0WUw3TnErSmFKNDcxVitVQ0dKSjJ0Y29RUkdqeTlQV2prWGdWQjBUcwpPQ0xOaGhZdVVvb1E5U0RUeHlHanBSK2NyTm50cEI2WlhnaDZoay93Mm5oYUZIODNRdCttNlJnc3BLcTVHeVZICmczc1BjQU8zNVNIVkZja3ZGRmNxYUFyRDdPZVNkZ2NyQ2tBemxLQithTjhSbjJaMUE1RHNvTThicEFKaTI2NGUKR1FsMVhlb1Vxak1NZTZRVWFDVE83cEdLQVFWUTc4OEdlUnc9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBc2Q2U1RXVkpGZFdIU1cyWjk4UEFQR240R2Eyays5aVdrMlhVQkVwNjZIMDJXcHlWCkZPVVRtazJhMmtURWVSaVFOMFJ1dGRIb1FNQ1N0YVlrc20velRoOHFoWkhITi9KbXJ2djZlc3pVWXNIMUl0eFMKQklIU1J6aFVSeERyeVhVOEFCaEttS1pxc3ZwempDK2R1NjVMZXdqaXRMMm9zTjlKWGhkY0RtYkJyTUhOZ2VXMwplSTFSL21Wcm4yYVM2anFQMzF6T0x2dytnZkJRQUl6VWN5WVJ0T2JOYXdpWTBlWUNOMW1JMGdzN2xEM3VCVmdJCk5QQVhGMGVWTmxoa2s0WWZCYStYam85MW5ySExESE9uMlA5NXBvenM1YWV0KzdOOUQ0Vk5oUk84MWsrcTF5SSsKaXFmcVV2cHNsSWduNWNRRVZzc2tWaUVCRzBUOXFCYVQ2OGNVa1FJREFRQUJBb0lCQVFDVEgwWmtQaUwxckdqNgprMjJIUXFML1ZhZWhsYitoa01UN3BuNTREaU1icW5ZSy9QbFREeWZudWNrY1FVVkI1TTlrNTNXcmJyUnMydHgrCjQ2MzI2aUtWdTdHd1NhUSs0b0dNdTErenN6ajVkdlVNb0xBMmlpc2tQYk40Rk1iekc4VkZUdEprOFVIUVNOaksKVzVoY1pRNko5ZytPOEZGWCsxajBPdDRxQzFTblQwVDFvdUNBWks2aUlZT3VMSGpyaXlaMWozcVRsMVk5VWI3Mgo3dEpscXNXdVdGczk3UnhndG5RMW13TnZtckV2ajUvak9XRDJ4cUJMWEZRRnErTEd4NkR4QzEvS0xmQ0NhT2JVCjNPRlNZUUU0ZXFTTnlzVkNOZzYrRkg1aGloTHExSEloSkZQUzMzdWFXcjcycGVaOGVzdVFQNCsrM2FDN1FmdE0KeUtvZ21pTUJBb0dCQU9DTEJKNVdBbHp4SzFPWnAxNnlPMmYrUGdYd1A0UnJabHZMVVJMODU4Y0VLeXFMZGlCaApMcnR5MFpKZm5xRWgwRHBtaTIvMXJkeHRWeElDSU9WNWhNVFB6aTNxaUhJbHhZOWlLN3kxazk5MFpqRXRxS2VRClM5ckdoK3M4a3U5dFd4YXJJKzBWdXRadHlNaTlCUUZSdUN6dTQ4YWlZMTk5K2dWY25sZHQ5bDFEQW9HQkFNckoKcUN3Q2V3UzhjNlRMTDhHT0hlWktDOURZTVRIN0plM1I0YVNtNnZoY2tVYmkyMzAxL2VjTS8yR3hjMlpwcU41TQplRC9Ed2FxMkM1RHlWQVRCak0wWTlVZjliVEprMnM5NC8yMHVsNk1ra2VSN1V2VnV3R25TV0JMWEh6dDZtaEpCCnBoMkRNUU9hYnkyVkpaNjFjQ1h2M0ZXTWlYOEtpbnhPSzgvS2tKK2JBb0dCQUtRVDBZY2wwRW84REczbFhKRHMKNmszK1VUSWpzVFpCQ0tYUTl1aEtGOCtzY3lKK2tBM3ZGYWZ4cWNRc3pReHZXZW9pM29jc1hpUXhYYXVTRkptNgpaMU10aWpxeEk0MU5ub1E4dHpzSThBb2IwMFRpV1ZoQUw3Mm96czhORDAyWGVqVWhUM3BDSTZubXhRNHlXUUx5CnhRTkllUGEvMkorQnZYM0hoUWpjR0dkakFvR0FCYlY1bzR1S1ZRN05IcVdOdWFBN25VRVdaaEhBQ00wdU95eSsKY25rMGdqdHc1NUw1Wk9RQk91RDF5NVZJVDJqSUZVSUgzSnV4TnhJYTcwQ3pOdE1RR0xJTUxiT253RlJ3aUlpNgpnQ05ncDNvZkZWU1hlRXRlNVZ2RG1Qd3ZaK2hDc0NMaS8wK3pNSXZIZDN3TWJCUmxqTnZjMHhlNnd6WFR3ajRkCkk2TnJRT01DZ1lBcnNNanRxMCtvbEpTeVBXekxkWUo3Q0Y5RDdOV0VnRnRwOGhFNllzVUZoSDFZZy9EVXA1YWsKaHBvQm1xUVo5VWErY0FjZFlkd0xDZkh2aVorTEtiMDJtY3NDVWdBNWFjcFpjSjlwcUNhL2dEU2taYXZ1Um9vSwpoUTh3QWRSenlWVWplLzZxbHpPdyt4bEo5eEVpWURZYm01VTNaZVZJVGZlNVZHejE0cytHZ3c9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
[root@aminglinux01 ~]#
The design in Kubernetes is that components such as kube-controller-manager, kube-scheduler, kube-proxy and the kubelet access kube-apiserver using the information in a kubeconfig file. That file contains the kube-apiserver address, the CA certificate used to verify the apiserver's serving certificate, and the component's own client certificate and private key.
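To see which identity (and expiry) a kubeconfig's embedded client certificate carries, you can decode it on the fly (a one-line sketch; grep/base64/openssl are assumed to be available on the node):
grep 'client-certificate-data' /etc/kubernetes/controller-manager.conf \
  | awk '{print $2}' | base64 -d \
  | openssl x509 -noout -subject -enddate
# the subject CN is system:kube-controller-manager, which is the username the apiserver's RBAC sees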
4) kube-scheduler
Like kube-controller-manager, kube-scheduler also uses a kubeconfig:
[root@aminglinux01 ~]# cat /etc/kubernetes/scheduler.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJME1EY3dOREU0TkRnME5sb1hEVE0wTURjd01qRTRORGcwTmxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTUhnCjBFOHRLVlcvdXJ4QW5vc2VNZ2xKeGlQOGNsMnR6RjdQaTZjVldwdWdkUSt2OWh3Tk9tRWhvRTY0NXJMNG1LaFcKd09lOGF5RjFsUWFPN0pZcE5qQ3FVeURZbjhncW82M3JIbU5NNDh5OFpuQkYwNmVhandSbFZLL0toMCtzVk9UdgpsWGtYclljSkxGMmdwc2YwNGxxQUkzN1BNd1RPZVlUTC9lTHMva0dqWGZ0L2EzenJ0R0xJY09lZmkzWHRhSU84CmFhNnpKMnd4eEwrb0VtdXhyZHE3Qi91SEVMS2U0NGdvNUJmZ3FudEFLeTBtK1pVTlB5VE9OSEtIZnAzN3pEdXoKOWFNeVQwWUJxSFFxRjRZMzlOUFRrRnE4L0hVOGFxcFl0SGFtMWQ0czdTTDVhTDVHUVcvbmlEYS9nMGQrM3dseQpRaUlDbGpWN2k3eUNDaTM2c3ZrQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZIYnFiaXZWd2xPYng5eFFka2JqRXl3citlcitNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQk9xZ0VSU3AwMmxYYVRia1U3VgpITHp4SWdyUXBXdm1oMWJFMGRpTTdhaDgzMW9YRm9PclBqWkZDdnU5eW5KVUxQVkQwa0ZoaCtBanlIT09JR0FECjE2MlJWZFRmaXpZOXZjcWZsdGg3ZHVGcCtSYnc1UFU2QU5Jc0lOWEVnQzdPNGxuRTQrRWY3bmlBMkt5SGlpUk4KSzhjR3lPc3crVmlRY2JPMVdwRmlKdlkxNmhmTkdueHBvdjhIRjNFRFYrS3VqMm9vc2hyTDR2bGVFM3BxRlYxUQp1enlnZFBVdTdWd00zZ3Q1czNTbmVrTkJ5NFA5T1M3cGo0d0RobkdJb3dLT01maXR6RUdpSEcxNm5BbytwZUdCCnA5T3ZHVWJvK1BJc1NzL1JFeUcyU2grTUFCaHo0dmhDSEJSY2hUdXZjeXJIdHVxdVF3U1VQeDZhWU96Q2FXalMKeVJ3PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://192.168.100.151:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: system:kube-scheduler
name: system:kube-scheduler@kubernetes
current-context: system:kube-scheduler@kubernetes
kind: Config
preferences: {}
users:
- name: system:kube-scheduler
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUREVENDQWZXZ0F3SUJBZ0lJSXluR0RrVUxaM3d3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TkRBM01EUXhPRFE0TkRaYUZ3MHlOVEEzTURReE9EUTRORGxhTUNBeApIakFjQmdOVkJBTVRGWE41YzNSbGJUcHJkV0psTFhOamFHVmtkV3hsY2pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCCkJRQURnZ0VQQURDQ0FRb0NnZ0VCQU9OamZ6VzhGYWR0bU4zMFZVb0tHWis5UllDOFpwTzZ0eFBWb3M4Y3RvODQKcFVnV1BENTUxYmdZOUFLSS9lRzVYZTR5ZnpObm1kc29ZTFhiY29qcWVwb2ZUMXVBOTlDR0NUYjV4OXdKY0tSZgp4T0RjZVJEMGwrbWQ2ODllQnlKQXNMQ0ZqdEc1SWxrWE1SYmc4SVlyb3RlajlPMWowQWxWdWhuWWxyUjd5ZXM5CjRhMmF4RmkzbmgrWlVLcGw3bmpiMWFYNExPbFAwMUVQSG1qZEhwWE0ya0QvemVib3MzNkNOc2x5OTRNSHhCOXgKb1JiS0V3c2N3YVZ0d1dPZTIvTmxlY2U0NnE2Yk0xbzhyTk9LRDhrdDFhRllKc2l5WHZnbFdRSGlBV2lkdWhHZwpKRTVSc3hhSjRGcFYvOERWMlFmRWYwU2xnUEluTC9jWFhSc3FZSWtZSlFrQ0F3RUFBYU5XTUZRd0RnWURWUjBQCkFRSC9CQVFEQWdXZ01CTUdBMVVkSlFRTU1Bb0dDQ3NHQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd0h3WUQKVlIwakJCZ3dGb0FVZHVwdUs5WENVNXZIM0ZCMlJ1TVRMQ3Y1NnY0d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQgpBQ0NKd1g1V28xZCtzVVFBTDBQR2FHOExyUCtoYjRYMG1DYVZhSFBmcWZUZjlYVE5aUmZsYW9neEh1c3VBcnNsCmwyTnBKeW1RNmlCU1hpV2VWSythZllheFVFaXdBMXdicUVDQWRBOGxhMU9IYlVIZS9VejRYZ0JaclpYKy8xWWwKVWR5YWl0NDEvaVZBelBYdHJHN2lKOHJ2ZDV0MDBFdkNwZ2tZQ3FjY1k3WTJndm5qczV2T1R2QlRwZ0kwd1VGZwpnTUZ4Tk5OMHJRd2VoR2xuYWNSTnRISGFTM0sweHByMXY2WmIwc1RGcyszeXYwTGNBMStxZy9TRmZXWS9LNlorClVic2lvRHJwV3BseXk1YjgvRzBEVW9oMWk3WXE5M2JCMDhObE41SDl5WlcySnlDd1IvMUhOdk5JS1N4Q3ZKVUIKanlwQTlJQlZRUTRNRUhkdStlVnowQmM9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBNDJOL05id1ZwMjJZM2ZSVlNnb1puNzFGZ0x4bWs3cTNFOVdpenh5Mmp6aWxTQlk4ClBublZ1QmowQW9qOTRibGQ3akovTTJlWjJ5aGd0ZHR5aU9wNm1oOVBXNEQzMElZSk52bkgzQWx3cEYvRTROeDUKRVBTWDZaM3J6MTRISWtDd3NJV08wYmtpV1JjeEZ1RHdoaXVpMTZQMDdXUFFDVlc2R2RpV3RIdko2ejNoclpyRQpXTGVlSDVsUXFtWHVlTnZWcGZnczZVL1RVUThlYU4wZWxjemFRUC9ONXVpemZvSTJ5WEwzZ3dmRUgzR2hGc29UCkN4ekJwVzNCWTU3YjgyVjV4N2pxcnBzeldqeXMwNG9QeVMzVm9WZ215TEplK0NWWkFlSUJhSjI2RWFBa1RsR3oKRm9uZ1dsWC93TlhaQjhSL1JLV0E4aWN2OXhkZEd5cGdpUmdsQ1FJREFRQUJBb0lCQVFDc0pGRFRqellkY0QwQQpHczdPcEdMTnFXNEtqWlppVkVIeEJCU2pFcXVxTlVuN0RzcEF5ZDlmNVpRa3J5ejBTMjZ1dXcvTkRLdFBYSHdLCmNMMStwWFIzWlNpZ3J6dnNZdXhxOENHN2xISHdIb2hmYXNsRWFzYnVseDFEK1gwUkUwUXYvb3dtZlM5aG5zc00KOVBGaHdYc2dJUUYxRGRFYW9BbXBNMnl6NmRycytKZy9LUUFFa2RaYkUyT2JNTCtHeFV6bmF4MUF4a3R2SkxkUQpIZzFLU1h3UC9MOHhOZ0dWVERSY3pYUEUrLzMwQkp1akRqQzVENjF5eGdCVzdjWUVTTStjSHlZUzNIV2EvVWI0Clp3Ym5NMyt6RnN3TGdXekpCZXJEWnJZRDhmcGZKbjNLZFJQRncrVVhVMitkSXlvc05OR28rOTI3eXZUVW1jUGYKTWhXSnVzQUJBb0dCQVB6NkdZOGxkbTZHRkFhWHFrMXRBK25xYzQrbnpTSmdUalFNcktCYmg4M1o4cWxlMEtZUQpCQjcvRTZSaXFXTk5temhSc1J4NGtCZkZaTFBSTkFPRVZKZW5JNnF1UVdjTytOQisxbHBPRjRYMWFzK1kyWEdLClNyZkg0YnB0UDNYN3N2dmdWaUZiRzRVU1JLbGJUcUpzazJVRG9JbkU1OWVjQkpEMnVZS2Q1bVFCQW9HQkFPWWIKSGpuT0ZOUEZSRjFqZTRVRlFWYUJvb213dlB4ajg0VlE0ZUdDNm9zTVdQeDE1NDVtYkhWWDZ2cFN6VmM3ajBScQpXNGxYdHdpVGp5UWE5TkNFczdmSDV2WEs1d2ZVdlRoZEFtTy9jKy9adG9HZXJFc2Ixd29tY0ExM2poM2JGRVpoCitmZnhzWkJyNTd2ckpLbitZUncvSjJ0UTZKMk5LWHBGZnJqdkdxRUpBb0dBRXU4QkJMYXdFM3VUZWg3Vnp5K3UKa0U1TTBkNmtPc05zZHZiUDRMeVpBRzRrZkVxdFlSQm56bzRXd1VIbEhacU1XSDI4dkwzRlF4SXlCRWRQRmtoTQpNSUdBNk9CYjRzTzdHWmUwb1ZPZzdSUytKc1Z3Mk0rWjRnRml3cG8wbXJiNDRXTWI3eWtyZVIweDZGNytGcWY3CnJCN0dZQ2xObE5TSGZ2WUlVbDlSQkFFQ2dZRUFveFJQREtxNGFnbmgxTW4vclp1MjNjZE1XWWRQdVJSaGIzZU8KVHRRSXcvbEJTOU9JQTQwbGl0aC9hVitydGdvNUZFVElrUU1BYm15ZHd2bnp6YUJ1K200TGl1RjljVGhkem9kawpmU3NmMExvY3RhcXQ4eUZNK3gyWXhvS1h6eU1JTUlXWnNoYXlRR2VwT2E2Q01wUmRZTGFGaW5JeUdnOEVlV3F0ClVBWHRlbmtDZ1lBdFBNa1JpUkNQMDd1dzBHbGVhcXRqL3NGSklQNzFNaWxzNnF5M2Y5ZEJBTUdoSFlWYmRMODIKTlNlWjJUMlNncWpSNXRyZk1ZSkcvdnEwRzd5cHNoaVUwQWY3ZHpCdVdXUkZKOXMvdWZlalBRZDFxVHQrc2dQSwpOZVFoVjBrOWwwRmE0NEFDYWZBdUxMdmpvYVZlL2hLb01VUytOaHJmVVdFVXdYRkJOMnZQUFE9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
[root@aminglinux01 ~]#
As you can see, the certificate-authority data in this file is the same as in the kube-controller-manager kubeconfig.
5) Kubelet
The kubelet also uses a kubeconfig:
[root@aminglinux01 ~]# cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJME1EY3dOREU0TkRnME5sb1hEVE0wTURjd01qRTRORGcwTmxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTUhnCjBFOHRLVlcvdXJ4QW5vc2VNZ2xKeGlQOGNsMnR6RjdQaTZjVldwdWdkUSt2OWh3Tk9tRWhvRTY0NXJMNG1LaFcKd09lOGF5RjFsUWFPN0pZcE5qQ3FVeURZbjhncW82M3JIbU5NNDh5OFpuQkYwNmVhandSbFZLL0toMCtzVk9UdgpsWGtYclljSkxGMmdwc2YwNGxxQUkzN1BNd1RPZVlUTC9lTHMva0dqWGZ0L2EzenJ0R0xJY09lZmkzWHRhSU84CmFhNnpKMnd4eEwrb0VtdXhyZHE3Qi91SEVMS2U0NGdvNUJmZ3FudEFLeTBtK1pVTlB5VE9OSEtIZnAzN3pEdXoKOWFNeVQwWUJxSFFxRjRZMzlOUFRrRnE4L0hVOGFxcFl0SGFtMWQ0czdTTDVhTDVHUVcvbmlEYS9nMGQrM3dseQpRaUlDbGpWN2k3eUNDaTM2c3ZrQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZIYnFiaXZWd2xPYng5eFFka2JqRXl3citlcitNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQk9xZ0VSU3AwMmxYYVRia1U3VgpITHp4SWdyUXBXdm1oMWJFMGRpTTdhaDgzMW9YRm9PclBqWkZDdnU5eW5KVUxQVkQwa0ZoaCtBanlIT09JR0FECjE2MlJWZFRmaXpZOXZjcWZsdGg3ZHVGcCtSYnc1UFU2QU5Jc0lOWEVnQzdPNGxuRTQrRWY3bmlBMkt5SGlpUk4KSzhjR3lPc3crVmlRY2JPMVdwRmlKdlkxNmhmTkdueHBvdjhIRjNFRFYrS3VqMm9vc2hyTDR2bGVFM3BxRlYxUQp1enlnZFBVdTdWd00zZ3Q1czNTbmVrTkJ5NFA5T1M3cGo0d0RobkdJb3dLT01maXR6RUdpSEcxNm5BbytwZUdCCnA5T3ZHVWJvK1BJc1NzL1JFeUcyU2grTUFCaHo0dmhDSEJSY2hUdXZjeXJIdHVxdVF3U1VQeDZhWU96Q2FXalMKeVJ3PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://192.168.100.151:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: system:node:aminglinux01
name: system:node:aminglinux01@kubernetes
current-context: system:node:aminglinux01@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:aminglinux01
user:
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
[root@aminglinux01 ~]#
The certificate-authority-data here matches that of the components above. The user section at the bottom contains client-certificate and client-key, which point to the certificate the kubelet uses when acting as a client.
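Since the kubelet's client certificate lives on disk (and is rotated automatically), it can be inspected directly:
openssl x509 -noout -subject -dates -in /var/lib/kubelet/pki/kubelet-client-current.pem
# the subject is of the form O=system:nodes, CN=system:node:<hostname>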
2. Renewing certificates
Certificates have a limited lifetime, and an expired certificate will break the cluster and the workloads on it. How do you check when a certificate expires?
openssl x509 -noout -dates -in /etc/kubernetes/pki/apiserver.crt
[root@aminglinux01 ~]# openssl x509 -noout -dates -in /etc/kubernetes/pki/apiserver.crt
notBefore=Jul 4 18:48:46 2024 GMT
notAfter=Jul 4 18:48:46 2025 GMT
[root@aminglinux01 ~]#
As you can see, this certificate is valid for one year.
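To check every certificate under /etc/kubernetes/pki at once, a small loop is enough:
for cert in /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt; do
  echo "== $cert"
  openssl x509 -noout -enddate -in "$cert"
done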
If your Kubernetes cluster was built with kubeadm, there is another option: use kubeadm to check the expiry of all cluster certificates at once:
[root@aminglinux01 ~]# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'

CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Jul 04, 2025 18:48 UTC 330d ca no
apiserver Jul 04, 2025 18:48 UTC 330d ca no
apiserver-etcd-client Jul 04, 2025 18:48 UTC 330d etcd-ca no
apiserver-kubelet-client Jul 04, 2025 18:48 UTC 330d ca no
controller-manager.conf Jul 04, 2025 18:48 UTC 330d ca no
etcd-healthcheck-client Jul 04, 2025 18:48 UTC 330d etcd-ca no
etcd-peer Jul 04, 2025 18:48 UTC 330d etcd-ca no
etcd-server Jul 04, 2025 18:48 UTC 330d etcd-ca no
front-proxy-client Jul 04, 2025 18:48 UTC 330d front-proxy-ca no
scheduler.conf Jul 04, 2025 18:48 UTC 330d ca no

CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Jul 02, 2034 18:48 UTC 9y no
etcd-ca Jul 02, 2034 18:48 UTC 9y no
front-proxy-ca Jul 02, 2034 18:48 UTC 9y no
[root@aminglinux01 ~]#
If the certificates are about to expire, kubeadm can renew them:
kubeadm certs renew all
The output ends with a reminder: "You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates."
So restart kube-apiserver, kube-controller-manager, kube-scheduler and etcd (one way to do this for the static-Pod control plane is sketched after the output below), then run kubeadm certs check-expiration again to confirm the new expiry dates:
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Aug 08, 2025 17:25 UTC 364d ca no
apiserver Aug 08, 2025 17:25 UTC 364d ca no
apiserver-etcd-client Aug 08, 2025 17:25 UTC 364d etcd-ca no
apiserver-kubelet-client Aug 08, 2025 17:25 UTC 364d ca no
controller-manager.conf Aug 08, 2025 17:25 UTC 364d ca no
etcd-healthcheck-client Aug 08, 2025 17:25 UTC 364d etcd-ca no
etcd-peer Aug 08, 2025 17:25 UTC 364d etcd-ca no
etcd-server Aug 08, 2025 17:25 UTC 364d etcd-ca no
front-proxy-client Aug 08, 2025 17:25 UTC 364d front-proxy-ca no
scheduler.conf Aug 08, 2025 17:25 UTC 364d ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Jul 02, 2034 18:48 UTC 9y no
etcd-ca Jul 02, 2034 18:48 UTC 9y no
front-proxy-ca Jul 02, 2034 18:48 UTC 9y no
[root@aminglinux01 ~]#
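Because the kubeadm control-plane components run as static Pods, "restarting" them means letting the kubelet recreate them. One common way to do that (a sketch; kubectl will be briefly unavailable while the apiserver manifest is moved out) is to move the manifests out of the static-Pod directory and back:
cd /etc/kubernetes/manifests
mkdir -p /tmp/manifests-backup
mv kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml etcd.yaml /tmp/manifests-backup/
sleep 20                                          # give the kubelet time to stop the old Pods
mv /tmp/manifests-backup/*.yaml /etc/kubernetes/manifests/
kubectl get pod -n kube-system                    # confirm the control-plane Pods are Running again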
V. Kubernetes Cluster Version Upgrade
1. Why upgrade
① To get new features
② The current version has bugs
③ The current version has security vulnerabilities
2. Caveats:
① Skipping versions is not supported (this refers to the major and minor version; in 1.24.2, 1 is the major version, 24 the minor version and 2 the patch version). Examples:
1.20.2 --> 1.21.4  supported
1.20.2 --> 1.22.3  not supported
1.25.0 --> 1.25.4  supported
② Make a backup before upgrading (see the etcd snapshot sketch below)
③ Rehearse the upgrade in a test environment first
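For the backup in ②, a common approach on a kubeadm cluster is an etcd snapshot plus a copy of the PKI directory (a sketch run on the control-plane node; the certificate paths match the etcd flags shown in the previous chapter, and /backup is just an example directory):
mkdir -p /backup
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key
cp -a /etc/kubernetes/pki /backup/pki-$(date +%F)   # back up the certificates as well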
3. Upgrade flow
① At the node level
Upgrade the Master k8s01 first (if there are multiple Masters, upgrade them one at a time) --> then upgrade the worker nodes k8s02 and k8s03
② At the software level
Upgrade kubeadm --> drain the node --> upgrade the components (apiserver, coredns, kube-proxy, controller-manager, scheduler, etc.) --> uncordon the node --> upgrade kubelet and kubectl
Official upgrade documentation: https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
4. Upgrade steps
Example: 1.26.2 --> 1.27.2
① Upgrade the Master
Check the available versions
yum list --showduplicates kubeadm
Upgrade kubeadm
yum install kubeadm-1.27.2-0   ## the version number must be specified
[root@aminglinux01 ~]# yum install kubeadm-1.27.2-0
Last metadata expiration check: 1:22:49 ago on Fri 09 Aug 2024 12:11:30 AM CST.
Dependencies resolved.
=================================================================================================================================================
Package Architecture Version Repository Size
=================================================================================================================================================
Upgrading:
kubeadm x86_64 1.27.2-0 kubernetes 11 M
Transaction Summary
=================================================================================================================================================
Upgrade 1 Package
Total download size: 11 M
Is this ok [y/N]: y
Downloading Packages:
fd2bd2879d4a52a4a57ebfd7ea15f5617ce14944a3a81aa8285d4b34cd4c565f-kubeadm-1.27.2-0.x86_64.rpm 242 kB/s | 11 MB 00:44
-------------------------------------------------------------------------------------------------------------------------------------------------
Total 242 kB/s | 11 MB 00:44
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: kubeadm-1.27.2-0.x86_64 1/1
Upgrading : kubeadm-1.27.2-0.x86_64 1/2
Cleanup : kubeadm-1.26.2-0.x86_64 2/2
Running scriptlet: kubeadm-1.26.2-0.x86_64 2/2
Verifying : kubeadm-1.27.2-0.x86_64 1/2
Verifying : kubeadm-1.26.2-0.x86_64 2/2
Upgraded:
kubeadm-1.27.2-0.x86_64
Complete!
[root@aminglinux01 ~]#
Drain the Pods on the Master:
[root@aminglinux01 ~]# kubectl drain aminglinux01 --ignore-daemonsets
node/aminglinux01 cordoned
Warning: ignoring DaemonSet-managed Pods: default/node-exporter-h4ntw, kube-system/calico-node-9ltgn, kube-system/kube-proxy-bklnw, metallb-system/speaker-lzst8
evicting pod kube-system/coredns-567c556887-vgsth
evicting pod kube-system/calico-kube-controllers-57b57c56f-h2znw
evicting pod kube-system/coredns-567c556887-pqv8h
pod/calico-kube-controllers-57b57c56f-h2znw evicted
pod/coredns-567c556887-vgsth evicted
pod/coredns-567c556887-pqv8h evicted
node/aminglinux01 drained
[root@aminglinux01 ~]#
Check whether the cluster can be upgraded:
kubeadm upgrade plan
[root@aminglinux01 ~]# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.26.2
[upgrade/versions] kubeadm version: v1.27.2
I0809 01:36:46.354043 1788938 version.go:256] remote version is much newer: v1.30.3; falling back to: stable-1.27
[upgrade/versions] Target version: v1.27.16
[upgrade/versions] Latest version in the v1.26 series: v1.26.15
W0809 01:36:51.822287 1788938 compute.go:307] [upgrade/versions] could not find officially supported version of etcd for Kubernetes v1.27.16, falling back to the nearest etcd version (3.5.7-0)
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT TARGET
kubelet 3 x v1.26.2 v1.26.15
Upgrade to the latest version in the v1.26 series:
COMPONENT CURRENT TARGET
kube-apiserver v1.26.2 v1.26.15
kube-controller-manager v1.26.2 v1.26.15
kube-scheduler v1.26.2 v1.26.15
kube-proxy v1.26.2 v1.26.15
CoreDNS v1.9.3 v1.10.1
etcd 3.5.6-0 3.5.7-0
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.26.15
_____________________________________________________________________
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT TARGET
kubelet 3 x v1.26.2 v1.27.16
Upgrade to the latest stable version:
COMPONENT CURRENT TARGET
kube-apiserver v1.26.2 v1.27.16
kube-controller-manager v1.26.2 v1.27.16
kube-scheduler v1.26.2 v1.27.16
kube-proxy v1.26.2 v1.27.16
CoreDNS v1.9.3 v1.10.1
etcd 3.5.6-0 3.5.7-0
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.27.16
Note: Before you can perform this upgrade, you have to update kubeadm to v1.27.16.
_____________________________________________________________________
The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.
API GROUP CURRENT VERSION PREFERRED VERSION MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io v1alpha1 v1alpha1 no
kubelet.config.k8s.io v1beta1 v1beta1 no
_____________________________________________________________________
[root@aminglinux01 ~]#
Perform the upgrade:
kubeadm upgrade apply v1.27.2
[root@aminglinux01 ~]# kubeadm upgrade apply v1.27.2
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.27.2"
[upgrade/versions] Cluster version: v1.26.2
[upgrade/versions] kubeadm version: v1.27.2
[upgrade] Are you sure you want to proceed? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
W0809 01:38:00.168288 1789565 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.27.2" (timeout: 5m0s)...
[upgrade/etcd] Upgrading to TLS for etcd
W0809 01:38:29.786288 1789565 staticpods.go:305] [upgrade/etcd] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
W0809 01:38:29.806485 1789565 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Renewing etcd-server certificate
[upgrade/staticpods] Renewing etcd-peer certificate
[upgrade/staticpods] Renewing etcd-healthcheck-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-08-09-01-38-25/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests2371868498"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-08-09-01-38-25/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-08-09-01-38-25/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-08-09-01-38-25/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upgrade] Backing up kubelet config file to /etc/kubernetes/tmp/kubeadm-kubelet-config3142449917/config.yaml
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.27.2". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
[root@aminglinux01 ~]#
Upgrade kubelet and kubectl:
yum install -y kubelet-1.27.2-0 kubectl-1.27.2-0
[root@aminglinux01 ~]# yum install -y kubelet-1.27.2-0 kubectl-1.27.2-0
Last metadata expiration check: 1:41:58 ago on Fri 09 Aug 2024 12:11:30 AM CST.
Dependencies resolved.
=================================================================================================================================================
Package Architecture Version Repository Size
=================================================================================================================================================
Upgrading:
kubectl x86_64 1.27.2-0 kubernetes 11 M
kubelet x86_64 1.27.2-0 kubernetes 20 M
Transaction Summary
=================================================================================================================================================
Upgrade 2 Packages
Total download size: 31 M
Downloading Packages:
(1/2): e9c9fbb0572c93dfed395364783d069d1ecede325cec5de46324139b55567e75-kubectl-1.27.2-0.x86_64.rpm 243 kB/s | 11 MB 00:45
(2/2): 9b9120983b1691b7f47b51f43a276207ab83128945a28f3dfa28adc00bba7df4-kubelet-1.27.2-0.x86_64.rpm 246 kB/s | 20 MB 01:23
-------------------------------------------------------------------------------------------------------------------------------------------------
Total 380 kB/s | 31 MB 01:23
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: kubelet-1.27.2-0.x86_64 1/1
Upgrading : kubelet-1.27.2-0.x86_64 1/4
Upgrading : kubectl-1.27.2-0.x86_64 2/4
Cleanup : kubectl-1.26.2-0.x86_64 3/4
Cleanup : kubelet-1.26.2-0.x86_64 4/4
Running scriptlet: kubelet-1.26.2-0.x86_64 4/4
Verifying : kubectl-1.27.2-0.x86_64 1/4
Verifying : kubectl-1.26.2-0.x86_64 2/4
Verifying : kubelet-1.27.2-0.x86_64 3/4
Verifying : kubelet-1.26.2-0.x86_64 4/4
Upgraded:
kubectl-1.27.2-0.x86_64 kubelet-1.27.2-0.x86_64
Complete!
[root@aminglinux01 ~]#
Restart kubelet
systemctl daemon-reload
systemctl restart kubelet
Uncordon the node to resume scheduling and bring it back online
kubectl uncordon aminglinux01
[root@aminglinux01 ~]# kubectl uncordon aminglinux01
node/aminglinux01 uncordoned
[root@aminglinux01 ~]#
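A quick optional sanity check (not part of the original transcript) before starting on the worker nodes:
kubectl get node aminglinux01     # expect STATUS Ready (no SchedulingDisabled) and VERSION v1.27.2
systemctl is-active kubelet       # run on aminglinux01; should print "active"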
② Upgrade the first worker node
Upgrade kubeadm (run on aminglinux02)
yum install kubeadm-1.27.2-0   ## the version number must be specified explicitly
[root@aminglinux02 ~]# yum install kubeadm-1.27.2-0
Last metadata expiration check: 4:23:28 ago on Thu 08 Aug 2024 09:35:04 PM CST.
Dependencies resolved.
============================================================================================================================================================================
Package Architecture Version Repository Size
============================================================================================================================================================================
Upgrading:
kubeadm x86_64 1.27.2-0 kubernetes 11 M
Transaction Summary
============================================================================================================================================================================
Upgrade 1 Package
Total download size: 11 M
Is this ok [y/N]: y
Downloading Packages:
fd2bd2879d4a52a4a57ebfd7ea15f5617ce14944a3a81aa8285d4b34cd4c565f-kubeadm-1.27.2-0.x86_64.rpm 300 kB/s | 11 MB 00:35
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 300 kB/s | 11 MB 00:35
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: kubeadm-1.27.2-0.x86_64 1/1
Upgrading : kubeadm-1.27.2-0.x86_64 1/2
Cleanup : kubeadm-1.26.2-0.x86_64 2/2
Running scriptlet: kubeadm-1.26.2-0.x86_64 2/2
Verifying : kubeadm-1.27.2-0.x86_64 1/2
Verifying : kubeadm-1.26.2-0.x86_64 2/2
Upgraded:
kubeadm-1.27.2-0.x86_64
Complete!
[root@aminglinux02 ~]#
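Optionally verify the binary that will drive this node's upgrade (a simple check, not from the original steps):
kubeadm version -o short   # run on aminglinux02; expect v1.27.2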
Evict the Pods on aminglinux02 (run on aminglinux01):
kubectl drain aminglinux02 --ignore-daemonsets --delete-emptydir-data --force   # --force is needed here because the node runs bare Pods with no controller (ngnix, pod-demo)
[root@aminglinux01 ~]# kubectl drain aminglinux02 --ignore-daemonsets --delete-emptydir-data --force
node/aminglinux02 already cordoned
Warning: deleting Pods that declare no controller: default/ngnix, default/pod-demo; ignoring DaemonSet-managed Pods: default/node-exporter-9cn2c, kube-system/calico-node-m9rq9, kube-system/kube-proxy-cwmb6, metallb-system/speaker-bq7zd
evicting pod yeyunyi/lucky1-5cf7f459cf-jmvnr
evicting pod default/kube-state-metrics-75778cdfff-2vdkl
evicting pod default/myharbor-core-b9d48ccdd-v9jdz
evicting pod default/myharbor-portal-ff7fd4949-lj6jw
evicting pod default/myharbor-registry-5b59458d9-4j79b
evicting pod default/lucky-6cdcf8b9d4-t5r66
evicting pod default/ngnix
evicting pod kube-system/coredns-65dcc469f7-ngd26
evicting pod default/pod-demo
evicting pod kube-system/metrics-server-76467d945-9jtxx
evicting pod default/prometheus-alertmanager-0
evicting pod default/prometheus-consul-0
evicting pod yeyunyi/lucky-6cdcf8b9d4-8fql2
evicting pod default/prometheus-consul-2
evicting pod default/prometheus-nginx-exporter-bbf5d8b8b-s8hvl
evicting pod yeyunyi/lucky-6cdcf8b9d4-mxm97
evicting pod default/redis-sts-0
evicting pod default/grafana-784469b9b9-4htvz
evicting pod yeyunyi/lucky1-5cf7f459cf-dv5fz
evicting pod yeyunyi/lucky-6cdcf8b9d4-g8p26
evicting pod yeyunyi/lucky1-5cf7f459cf-7t2gj
I0809 02:00:57.909576 1803841 request.go:696] Waited for 1.000351519s due to client-side throttling, not priority and fairness, request: POST:https://192.168.100.151:6443/api/v1/namespaces/kube-system/pods/metrics-server-76467d945-9jtxx/eviction
error when evicting pods/"prometheus-consul-2" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod default/prometheus-consul-2
error when evicting pods/"prometheus-consul-2" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/prometheus-nginx-exporter-bbf5d8b8b-s8hvl evicted
pod/lucky1-5cf7f459cf-jmvnr evicted
pod/metrics-server-76467d945-9jtxx evicted
pod/kube-state-metrics-75778cdfff-2vdkl evicted
I0809 02:01:08.070088 1803841 request.go:696] Waited for 3.966077621s due to client-side throttling, not priority and fairness, request: GET:https://192.168.100.151:6443/api/v1/namespaces/default/pods/myharbor-registry-5b59458d9-4j79b
pod/grafana-784469b9b9-4htvz evicted
evicting pod default/prometheus-consul-2
error when evicting pods/"prometheus-consul-2" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/lucky-6cdcf8b9d4-8fql2 evicted
pod/lucky1-5cf7f459cf-7t2gj evicted
pod/lucky-6cdcf8b9d4-mxm97 evicted
pod/lucky-6cdcf8b9d4-t5r66 evicted
pod/myharbor-portal-ff7fd4949-lj6jw evicted
pod/lucky1-5cf7f459cf-dv5fz evicted
pod/pod-demo evicted
pod/redis-sts-0 evicted
pod/myharbor-registry-5b59458d9-4j79b evicted
pod/ngnix evicted
pod/coredns-65dcc469f7-ngd26 evicted
evicting pod default/prometheus-consul-2
error when evicting pods/"prometheus-consul-2" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/myharbor-core-b9d48ccdd-v9jdz evicted
pod/lucky-6cdcf8b9d4-g8p26 evicted
pod/prometheus-consul-0 evicted
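The repeated "Cannot evict pod as it would violate the pod's disruption budget" messages come from a PodDisruptionBudget protecting the consul StatefulSet; drain keeps retrying until the eviction is allowed. If the retries never complete, you can inspect the budget, a sketch assuming the PDB lives in the same namespace as the Pod:
kubectl get pdb -A                # list all PodDisruptionBudgets and their allowed disruptions
kubectl -n default describe pdb   # check "Allowed disruptions" for the consul PDB
If allowed disruptions stays at 0, either wait for the other consul replicas to become Ready or temporarily relax the PDB before continuing the drain.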
Upgrade the kubelet configuration (run on aminglinux02)
kubeadm upgrade node
[root@aminglinux02 ~]# kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks
[preflight] Skipping prepull. Not a control plane node.
[upgrade] Skipping phase. Not a control plane node.
[upgrade] Backing up kubelet config file to /etc/kubernetes/tmp/kubeadm-kubelet-config786583419/config.yaml
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.
[root@aminglinux02 ~]#
Upgrade kubelet and kubectl (run on aminglinux02)
yum install -y kubelet-1.27.2-0 kubectl-1.27.2-0
[root@aminglinux02 ~]# yum install -y kubelet-1.27.2-0 kubectl-1.27.2-0
Last metadata expiration check: 4:31:08 ago on Thu 08 Aug 2024 09:35:04 PM CST.
Dependencies resolved.
============================================================================================================================================================================
Package Architecture Version Repository Size
============================================================================================================================================================================
Upgrading:
kubectl x86_64 1.27.2-0 kubernetes 11 M
kubelet x86_64 1.27.2-0 kubernetes 20 M
Transaction Summary
============================================================================================================================================================================
Upgrade 2 Packages
Total download size: 31 M
Downloading Packages:
(1/2): e9c9fbb0572c93dfed395364783d069d1ecede325cec5de46324139b55567e75-kubectl-1.27.2-0.x86_64.rpm 376 kB/s | 11 MB 00:29
(2/2): 9b9120983b1691b7f47b51f43a276207ab83128945a28f3dfa28adc00bba7df4-kubelet-1.27.2-0.x86_64.rpm 403 kB/s | 20 MB 00:50
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 623 kB/s | 31 MB 00:50
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: kubelet-1.27.2-0.x86_64 1/1
Upgrading : kubelet-1.27.2-0.x86_64 1/4
Upgrading : kubectl-1.27.2-0.x86_64 2/4
Cleanup : kubectl-1.26.2-0.x86_64 3/4
Cleanup : kubelet-1.26.2-0.x86_64 4/4
Running scriptlet: kubelet-1.26.2-0.x86_64 4/4
Verifying : kubectl-1.27.2-0.x86_64 1/4
Verifying : kubectl-1.26.2-0.x86_64 2/4
Verifying : kubelet-1.27.2-0.x86_64 3/4
Verifying : kubelet-1.26.2-0.x86_64 4/4
Upgraded:
kubectl-1.27.2-0.x86_64 kubelet-1.27.2-0.x86_64
Complete!
[root@aminglinux02 ~]#
Restart kubelet (run on aminglinux02)
systemctl daemon-reload
systemctl restart kubelet
Uncordon the node to resume scheduling and bring it back online (run on aminglinux01)
kubectl uncordon aminglinux02
[root@aminglinux01 ~]# kubectl uncordon aminglinux02
node/aminglinux02 uncordoned
③ Upgrade the second worker node
Upgrade kubeadm (run on aminglinux03)
yum install kubeadm-1.27.2-0   ## the version number must be specified explicitly
[root@aminglinux03 ~]# yum install kubeadm-1.27.2-0
Last metadata expiration check: 2:48:44 ago on Thu 08 Aug 2024 11:31:47 PM CST.
Dependencies resolved.
============================================================================================================================================================================
Package Architecture Version Repository Size
============================================================================================================================================================================
Upgrading:
kubeadm x86_64 1.27.2-0 kubernetes 11 M
Transaction Summary
============================================================================================================================================================================
Upgrade 1 Package
Total download size: 11 M
Is this ok [y/N]: y
Downloading Packages:
fd2bd2879d4a52a4a57ebfd7ea15f5617ce14944a3a81aa8285d4b34cd4c565f-kubeadm-1.27.2-0.x86_64.rpm 399 kB/s | 11 MB 00:27
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 399 kB/s | 11 MB 00:27
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: kubeadm-1.27.2-0.x86_64 1/1
Upgrading : kubeadm-1.27.2-0.x86_64 1/2
Cleanup : kubeadm-1.26.2-0.x86_64 2/2
Running scriptlet: kubeadm-1.26.2-0.x86_64 2/2
Verifying : kubeadm-1.27.2-0.x86_64 1/2
Verifying : kubeadm-1.26.2-0.x86_64 2/2
Upgraded:
kubeadm-1.27.2-0.x86_64
Complete!
Evict the Pods on aminglinux03 (the transcript below was run on aminglinux03; any node with admin kubectl access works)
kubectl drain aminglinux03 --ignore-daemonsets --delete-emptydir-data
If the drain runs into problems (for example bare Pods without a controller), you can delete the problematic Pods manually and re-run the command; see the sketch after the failed attempt below:
[root@aminglinux03 ~]# kubectl drain aminglinux03 --ignore-daemonsets --delete-emptydir-data
node/aminglinux03 cordoned
error: unable to drain node "aminglinux03" due to error:cannot delete Pods declare no controller (use --force to override): default/nginx, default/pod-demo1, yeyunyi/quota-pod, continuing command...
There are pending nodes to be drained:
aminglinux03
cannot delete Pods declare no controller (use --force to override): default/nginx, default/pod-demo1, yeyunyi/quota-pod
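The first attempt fails because those three Pods have no controller. One option is to re-run the drain with --force, as was done on aminglinux02; the other is the manual cleanup mentioned above. A sketch using the Pod names reported by the error:
# delete the bare Pods blocking the drain, then run the drain again
kubectl delete pod nginx pod-demo1 -n default
kubectl delete pod quota-pod -n yeyunyi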
[root@aminglinux03 ~]# kubectl drain aminglinux03 --ignore-daemonsets --delete-emptydir-data
node/aminglinux03 already cordoned
Warning: ignoring DaemonSet-managed Pods: default/node-exporter-wvp2h, kube-system/calico-node-4wnj6, kube-system/kube-proxy-zk49k, metallb-system/speaker-6447c
evicting pod yeyunyi/lucky1-5cf7f459cf-wqcg7
evicting pod default/lucky-6cdcf8b9d4-5s75m
evicting pod default/myharbor-core-b9d48ccdd-vxnmf
evicting pod default/myharbor-jobservice-6f5dbfcc4f-q852z
evicting pod default/myharbor-nginx-65b8c5764d-vz4vn
evicting pod default/myharbor-postgresql-0
evicting pod default/myharbor-redis-master-0
evicting pod default/myharbor-trivy-0
evicting pod default/prometheus-consul-1
evicting pod default/prometheus-nginx-exporter-bbf5d8b8b-t66qq
evicting pod default/prometheus-server-755b857b5-fbw6m
evicting pod default/redis-sts-0
evicting pod default/redis-sts-1
evicting pod kube-system/calico-kube-controllers-57b57c56f-fn7st
evicting pod kube-system/coredns-65dcc469f7-xl8r7
evicting pod kube-system/metrics-server-76467d945-vnjxl
evicting pod kube-system/nfs-client-provisioner-d79cfd7f6-q2n4z
evicting pod metallb-system/controller-759b6c5bb8-hk2wq
evicting pod yeyunyi/lucky-6cdcf8b9d4-465w6
evicting pod yeyunyi/lucky-6cdcf8b9d4-4ws4c
evicting pod yeyunyi/lucky-6cdcf8b9d4-6vc5g
evicting pod yeyunyi/lucky-6cdcf8b9d4-kttgb
evicting pod yeyunyi/lucky-6cdcf8b9d4-p5fwb
evicting pod yeyunyi/lucky1-5cf7f459cf-6p9xz
evicting pod yeyunyi/lucky1-5cf7f459cf-6zdqj
evicting pod yeyunyi/lucky1-5cf7f459cf-bcsl5
evicting pod yeyunyi/lucky1-5cf7f459cf-hh7q9
I0809 02:24:44.420796 1732687 request.go:690] Waited for 1.007525608s due to client-side throttling, not priority and fairness, request: POST:https://192.168.100.151:6443/api/v1/namespaces/kube-system/pods/coredns-65dcc469f7-xl8r7/eviction
pod/controller-759b6c5bb8-hk2wq evicted
pod/myharbor-nginx-65b8c5764d-vz4vn evicted
pod/calico-kube-controllers-57b57c56f-fn7st evicted
pod/lucky1-5cf7f459cf-wqcg7 evicted
pod/myharbor-jobservice-6f5dbfcc4f-q852z evicted
I0809 02:24:54.527812 1732687 request.go:690] Waited for 4.889179594s due to client-side throttling, not priority and fairness, request: GET:https://192.168.100.151:6443/api/v1/namespaces/default/pods/prometheus-nginx-exporter-bbf5d8b8b-t66qq
pod/lucky-6cdcf8b9d4-5s75m evicted
pod/prometheus-nginx-exporter-bbf5d8b8b-t66qq evicted
pod/myharbor-core-b9d48ccdd-vxnmf evicted
pod/lucky1-5cf7f459cf-6zdqj evicted
pod/myharbor-trivy-0 evicted
pod/myharbor-redis-master-0 evicted
pod/redis-sts-0 evicted
pod/coredns-65dcc469f7-xl8r7 evicted
pod/lucky-6cdcf8b9d4-kttgb evicted
pod/lucky-6cdcf8b9d4-p5fwb evicted
pod/lucky1-5cf7f459cf-6p9xz evicted
pod/prometheus-consul-1 evicted
I0809 02:25:04.527850 1732687 request.go:690] Waited for 2.748890807s due to client-side throttling, not priority and fairness, request: GET:https://192.168.100.151:6443/api/v1/namespaces/default/pods/prometheus-server-755b857b5-fbw6m
pod/lucky1-5cf7f459cf-hh7q9 evicted
pod/myharbor-postgresql-0 evicted
pod/lucky-6cdcf8b9d4-6vc5g evicted
pod/lucky1-5cf7f459cf-bcsl5 evicted
pod/nfs-client-provisioner-d79cfd7f6-q2n4z evicted
pod/lucky-6cdcf8b9d4-4ws4c evicted
pod/redis-sts-1 evicted
pod/metrics-server-76467d945-vnjxl evicted
pod/lucky-6cdcf8b9d4-465w6 evicted
pod/prometheus-server-755b857b5-fbw6m evicted
node/aminglinux03 drained
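Once the node reports drained, you can optionally confirm that only DaemonSet-managed Pods are still scheduled on it (a sketch using a field selector, not part of the original steps):
kubectl get pods -A -o wide --field-selector spec.nodeName=aminglinux03
# only DaemonSet Pods (calico-node, kube-proxy, node-exporter, speaker) should remain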
Upgrade the kubelet configuration (run on aminglinux03)
kubeadm upgrade node
[root@aminglinux03 ~]# kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks
[preflight] Skipping prepull. Not a control plane node.
[upgrade] Skipping phase. Not a control plane node.
[upgrade] Backing up kubelet config file to /etc/kubernetes/tmp/kubeadm-kubelet-config2684190082/config.yaml
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.
[root@aminglinux03 ~]#
Upgrade kubelet and kubectl (run on aminglinux03)
yum install -y kubelet-1.27.2-0 kubectl-1.27.2-0
[root@aminglinux03 ~]# yum install -y kubelet-1.27.2-0 kubectl-1.27.2-0
Last metadata expiration check: 2:58:07 ago on Thu 08 Aug 2024 11:31:47 PM CST.
Dependencies resolved.
============================================================================================================================================================================
Package Architecture Version Repository Size
============================================================================================================================================================================
Upgrading:
kubectl x86_64 1.27.2-0 kubernetes 11 M
kubelet x86_64 1.27.2-0 kubernetes 20 M
Transaction Summary
============================================================================================================================================================================
Upgrade 2 Packages
Total download size: 31 M
Downloading Packages:
(1/2): e9c9fbb0572c93dfed395364783d069d1ecede325cec5de46324139b55567e75-kubectl-1.27.2-0.x86_64.rpm 472 kB/s | 11 MB 00:23
(2/2): 9b9120983b1691b7f47b51f43a276207ab83128945a28f3dfa28adc00bba7df4-kubelet-1.27.2-0.x86_64.rpm 478 kB/s | 20 MB 00:42
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 740 kB/s | 31 MB 00:42
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Running scriptlet: kubelet-1.27.2-0.x86_64 1/1
Upgrading : kubelet-1.27.2-0.x86_64 1/4
Upgrading : kubectl-1.27.2-0.x86_64 2/4
Cleanup : kubectl-1.26.2-0.x86_64 3/4
Cleanup : kubelet-1.26.2-0.x86_64 4/4
Running scriptlet: kubelet-1.26.2-0.x86_64 4/4
Verifying : kubectl-1.27.2-0.x86_64 1/4
Verifying : kubectl-1.26.2-0.x86_64 2/4
Verifying : kubelet-1.27.2-0.x86_64 3/4
Verifying : kubelet-1.26.2-0.x86_64 4/4
Upgraded:
kubectl-1.27.2-0.x86_64 kubelet-1.27.2-0.x86_64
Complete!
[root@aminglinux03 ~]#
Restart kubelet (run on aminglinux03)
systemctl daemon-reload
systemctl restart kubelet
Uncordon the node to resume scheduling and bring it back online (run on aminglinux01)
kubectl uncordon aminglinux03
[root@aminglinux01 ~]# kubectl uncordon aminglinux03
node/aminglinux03 uncordoned
[root@aminglinux01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
aminglinux01 Ready control-plane 34d v1.27.2
aminglinux02 Ready <none> 34d v1.27.2
aminglinux03 Ready <none> 34d v1.27.2
[root@aminglinux01 ~]#
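With all three nodes reporting v1.27.2, a few optional post-upgrade checks (a sketch; not part of the original steps) help confirm the cluster is healthy:
kubectl get pods -n kube-system                  # core components should all be Running
kubectl get pods -A | grep -v Running            # spot anything not Running (Completed Jobs will also show up)
kubeadm upgrade plan                             # lists the component versions now running and any newer releases available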