Problems that can occur in a production Kubernetes cluster

  • Infrastructure daemon problems: e.g. the NTP service is down
  • Hardware problems: CPU, memory, or disk failures
  • Kernel problems: kernel deadlocks, corrupted file systems
  • Container runtime problems: the runtime daemon is unresponsive

When a node hits one of these problems, the Kubernetes control-plane components across the cluster are unaware of it, so Pods continue to be scheduled onto the faulty node, leading to incidents such as service interruptions.

node-problem-detector

node-problem-detector (NPD) can be thought of as a probe that inspects node health.
To address the problems above, the community introduced the node-problem-detector daemon, which collects node problems from the various daemons and makes them visible to the layers above.
As a node-diagnosis tool for Kubernetes, it can detect anomalies such as:
• Unresponsive runtime
• Unresponsive Linux kernel
• Network anomalies
• File descriptor anomalies
• Hardware problems such as CPU, memory, or disk failures

Fault classification


Problem-reporting strategy

node-problem-detector reports problems by setting a NodeCondition (updating the node status) or by creating an Event object:

  • NodeCondition: permanent faults are reported by setting a NodeCondition, which changes the node's status
  • Event: temporary faults are reported via Events to notify the relevant objects, e.g. all Pods currently running on the node
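Both reporting channels can be inspected from the command line (the node name below is a placeholder):

```shell
# List each condition type and its status on a node
kubectl get node <node-name> \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

# Watch Events as NPD creates them
kubectl get events --watch
```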

Hands-on practice

Community project: https://github.com/kubernetes/node-problem-detector

Installing with Helm

helm repo add deliveryhero https://charts.deliveryhero.io/
helm install node-problem-detector deliveryhero/node-problem-detector

YAML manifest (original)

[root@VM-2-29-centos node-problem-detector]# cat npd-ds.yaml 
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-problem-detector-v0.1
  labels:
    k8s-app: node-problem-detector
    version: v0.1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      k8s-app: node-problem-detector
      version: v0.1
      kubernetes.io/cluster-service: "true"
  template:
    metadata:
      labels:
        k8s-app: node-problem-detector
        version: v0.1
        kubernetes.io/cluster-service: "true"
    spec:
      hostNetwork: true
      containers:
        - name: node-problem-detector
          image: cncamp/node-problem-detector:v0.8.10
          securityContext:
            privileged: true
          resources:
            limits:
              cpu: "200m"
              memory: "100Mi"
            requests:
              cpu: "20m"
              memory: "20Mi"
          volumeMounts:
            - name: log
              mountPath: /log
              readOnly: true
      volumes:
        - name: log
          hostPath:
            path: /var/log/

[root@VM-2-29-centos node-problem-detector]# kubectl get node 10.0.2.29 -oyaml
...
  conditions:
  - lastHeartbeatTime: "2022-02-23T07:36:39Z"
    lastTransitionTime: "2022-02-12T08:32:04Z"
    message: Containerd service is up
    reason: ContainerdIsUp
    status: "False"
    type: ContainerdProblem
  - lastHeartbeatTime: "2022-02-23T07:36:39Z"
    lastTransitionTime: "2022-02-12T08:32:04Z"
    message: FD is Under Pressure
    reason: FDUnderPressure
    status: "False"
    type: FDPressure
...

After detecting a node anomaly, NPD records the details in the node's conditions. This only updates node status; there is no automatic remediation, so the loop is not closed on its own. To close it, you can launch NPD via an addon Pod, integrate it with a monitoring/alerting system, or build a custom controller.
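As a minimal sketch of the custom-controller idea, the loop below polls node conditions and taints any node whose KernelDeadlock condition is True. The taint key and the polling approach are illustrative; a real controller would watch the API server rather than poll.

```shell
#!/usr/bin/env bash
# Poll node conditions and taint faulty nodes so the scheduler avoids them.
# "npd/kernel-deadlock" is an illustrative taint key, not a standard one.
while true; do
  for node in $(kubectl get nodes -o name); do
    status=$(kubectl get "$node" \
      -o jsonpath='{.status.conditions[?(@.type=="KernelDeadlock")].status}')
    if [ "$status" = "True" ]; then
      kubectl taint "$node" npd/kernel-deadlock=true:NoSchedule --overwrite
    fi
  done
  sleep 30
done
```

Pods without a matching toleration will then no longer be scheduled onto the tainted node.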

Fault-injection drill

You can exercise node-problem-detector by injecting a message into one of the logs it is watching. For example, suppose NPD is running the KernelMonitor. On a cluster node, run kubectl get events -w, then run sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg".

# sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg"

You should see a KernelOops event appear in the event stream.
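Instead of watching interactively, events can also be filtered by reason:

```shell
# Show only the injected kernel problem
# (Event objects support the "reason" field selector)
kubectl get events --field-selector reason=KernelOops
```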

Launching NPD with an addon Pod

If you are using a custom cluster-bootstrap solution and do not need to override the default configuration, you can use an addon Pod to automate the deployment further.
Create node-problem-detector.yaml and save the configuration in the addon Pod directory /etc/kubernetes/addons/node-problem-detector on a control-plane node.

NPD's anomaly-handling behavior

  • NPD only captures anomaly events and updates node conditions; by itself it does not affect node status or scheduling:
lastHeartbeatTime: "2021-11-06T15:44:46Z"
lastTransitionTime: "2021-11-06T15:29:43Z"
message: 'kernel: INFO: task docker:20744 blocked for more than 120 seconds.'
reason: DockerHung
status: "True"
type: KernelDeadlock
  • A custom controller is needed to watch the conditions NPD reports and taint the node, preventing Pods from being scheduled onto the faulty node
  • After the problem is fixed, restart the NPD Pod to clear the stale events
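A quick way to do that restart across all nodes, assuming NPD runs as a DaemonSet in kube-system (adjust the namespace to wherever you installed it):

```shell
# Restart every NPD Pod so conditions are re-evaluated from a clean state
kubectl -n kube-system rollout restart daemonset/node-problem-detector
```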

Troubleshooting common issues

SSH into a node on a private network
  • Create a Pod that runs sshd
  • Forward SSH requests to it through a load balancer
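A rough sketch of that jump-Pod approach (the image name is an assumption; any image that runs sshd on a known port works):

```shell
# Run an sshd Pod and expose it through a LoadBalancer Service
# (linuxserver/openssh-server listens on 2222 by default)
kubectl run ssh-jump --image=linuxserver/openssh-server --port=2222
kubectl expose pod ssh-jump --type=LoadBalancer --port=22 --target-port=2222
```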
Viewing logs

For services managed by systemd

journalctl -afu kubelet -S "2019-08-26 15:00:00"
-u unit: the systemd unit to read, e.g. kubelet
-f follow: tail the latest log entries
-a all: show all fields in full, including long or unprintable ones
-S since: start from a given time, e.g. -S "2019-08-26 15:00:00"

For standard container logs

kubectl logs -f -c <containername> <podname>
kubectl logs -f --all-containers <podname>
kubectl logs -f <podname> --previous

If the container's logs are dumped to a file by a shell, read them via exec:

kubectl exec -it xxx -- tail -f /path/to/log

Appendix: production configuration


[root@VM-2-29-centos node-problem-detector]# kubectl get ds -nti-inf node-problem-detector -oyaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    cpaas.io/creator: admin
    cpaas.io/updated-at: "2022-02-12T07:58:59Z"
    deprecated.daemonset.template.generation: "2"
    meta.helm.sh/release-name: node-problem-detector
    meta.helm.sh/release-namespace: ti-inf
  creationTimestamp: "2022-02-11T19:52:30Z"
  generation: 2
  labels:
    app.kubernetes.io/managed-by: Helm
  managedFields:
  - apiVersion: apps/v1
    manager: Go-http-client
    operation: Update
    time: "2022-02-11T19:52:30Z"
  - apiVersion: apps/v1
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-12T09:20:03Z"
  name: node-problem-detector
  namespace: ti-inf
  resourceVersion: "6040793"
  selfLink: /apis/apps/v1/namespaces/ti-inf/daemonsets/node-problem-detector
  uid: 07b214b9-9279-465b-a900-eae233928a3e
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: node-problem-detector
  template:
    metadata:
      annotations:
        cpaas.io/creator: admin
      creationTimestamp: null
      labels:
        app: node-problem-detector
    spec:
      containers:
      - command:
        - /node-problem-detector
        - --logtostderr
        - --v=3
        - --config.zombie-process-monitor=/config/zombie-process-monitor.json  # zombie-process monitor
        - --config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json  # system-log monitors
        - --config.custom-plugin-monitor=/config/custom-plugin-fd-pressure.json,/config/network-problem-monitor.json,/config/costom-plugin-thread-pressure.json,/config/systemd-monitor-counter.json,/config/custom-plugin-dockerd-monitor.json,/config/custom-plugin-kubelet-monitor.json,/config/custom-plugin-containerd-monitor.json,/config/custom-plugin-pid-pressure.json  # custom plugin monitor config files
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: xxxxxxxxx/ti-inf/node-problem-detector:v3.6
        imagePullPolicy: Always
        name: node-problem-detector
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /run
          mountPropagation: HostToContainer
          name: rundir
        - mountPath: /var/log
          name: log
          readOnly: true
        - mountPath: /dev/kmsg
          name: kmsg
          readOnly: true
        - mountPath: /etc/localtime
          name: localtime
          readOnly: true
        - mountPath: /var/run/dbus
          name: systemddbus
      dnsPolicy: ClusterFirst
      hostPID: true
      imagePullSecrets:
      - name: qcloudregistrykey
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: node-problem-detector
      serviceAccountName: node-problem-detector
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /run
          type: ""
        name: rundir
      - hostPath:
          path: /var/log/
          type: ""
        name: log
      - hostPath:
          path: /dev/kmsg
          type: ""
        name: kmsg
      - hostPath:
          path: /etc/localtime
          type: ""
        name: localtime
      - hostPath:
          path: /var/run/dbus
          type: ""
        name: systemddbus
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

The zombie-process monitor is one example of such a custom plugin: it periodically counts zombie processes on the node and reports a condition when the count is abnormal.
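A sketch of how such a monitor could be wired up. The field names follow NPD's custom-plugin-monitor format (exit 0 = OK, exit 1 = problem), but the condition names, threshold, and script path below are illustrative, not the production values:

```shell
# Illustrative /config/zombie-process-monitor.json for NPD's custom plugin
# monitor; condition/reason names and the threshold are assumptions.
cat > /config/zombie-process-monitor.json <<'EOF'
{
  "plugin": "custom",
  "pluginConfig": {
    "invoke_interval": "60s",
    "timeout": "10s",
    "max_output_length": 80,
    "concurrency": 1
  },
  "source": "zombie-process-custom-plugin-monitor",
  "conditions": [
    {
      "type": "ZombieProcessPressure",
      "reason": "NoZombieProcessPressure",
      "message": "zombie process count is under threshold"
    }
  ],
  "rules": [
    {
      "type": "permanent",
      "condition": "ZombieProcessPressure",
      "reason": "TooManyZombieProcesses",
      "path": "/config/plugin/check_zombie.sh"
    }
  ]
}
EOF

# Plugin script invoked by NPD: exit 0 = OK, exit 1 = problem detected
cat > /config/plugin/check_zombie.sh <<'EOF'
#!/bin/bash
count=$(ps -eo stat= | grep -c '^Z')
if [ "$count" -gt 10 ]; then
  echo "$count zombie processes found"
  exit 1
fi
echo "zombie process count $count is under threshold"
exit 0
EOF
chmod +x /config/plugin/check_zombie.sh
```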
