An introduction to K8s event data collection and event monitoring.
K8s Event Monitoring and Alerting
This setup has been tested on Huawei Cloud CCE clusters, self-built K8s clusters, and Alibaba Cloud (Aliyun) clusters without issues, and is already running in our production environment.
kube-eventer project: https://github.com/AliyunContainerService/kube-eventer
node-problem-detector project: https://github.com/kubernetes/node-problem-detector
Out of the box, the open-source node-problem-detector does not emit a PodOOMKilled marker; the ConfigMap below has been modified so that it does (see the PodOOMKilling rule in kernel-monitor.json).
1. K8s Events
K8s is designed as a state machine: events are generated whenever an object transitions between states. A normal state transition produces a Normal event, while a transition from a normal state to an abnormal one produces a Warning event.
When running a K8s cluster we care about stability at three levels: the business workloads, the containers, and the cluster itself, and the most basic dependency is that the K8s nodes are stable. Node problems that can affect pod operation include:
Infrastructure: NTP service unavailable, network failures, and so on.
Hardware: CPU, memory, or disk failures; running on IaaS greatly reduces the probability of this class of problem.
OS: kernel deadlocks, file system corruption, and so on.
Container runtime: for example, the docker engine hanging.
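Warning events are the signal worth alerting on; Normal events mostly record routine state transitions. For a quick look at what a cluster is emitting right now, plain kubectl is enough:
kubectl get events -A --field-selector type=Warning --sort-by=.lastTimestamp
kubectl get events -A --field-selector type=Warning -w
The first command lists recent Warning events across all namespaces, the second streams new ones as they arrive; the components below exist to enrich and forward exactly these events.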
2. Event Architecture Diagram
(Diagram not included here: node-problem-detector turns node and kernel problems into K8s events and node conditions, and kube-eventer forwards cluster events to external sinks such as DingTalk or a webhook.)
3. Deploying the Open-Source Components kube-eventer & node-problem-detector
3.1 Create the node-problem-detector ServiceAccount
Create the ServiceAccount, ClusterRole, and ClusterRoleBinding for node-problem-detector:
cat event-account-service.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: node-problem-detector
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: node-problem-detector
rules:
- apiGroups:
- ""
resources:
- nodes
- pods
verbs:
- get
- list
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
- apiGroups:
- ""
- events.k8s.io
resources:
- events
verbs:
- create
- patch
- update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: npd-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: node-problem-detector
subjects:
- kind: ServiceAccount
name: node-problem-detector
namespace: kube-system
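Apply the manifest and confirm the RBAC objects were created (the file name matches the cat command above):
kubectl apply -f event-account-service.yaml
kubectl -n kube-system get serviceaccount node-problem-detector
kubectl get clusterrole,clusterrolebinding | grep -E 'node-problem-detector|npd-binding'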
3.2 Create the node-problem-detector ConfigMap
cat event_configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: ack-node-problem-detector-config
namespace: kube-system
data:
check_docker_offline.sh: |
#!/bin/bash
## release 20220119
## 1. dockerd only.
## 2. containerd only, without dockerd.
## 3. containerd wraped by dockerd.
OK=0
NONOK=1
UNKNOWN=2
# check docker offline
# check dockerd containerd service exist
systemctl list-units --type=service -a | grep -E -q 'docker|containerd'
if [ $? -ne 0 ]; then
echo "node not install docker or containerd"
exit ${UNKNOWN}
fi
# 1. docker runtime. docker.service.
# check docker.service loaded
systemctl status docker | grep -q 'Loaded: loaded'
if [ $? -eq 0 ]; then
echo "node have loaded docker.service"
# if no containerd, docker.service must active.
if [[ `systemctl is-active docker`x == activex ]]; then
# check docker.sock
curl --connect-timeout 20 -m 20 --unix-socket /var/hostrun/docker.sock http://x/containers/json >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "docker ps check error"
exit ${NONOK}
fi
else
echo "node docker.service is loaded, but inactive."
exit ${NONOK}
fi
else
echo "node have no docker service loaded, maybe containerd."
# 2. containerd runtime.
# check containerd status
systemctl list-units --type=service -a | grep -q containerd
if [ $? -eq 0 ]; then
echo "node have containerd service"
CONTAINERD_STATUS=`systemctl is-active containerd`
echo $CONTAINERD_STATUS
if [[ "$CONTAINERD_STATUS"x != "activex" ]]; then
echo "containerd ps check error"
exit ${NONOK}
fi
fi
fi
exit ${OK}
check_inodes.sh: |
#!/bin/bash
# check inode utilization on block device of mounting point /
OK=0
NONOK=1
UNKNOWN=2
iuse=$(df -i | grep "/$" | grep -e [0-9]*% -o | tr -d %)
if [[ $iuse -gt 80 ]]; then
echo "current inode usage is over 80% on node"
exit $NONOK
fi
echo "node has no inode pressure"
exit $OK
check_ntp.sh: |
#!/bin/bash
# NOTE: THIS NTP SERVICE CHECK SCRIPT ASSUME THAT NTP SERVICE IS RUNNING UNDER SYSTEMD.
# THIS IS JUST AN EXAMPLE. YOU CAN WRITE YOUR OWN NODE PROBLEM PLUGIN ON DEMAND.
OK=0
NONOK=1
UNKNOWN=2
# check dockerd containerd service exist
systemctl list-units --type=service -a | grep -E -q 'ntpd|chronyd'
if [ $? -ne 0 ]; then
echo "node not install ntpd or chronyd"
exit ${UNKNOWN}
fi
ntpStatus=1
systemctl status ntpd.service | grep 'Active:' | grep -q running
if [ $? -ne 0 ]; then
ntpStatus=0
fi
chronydStatus=1
systemctl status chronyd.service | grep 'Active:' | grep -q running
if [ $? -ne 0 ]; then
chronydStatus=0
fi
if [ $ntpStatus -eq 0 ] && [ $chronydStatus -eq 0 ]; then
echo "NTP service is not running"
exit $NONOK
fi
echo "NTP service is running"
exit $OK
check_pid_pressure.sh: |
#!/bin/sh
OK=0
NONOK=1
UNKNOWN=2
pidMax=$(cat /host/proc/sys/kernel/pid_max)
threshold=85
availablePid=$(($pidMax * $threshold / 100))
activePid=$(ls /host/proc/ |grep -e "[0-9]" |wc -l)
if [ $activePid -gt $availablePid ]; then
echo "Total running PIDs: $activePid, greater than $availablePid ($threshold% of pidMax $pidMax)"
exit $NONOK
fi
echo "Has sufficient PID available"
exit $OK
check_systemd.sh: |
#!/bin/bash
OK=0
NONOK=1
UNKNOWN=2
systemctl --version
if [ $? -ne 0 ]; then
exit $UNKNOWN
fi
systemctl get-default
if [ $? -ne 0 ]; then
exit $NONOK
fi
exit $OK
docker-monitor.json: |
{
"plugin": "journald",
"pluginConfig": {
"source": "dockerd"
},
"logPath": "/var/log/journal",
"lookback": "5m",
"bufferSize": 10,
"source": "docker-monitor",
"conditions": [],
"rules": [
{
"type": "temporary",
"reason": "CorruptDockerImage",
"pattern": "Error trying v2 registry: failed to register layer: rename /var/lib/docker/image/(.+) /var/lib/docker/image/(.+): directory not empty.*"
}
]
}
docker-offline-monitor.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "30s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3
},
"source": "docker-offline-custom-plugin-monitor",
"conditions": [
{
"type": "DockerOffline",
"reason": "DockerDaemonNotOffline",
"message": "docker daemon is ok"
},
{
"type": "RuntimeOffline",
"reason": "RuntimeDaemonNotOffline",
"message": "container runtime daemon is ok"
}
],
"rules": [
{
"type": "permanent",
"condition": "RuntimeOffline",
"reason": "RuntimeDaemonOffline",
"path": "/config/plugin/check_docker_offline.sh",
"timeout": "25s"
},
{
"type": "temporary",
"reason": "RuntimeDaemonOffline",
"path": "/config/plugin/check_docker_offline.sh",
"timeout": "25s"
}
]
}
inodes-problem-monitor.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "120s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3
},
"source": "inodes-custom-plugin-monitor",
"conditions": [
{
"type": "InodesPressure",
"reason": "NodeHasNoInodesPressure",
"message": "node has no inodes pressure"
}
],
"rules": [
{
"type": "permanent",
"condition": "InodesPressure",
"reason": "NodeHasInodesPressure",
"message": "inodes usage is over 80% on /dev/sda",
"path": "/config/plugin/check_inodes.sh"
}
]
}
instance_expired_checker.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "60s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3,
"enable_message_change_based_condition_update": false
},
"source": "instance_termination_custom_checker",
"conditions": [
{
"type": "InstanceExpired",
"reason": "InstanceNotToBeTerminated",
"message": "instance is not going to be terminated"
}
],
"rules": [
{
"type": "temporary",
"reason": "InstanceToBeTerminated",
"path": "./config/plugin/instance_expired_checker.sh",
"timeout": "30s"
},
{
"type": "permanent",
"condition": "InstanceExpired",
"reason": "InstanceToBeTerminated",
"path": "./config/plugin/instance_expired_checker.sh",
"timeout": "30s"
}
]
}
instance_expired_checker.sh: |
#!/bin/bash
OK=0
NONOK=1
UNKNOWN=2
check_url='http://100.100.100.200/latest/meta-data/instance/spot/terminationtime'
for ((i=1; i<=5; i ++))
do
resp=$(curl --max-time 5 -s $check_url)
if [ $? != 0 ]; then
sleep 1
else
echo $resp
date --date $resp +"%s"
if [ $? != 0 ]; then
exit $OK
else
echo "instance is going to be terminated at $resp"
exit $NONOK
fi
fi
done
echo "curl $check_url exe fail after try 5 times"
exit $OK
kernel-monitor.json: |
{
"plugin": "kmsg",
"logPath": "/var/log/journal",
"lookback": "5m",
"bufferSize": 10,
"source": "kernel-monitor",
"conditions": [
{
"type": "KernelDeadlock",
"reason": "KernelHasNoDeadlock",
"message": "kernel has no deadlock"
},
{
"type": "ReadonlyFilesystem",
"reason": "FilesystemIsReadOnly",
"message": "Filesystem is read-only"
}
],
"rules": [
{
"type": "temporary",
"reason": "PodOOMKilling",
"pattern": "(Task in /kubepods/(.+) killed as a result of limit of .*)|(oom-kill:.*,oom_memcg=/kubepods/(.+),task_memcg=.*)"
},
{
"type": "temporary",
"reason": "TaskHung",
"pattern": "task \\S+:\\w+ blocked for more than \\w+ seconds\\."
},
{
"type": "temporary",
"reason": "UnregisterNetDevice",
"pattern": "unregister_netdevice: waiting for \\w+ to become free. Usage count = \\d+"
},
{
"type": "temporary",
"reason": "KernelOops",
"pattern": "BUG: unable to handle kernel NULL pointer dereference at .*"
},
{
"type": "temporary",
"reason": "KernelOops",
"pattern": "divide error: 0000 \\[#\\d+\\] SMP"
},
{
"type": "permanent",
"condition": "KernelDeadlock",
"reason": "AUFSUmountHung",
"pattern": "task umount\\.aufs:\\w+ blocked for more than \\w+ seconds\\."
},
{
"type": "permanent",
"condition": "KernelDeadlock",
"reason": "DockerHung",
"pattern": "task docker:\\w+ blocked for more than \\w+ seconds\\."
},
{
"type": "permanent",
"condition": "ReadonlyFilesystem",
"reason": "FilesystemIsReadOnly",
"pattern": "Remounting filesystem read-only"
},
{
"type": "temporary",
"reason": "TCPMemOverFlow",
"pattern": "TCP: out of memory -- consider tuning tcp_mem"
},
{
"type": "temporary",
"reason": "TCPSkOverFlow",
"pattern": "TCP: too many orphaned sockets"
},
{
"type": "temporary",
"reason": "NFOverFlow",
"pattern": "nf_conntrack: table full, dropping packet"
},
{
"type": "temporary",
"reason": "ARPOverFlow",
"pattern": "\\w+: neighbor table overflow!"
}
]
}
network-problem-monitor.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "60s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3
},
"source": "network-problem-custom-plugin-monitor",
"conditions": [
{
"type": "NetworkProblem",
"reason": "NodeHasNoNetworkProblem",
"message": "node has no network problem"
}
],
"rules": [
{
"type": "permanent",
"condition": "NetworkProblem",
"reason": "NodeHasNetworkProblem",
"path": "/config/plugin/network_problem.sh"
}
]
}
network_problem.sh: |
#!/bin/bash
OK=0
NONOK=1
UNKNOWN=2
# Check if network is working properly
# This is just an example stub; replace the placeholder variable below with a real check that fits your environment
NETWORK_CHECK_FAILED=0
if [ $NETWORK_CHECK_FAILED -ne 0 ]; then
echo "Network problem detected"
exit $NONOK
fi
echo "Network is working properly"
exit $OK
ntp-problem-monitor.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "60s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3
},
"source": "ntp-problem-custom-plugin-monitor",
"conditions": [
{
"type": "NTPProblem",
"reason": "NodeHasNoNTPProblem",
"message": "node has no NTP problem"
}
],
"rules": [
{
"type": "permanent",
"condition": "NTPProblem",
"reason": "NodeHasNTPProblem",
"path": "/config/plugin/check_ntp.sh"
}
]
}
pid-pressure-problem-monitor.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "60s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3
},
"source": "pid-pressure-problem-custom-plugin-monitor",
"conditions": [
{
"type": "PidPressure",
"reason": "NodeHasSufficientPID",
"message": "node has sufficient PID available"
}
],
"rules": [
{
"type": "permanent",
"condition": "PidPressure",
"reason": "RunningOutOfPID",
"path": "/config/plugin/check_pid_pressure.sh"
}
]
}
systemd-check-monitor.json: |
{
"plugin": "custom",
"pluginConfig": {
"invoke_interval": "60s",
"timeout": "30s",
"max_output_length": 80,
"concurrency": 3
},
"source": "systemd-check-custom-plugin-monitor",
"conditions": [
{
"type": "SystemdCheck",
"reason": "NodeHasNoSystemdError",
"message": "node has no systemd error"
}
],
"rules": [
{
"type": "permanent",
"condition": "SystemdCheck",
"reason": "NodeHasSystemdError",
"path": "/config/plugin/check_systemd.sh"
}
]
}
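Apply the ConfigMap and check that every monitor definition and plugin script shows up as a data key; a quick sanity check:
kubectl apply -f event_configmap.yaml
kubectl -n kube-system describe configmap ack-node-problem-detector-config | grep -E '\.(sh|json):'
The second command should list all of the .json monitor configurations and .sh plugin scripts defined above.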
3.3 Create the node-problem-detector-daemonset DaemonSet (this is the component that emits PodOOMKilling events)
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-problem-detector-daemonset
namespace: kube-system
spec:
revisionHistoryLimit: 10
selector:
matchLabels:
app: node-problem-detector
template:
metadata:
labels:
app: node-problem-detector
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: type
operator: NotIn
values:
- virtual-kubelet
containers:
- command:
- /node-problem-detector
- --logtostderr
- --v=3
- --system-log-monitors=/config/kernel-monitor.json,/config/docker-monitor.json
- --custom-plugin-monitors=/config/ntp-problem-monitor.json
- --custom-plugin-monitors=/config/network-problem-monitor.json
- --custom-plugin-monitors=/config/inodes-problem-monitor.json
- --custom-plugin-monitors=/config/pid-pressure-problem-monitor.json
- --custom-plugin-monitors=/config/docker-offline-monitor.json
- --custom-plugin-monitors=/config/instance_expired_checker.json
- --custom-plugin-monitors=/config/systemd-check-monitor.json
env:
- name: NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: SYSTEMD_OFFLINE
value: "0"
image: artifactory.momenta.works/docker-hdmap-sre/momenta-npd:v1 # replace with your own node-problem-detector image
imagePullPolicy: IfNotPresent
name: ack-node-problem-detector
resources:
limits:
cpu: "1"
memory: 1Gi
requests:
cpu: 100m
memory: 200Mi
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/log
name: log
readOnly: true
- mountPath: /dev/kmsg
name: kmsg
readOnly: true
- mountPath: /etc/localtime
name: localtime
readOnly: true
- mountPath: /config
name: config
readOnly: true
- mountPath: /host/proc
name: proc
readOnly: true
- mountPath: /var/run/dbus
name: dbus
readOnly: true
- mountPath: /run/systemd
name: systemd
readOnly: true
- mountPath: /etc/systemd/system
name: system
readOnly: true
- mountPath: /sys/fs/cgroup
name: cgroup
readOnly: true
- mountPath: /var/run
name: dockersock
readOnly: true
dnsPolicy: ClusterFirst
hostNetwork: true
hostPID: true
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
imagePullSecrets:
- name: default-secret
serviceAccount: node-problem-detector
serviceAccountName: node-problem-detector
terminationGracePeriodSeconds: 30
tolerations:
- operator: Exists
volumes:
- hostPath:
path: /var/log/
type: ""
name: log
- hostPath:
path: /dev/kmsg
type: ""
name: kmsg
- hostPath:
path: /etc/localtime
type: ""
name: localtime
- hostPath:
path: /proc
type: ""
name: proc
- hostPath:
path: /var/run/dbus
type: ""
name: dbus
- hostPath:
path: /run/systemd
type: ""
name: systemd
- hostPath:
path: /etc/systemd/system
type: ""
name: system
- hostPath:
path: /sys/fs/cgroup
type: ""
name: cgroup
- hostPath:
path: /var/run
type: DirectoryOrCreate
name: dockersock
- configMap:
defaultMode: 493
name: ack-node-problem-detector-config
items:
- key: kernel-monitor.json
path: kernel-monitor.json
- key: docker-monitor.json
path: docker-monitor.json
- key: ntp-problem-monitor.json
path: ntp-problem-monitor.json
- key: check_ntp.sh
path: plugin/check_ntp.sh
- key: network-problem-monitor.json
path: network-problem-monitor.json
- key: network_problem.sh
path: plugin/network_problem.sh
- key: inodes-problem-monitor.json
path: inodes-problem-monitor.json
- key: check_inodes.sh
path: plugin/check_inodes.sh
- key: pid-pressure-problem-monitor.json
path: pid-pressure-problem-monitor.json
- key: check_pid_pressure.sh
path: plugin/check_pid_pressure.sh
- key: docker-offline-monitor.json
path: docker-offline-monitor.json
- key: check_docker_offline.sh
path: plugin/check_docker_offline.sh
- key: instance_expired_checker.json
path: instance_expired_checker.json
- key: instance_expired_checker.sh
path: plugin/instance_expired_checker.sh
- key: systemd-check-monitor.json
path: systemd-check-monitor.json
- key: check_systemd.sh
path: plugin/check_systemd.sh
name: config
updateStrategy:
rollingUpdate:
#maxSurge: 0
maxUnavailable: 50
type: RollingUpdate
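Once the DaemonSet is applied, every schedulable node should run one pod, and node-problem-detector starts patching extra conditions (NTPProblem, InodesPressure, PidPressure, RuntimeOffline, and so on) onto the Node objects. A minimal check, with the manifest file name and <node-name> as placeholders:
kubectl apply -f node-problem-detector-daemonset.yaml
kubectl -n kube-system get daemonset node-problem-detector-daemonset
kubectl -n kube-system get pods -l app=node-problem-detector -o wide
kubectl get node <node-name> -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\n"}{end}'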
3.4 Deploy the kube-eventer Component (following the upstream documentation)
In our own deployment we also send events to a MySQL sink and to a webhook for a dedicated in-house service; the MySQL sink needs no extra configuration beyond its connection string.
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
name: kube-eventer
name: kube-eventer
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: kube-eventer
template:
metadata:
labels:
app: kube-eventer
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
dnsPolicy: ClusterFirstWithHostNet
serviceAccount: kube-eventer
containers:
- image: registry.aliyuncs.com/acs/kube-eventer:v1.2.7-ca03be0-aliyun
name: kube-eventer
command:
- "/kube-eventer"
- "--source=kubernetes:https://kubernetes.default"
## e.g. DingTalk sink demo
- --sink=dingtalk:[your_webhook_url]&label=[your_cluster_id]&level=[Normal or Warning(default)]
env:
# If TZ is assigned, set the TZ value as the time zone
- name: TZ
value: "Asia/Shanghai"
volumeMounts:
- name: localtime
mountPath: /etc/localtime
readOnly: true
- name: zoneinfo
mountPath: /usr/share/zoneinfo
readOnly: true
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 500m
memory: 250Mi
volumes:
- name: localtime
hostPath:
path: /etc/localtime
- name: zoneinfo
hostPath:
path: /usr/share/zoneinfo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: kube-eventer
rules:
- apiGroups:
- ""
resources:
- configmaps
- events
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kube-eventer
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kube-eventer
subjects:
- kind: ServiceAccount
name: kube-eventer
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kube-eventer
namespace: kube-system
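After the Deployment is up, the quickest way to confirm kube-eventer is watching events and that the configured sink (DingTalk in the example above) is reachable is to tail its logs:
kubectl -n kube-system get deployment kube-eventer
kubectl -n kube-system logs deploy/kube-eventer --tail=50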
To send events to a webhook you also need to add a ConfigMap that defines the custom request body (the matching --sink=webhook flag is sketched after the ConfigMap):
apiVersion: v1
kind: ConfigMap
metadata:
name: custom-body
namespace: kube-system
data:
content: '{
"msg_type": "interactive",
"card": {
"config": {
"wide_screen_mode": true,
"enable_forward": true
},
"header": {
"title": {
"tag": "plain_text",
"content": "test"
},
"template": "Red"
},
"elements": [
{
"tag": "div",
"text": {
"tag": "lark_md",
"content": "**EventId:** {{.ObjectMeta.UID }}\n**EventType:** {{ .Type }}\n **Object**: {{ .InvolvedObject.Kind }}/{{ .InvolvedObject.Name }}\n**Namespace**: {{.InvolvedObject.Namespace}}\n**EventKind:** {{ .InvolvedObject.Kind }}\n**EventReason:** {{ .Reason }}\n**EventTime:** {{ .LastTimestamp }}\n**EventMessage:** {{ .Message }}"
}
}
]
}
}'
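With the custom-body ConfigMap in place, the webhook sink is enabled by adding a flag like the one below to the kube-eventer container command. This is only a sketch based on the upstream kube-eventer webhook sink documentation; the endpoint URL is a placeholder and the parameter names (level, method, custom_body_configmap, custom_body_configmap_namespace) should be verified against the release you deploy:
- --sink=webhook:https://<your-webhook-endpoint>?level=Warning&method=POST&custom_body_configmap=custom-body&custom_body_configmap_namespace=kube-system
For the MySQL sink mentioned earlier, the flag takes a Go-style MySQL DSN, again per the upstream docs (user, password, host, and database are placeholders):
- --sink=mysql:<user>:<password>@tcp(<mysql-host>:3306)/<database>?charset=utf8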