Installing Prometheus + Grafana on Kubernetes (Part 2: prometheus-operator)

2018_like菜 · Published 2022-02-22 19:19:45

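Background

Installing prometheus-operator with Helm should be a one-liner.

Unfortunately, one component of the bitnami/prometheus-operator chart listens on port 8080, which an application in my cluster already uses, and the chart does not fall back to another port on its own. The stable chart is too old.

Prepare the Kubernetes environment in advance and download the kube-prometheus package. I use version 0.8.0 with Kubernetes v1.20.x. Each release supports specific Kubernetes versions, so download the one that matches your cluster: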

https://github.com/prometheus-operator/kube-prometheus/releases

kube-prometheus stack	Kubernetes 1.18	Kubernetes 1.19	Kubernetes 1.20	Kubernetes 1.21	Kubernetes 1.22
release-0.6	✗	✔	✗	✗	✗
release-0.7	✗	✔	✔	✗	✗
release-0.8	✗	✗	✔	✔	✗
release-0.9	✗	✗	✗	✔	✔
HEAD	✗	✗	✗	✔	✔
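
As a quick way to read the table, here is a minimal sketch that hard-codes the ✔ cells above into a lookup. The function name compatible_releases is hypothetical and not part of kube-prometheus; the mapping only covers the versions listed in the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kube-prometheus releases that the table
# above marks as compatible with a given Kubernetes minor version.
compatible_releases() {
  case "$1" in
    1.19) echo "release-0.6 release-0.7" ;;
    1.20) echo "release-0.7 release-0.8" ;;
    1.21) echo "release-0.8 release-0.9 HEAD" ;;
    1.22) echo "release-0.9 HEAD" ;;
    *)    echo "none listed" ;;
  esac
}

# Example: the cluster in this post runs v1.20.x
compatible_releases 1.20
```

For v1.20.x this lists release-0.7 and release-0.8, which is why 0.8.0 is used here.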

See the Grafana dashboard template download page for templates.

Installation

tar zxvf kube-prometheus-0.8.0.tar.gz
cd kube-prometheus-0.8.0/manifests/

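If you can reach quay.io directly (for example through a proxy), you do not need to switch to the domestic mirrors below.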

sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' setup/prometheus-operator-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' prometheus-prometheus.yaml 
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' alertmanager-alertmanager.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' kube-state-metrics-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' node-exporter-daemonset.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' prometheus-adapter-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' blackbox-exporter-deployment.yaml
sed -i 's#k8s.gcr.io/kube-state-metrics/kube-state-metrics#bitnami/kube-state-metrics#g' kube-state-metrics-deployment.yaml
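
The quay.io replacements above repeat the same sed expression over a fixed list of files, so they can be folded into a loop. The sketch below assumes GNU sed (for -i) and wraps the loop in a hypothetical function rewrite_mirror that takes the manifests directory as its argument; files that do not exist are skipped with a message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite quay.io image references to the USTC mirror
# in the same manifests as the individual sed commands above.
# $1 = path to the manifests directory (assumes GNU sed for `-i`).
rewrite_mirror() {
  local dir="$1" f
  for f in setup/prometheus-operator-deployment.yaml \
           prometheus-prometheus.yaml \
           alertmanager-alertmanager.yaml \
           kube-state-metrics-deployment.yaml \
           node-exporter-daemonset.yaml \
           prometheus-adapter-deployment.yaml \
           blackbox-exporter-deployment.yaml; do
    if [ -f "$dir/$f" ]; then
      sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' "$dir/$f"
    else
      echo "skip: $dir/$f not found" >&2
    fi
  done
}

# Usage from inside kube-prometheus-0.8.0/manifests/:
#   rewrite_mirror .
```

Escaping the dot in quay\.io keeps the pattern from matching unrelated strings, and running the function twice is harmless because the mirror host name no longer contains quay.io.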

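Change the Services for Prometheus (NodePort 30090), Alertmanager (NodePort 30093), and Grafana (NodePort 30095) to type NodePort. For Prometheus, edit prometheus-service.yaml: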

apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.26.0
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30090  # added
  selector:
    app: prometheus
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus

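Do the same for alertmanager-service.yaml and grafana-service.yaml.

Adjust the replica count. If you need two or more replicas, leave it unchanged.

alertmanager-alertmanager.yaml prometheus-adapter-deployment.yaml prometheus-prometheus.yaml

Where replicas is 2 or more, change it to 1:

  replicas: 1 # changed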
  resources:

Configure the alert email. For DingTalk or WeCom alerts, see these links:

https://blog.51cto.com/shoufu/2537258

https://blog.51cto.com/u_11970509/2904000

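The configuration lives in alertmanager-secret.yaml, under stringData, in the alertmanager.yaml key. It can also be changed after installation.

Editing it before installation is untested.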

stringData:
alertmanager.yaml: |-

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:465'
  smtp_from: 'xxx@163.com'
  smtp_auth_username: 'xxx@163.com'
  smtp_auth_password: 'xxxxZNKXFOWOIWQF'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 60s
  repeat_interval: 24h # interval before an alert is sent again
  receiver: 'mail'

receivers:
- name: 'mail'
  email_configs:
  - to: 'xxx@qq.com'
    send_resolved: true

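Changing it after installation is tested: save the configuration above (from global: onward) as alertmanager.yaml, then recreate the secret:

# Delete the existing secret first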
$ kubectl delete secret alertmanager-main -n monitoring
secret "alertmanager-main" deleted
$ kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
secret "alertmanager-main" created

Install prometheus-operator

kubectl apply -f setup/

Check the status and wait until the operator pod is Running before continuing:

kubectl get pods -n monitoring

kubectl apply -f .

Check the status again and wait until all pods are Running before continuing:

kubectl get pods -n monitoring
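
Rather than scanning the pod list by eye, the output can be filtered for pods that are not Running yet. This is a minimal sketch: not_running is a hypothetical helper that reads the default `kubectl get pods` table (NAME, READY, STATUS, RESTARTS, AGE) from stdin and assumes the STATUS value is in the third column.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `kubectl get pods` output on stdin and print the
# names of pods whose STATUS column is anything other than "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Usage against a live cluster (assumes kubectl is configured):
#   kubectl get pods -n monitoring | not_running
```

An empty result means every pod in the namespace reports Running and it is safe to move on.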

Get the exposed NodePort ports:

[root manifests]# kubectl get svc -n monitoring  | grep NodePort
alertmanager-main       NodePort    10.101.19.108   <none>        9093:30093/TCP     10m
grafana                 NodePort    10.101.15.228   <none>        3000:30095/TCP     10m
prometheus-k8s          NodePort    10.97.1.58      <none>        9090:30090/TCP     10m

Uninstall command (run from the kube-prometheus-0.8.0 directory)

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

Optional: next, install the custom metrics API server with Helm. I think this step can be skipped, because the manifests already include prometheus-adapter.

helm repo add stable http://mirror.azure.cn/kubernetes/charts
helm repo list
helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus-k8s.monitoring,prometheus.port=9090

Check the custom metrics API:

[root ~]# kubectl get apiservices -n monitoring | grep metrics
v1beta1.custom.metrics.k8s.io          kube-system/prometheus-adapter   True        11h
v1beta1.metrics.k8s.io                 monitoring/prometheus-adapter    True        15h

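Grafana dashboards and testing

Alerting configuration: ip:30093/#/status

Prometheus: ip:30090

Grafana dashboards: ip:30095

Import dashboard 13105 for an overall view and 8919 for the familiar host dashboard; the project also ships its own dashboards.

Test the alert email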

#!/usr/bin/env bash

alerts_message='[
  {
    "labels": {
       "alertname": "DiskRunningFull",
       "dev": "sda1",
       "instance": "example1",
       "msgtype": "testing"
     },
     "annotations": {
        "info": "The disk sda1 is running full",
        "summary": "please check the instance example1"
      }
  },
  {
    "labels": {
       "alertname": "DiskRunningFull",
       "dev": "sda2",
       "instance": "example1",
       "msgtype": "testing"
     },
     "annotations": {
        "info": "The disk sda2 is running full",
        "summary": "please check the instance example1",
        "runbook": "the following link http://test-url should be clickable"
      }
  }
]'

curl -XPOST -d"$alerts_message" http://127.0.0.1:30093/api/v1/alerts

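Change the rules so that only custom host rules fire (you can also keep the project's built-in rules)

1. Remove the default rules
List the default rules and delete them.

This step leaves several of the bundled dashboards without graphs. If you do not use the bundled dashboards and alerts, that is not a big problem.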

kubectl get PrometheusRule -n monitoring


kubectl delete PrometheusRule alertmanager-main-rules kube-prometheus-rules kube-state-metrics-rules kubernetes-monitoring-rules node-exporter-rules prometheus-k8s-prometheus-rules prometheus-operator-rules -n monitoring

A comprehensive collection of rules for reference:

https://awesome-prometheus-alerts.grep.to/rules

Run kubectl get PrometheusRule -n monitoring again to check what remains after the cleanup.

2. Define custom rules: vi bjgz.yaml (never mind the file name), paste the content below, then apply it

kubectl apply -f bjgz.yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: host-rules
  namespace: monitoring
spec:
  groups:
    - name: host-status-alerts
      rules:
      - alert: HostDown
        expr: up == 0
        for: 1m
        labels:
          status: critical
        annotations:
          summary: "{{$labels.instance}}: target is down"
          description: "{{$labels.instance}}: target has been unreachable for more than 1 minute"

      - alert: HostHighCpu
        expr: 100-(avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by(instance)* 100) > 60
        for: 1m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}} CPU usage is too high!"
          description: "{{$labels.instance}}: CPU usage is above 60% (current: {{$value}}%)"

      - alert: HostHighMemory
        # used = total - (free + buffers + cached)
        expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 80
        for: 1m
        labels:
          status: severe
        annotations:
          summary: "{{$labels.instance}} memory usage is too high!"
          description: "{{$labels.instance}}: memory usage is above 80% (current: {{$value}}%)"
      - alert: HostHighDiskIO
        # fraction of time the disks were busy, averaged per instance
        expr: avg(irate(node_disk_io_time_seconds_total[1m])) by(instance) * 100 > 60
        for: 1m
        labels:
          status: severe
        annotations:
          summary: "{{$labels.instance}} disk I/O utilization is too high!"
          description: "{{$labels.instance}}: disk I/O utilization is above 60% (current: {{$value}}%)"

      - alert: HostHighNetworkReceive
        expr: ((sum(rate (node_network_receive_bytes_total{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo'}[5m])) by (instance)) / 100) > 102400
        for: 1m
        labels:
          status: severe
        annotations:
          summary: "{{$labels.instance}} inbound network bandwidth is too high!"
          description: "{{$labels.instance}}: inbound traffic has stayed above roughly 10 MB/s for 1 minute. RX bytes/s divided by 100: {{$value}}"

      - alert: HostHighNetworkTransmit
        expr: ((sum(rate (node_network_transmit_bytes_total{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo'}[5m])) by (instance)) / 100) > 102400
        for: 1m
        labels:
          status: severe
        annotations:
          summary: "{{$labels.instance}} outbound network bandwidth is too high!"
          description: "{{$labels.instance}}: outbound traffic has stayed above roughly 10 MB/s for 1 minute. TX bytes/s divided by 100: {{$value}}"

      - alert: HostTcpEstablished
        expr: node_netstat_Tcp_CurrEstab > 1000
        for: 1m
        labels:
          status: severe
        annotations:
          summary: "{{$labels.instance}} too many TCP_ESTABLISHED connections!"
          description: "{{$labels.instance}}: TCP_ESTABLISHED is above 1000 (current: {{$value}})"

      - alert: HostDiskUsage
        expr: 100-(node_filesystem_free_bytes{fstype=~"ext4|xfs"}/node_filesystem_size_bytes {fstype=~"ext4|xfs"}*100) > 80
        for: 1m
        labels:
          status: severe
        annotations:
          summary: "{{$labels.mountpoint}} disk partition usage is too high!"
          description: "{{$labels.instance}}:{{$labels.mountpoint }} disk partition usage is above 80% (current: {{$value}}%)"

After a short wait the rules show up in the Prometheus UI.
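
The memory rule is meant to fire when used memory, that is total minus free, buffers, and cached, exceeds 80% of total. The same arithmetic can be checked by hand on a host. The sketch below is an illustration only: mem_used_pct is a hypothetical helper, and the sample numbers are made up. Because the result is a ratio, it does not matter whether the inputs are in kB (as in /proc/meminfo) or bytes (as in the node_memory_* metrics).

```shell
#!/usr/bin/env bash
# Hypothetical helper: used-memory percentage from the four values the
# memory rule relies on. Arguments: total free buffers cached (same unit).
mem_used_pct() {
  awk -v t="$1" -v f="$2" -v b="$3" -v c="$4" \
    'BEGIN { printf "%.1f\n", (t - (f + b + c)) / t * 100 }'
}

# Made-up sample: 16 GB total, 2 GB free, 1 GB buffers, 5 GB cached (in kB)
mem_used_pct 16000000 2000000 1000000 5000000   # prints 50.0
```

With these sample values the host is at 50% and the rule would stay silent; it only fires once the result stays above 80 for the configured duration.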

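Install MySQL monitoring (MySQL and Redis both run outside Kubernetes)

For the MySQL and Redis monitoring below, once the scrape resources are added, the matching configuration is generated automatically after a short wait and appears at http://ip:30090/config, i.e. in prometheus.yml.

Create the collection account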

CREATE USER 'mysqld-exporter' IDENTIFIED BY '123456' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, REPLICATION SLAVE, SELECT ON *.* TO 'mysqld-exporter';

FLUSH PRIVILEGES;

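Deploy mysqld-exporter: create mysqld-exporter.yaml

Some collectors, such as the --collect.perf_schema.* ones, may log errors and can be omitted. Leaving them in only produces extra log lines and should be harmless.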

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysqld-exporter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysqld-exporter
  template:
    metadata:
      labels:
        app: mysqld-exporter
    spec:
      containers:
      - name: mysqld-exporter
        image: prom/mysqld-exporter
        args:
        #- --collect.info_schema.tables
        #- --collect.info_schema.innodb_tablespaces
        #- --collect.info_schema.innodb_metrics
        #- --collect.global_status
        #- --collect.global_variables
        #- --collect.slave_status
        #- --collect.info_schema.processlist
        #- --collect.perf_schema.tablelocks
        #- --collect.perf_schema.eventsstatements
        #- --collect.perf_schema.eventsstatementssum
        #- --collect.perf_schema.eventswaits
        #- --collect.auto_increment.columns
        #- --collect.binlog_size
        #- --collect.perf_schema.tableiowaits
        #- --collect.perf_schema.indexiowaits
        #- --collect.info_schema.userstats
        #- --collect.info_schema.clientstats
        #- --collect.info_schema.tablestats
        #- --collect.info_schema.schemastats
        #- --collect.perf_schema.file_events
        #- --collect.perf_schema.file_instances
        #- --collect.perf_schema.replication_group_member_stats
        #- --collect.perf_schema.replication_applier_status_by_worker
        #- --collect.slave_hosts
        #- --collect.info_schema.innodb_cmp
        #- --collect.info_schema.innodb_cmpmem
        #- --collect.info_schema.query_response_time
        #- --collect.engine_tokudb_status
        #- --collect.engine_innodb_status
        ports:
        - containerPort: 9104
          protocol: TCP
        env:
        - name: DATA_SOURCE_NAME
          value: "user:password@(hostname:3306)/"
---
apiVersion: v1
kind: Service
metadata:
  name: mysqld-exporter
  namespace: monitoring
  labels:
    app: mysqld-exporter
spec:
  type: ClusterIP
  ports:
  - port: 9104
    protocol: TCP
    name: http-mysql
  selector:
    app: mysqld-exporter

Replace user:password@(hostname:3306) with the account you just created and the address of your MySQL server.

Deploy the scrape configuration: create mysqld-exportercaiji.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: mysqld-exporter
  name: mysqld-exporter
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: http-mysql
    relabelings:
    - sourceLabels:
      - __meta_kubernetes_service_name
      targetLabel: service_name
  jobLabel: mysqld-exporter
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      app: mysqld-exporter

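curl http://ip:9104/metrics works only from inside the cluster, because the Service does not expose a NodePort.

In Grafana, import dashboard 11329, 14934, 14969, or 11323; one of them should fit.

The alert rules can be appended to the custom rules above or written into a new YAML file. Here I append them to the custom rules.

This is a fairly small set of MySQL rules; search online to add more.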

      - alert: Mysql_Instance_Reboot
        expr: mysql_global_status_uptime < 180 
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Instance_Reboot detected"
          description: "{{$labels.instance}}: Mysql_Instance_Reboot in 3 minutes (uptime so far: {{ $value }} seconds)"
      - alert: Mysql_High_QPS
        expr: rate(mysql_global_status_questions[5m]) > 500 
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_High_QPS detected"
          description: "{{$labels.instance}}: Mysql operations are more than 500 per second ,(current value is: {{ $value }})"
      - alert: Mysql_Too_Many_Connections
        expr: rate(mysql_global_status_connections[5m]) > 100
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql Too Many Connections detected"
          description: "{{$labels.instance}}: Mysql Connections is more than 100 per second ,(current value is: {{ $value }})"	
      - alert: Mysql_High_Recv_Rate
        expr: round(rate(mysql_global_status_bytes_received[5m]) /1024*100)/100 > 102400
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_High_Recv_Rate detected"
          description: "{{$labels.instance}}: Mysql_Receive_Rate is more than 100 MB/s ,(current value in KB/s is: {{ $value }})"
      - alert: Mysql_High_Send_Rate
        expr: round(rate(mysql_global_status_bytes_sent[5m]) /1024*100)/100 > 102400
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_High_Send_Rate detected"
          description: "{{$labels.instance}}: Mysql data Send Rate is more than 100 MB/s ,(current value in KB/s is: {{ $value }})"
      - alert: Mysql_Too_Many_Slow_Query
        expr: rate(mysql_global_status_slow_queries[30m]) > 3
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Too_Many_Slow_Query detected"
          description: "{{$labels.instance}}: Mysql current Slow_Query Sql is more than 3 ,(current value is: {{ $value }})"
      - alert: Mysql_Deadlock
        expr: mysql_global_status_innodb_deadlocks > 0
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Deadlock detected"
          description: "{{$labels.instance}}: Mysql Deadlock was found ,(current value is: {{ $value }})"			
      - alert: Mysql_Too_Many_sleep_threads
        expr: mysql_global_status_threads_running / mysql_global_status_threads_connected * 100 < 3
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Too_Many_sleep_threads detected"
          description: "{{$labels.instance}}: only {{ $value }}% of Mysql threads are running, please clean up the sleeping threads"
      - alert: Mysql_innodb_Cache_insufficient
        expr: (mysql_global_status_innodb_page_size * on (instance) mysql_global_status_buffer_pool_pages{state="data"} +  on (instance) mysql_global_variables_innodb_log_buffer_size +  on (instance) mysql_global_variables_innodb_additional_mem_pool_size + on (instance)  mysql_global_status_innodb_mem_dictionary + on (instance)  mysql_global_variables_key_buffer_size + on (instance)  mysql_global_variables_query_cache_size + on (instance) mysql_global_status_innodb_mem_adaptive_hash )  / on (instance) mysql_global_variables_innodb_buffer_pool_size * 100 > 80
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_innodb_Cache_insufficient detected"
          description: "{{$labels.instance}}: Mysql innodb_Cache was used more than 80% ,(current value is: {{ $value }})"

Explanation of the MySQL rule expressions above (reference variants; some windows and thresholds differ from the rules above)

metric expr:
  # Instance uptime in seconds; alert if it restarted within the last three minutes
  - mysql_global_status_uptime < 180

  # Queries per second
  - rate(mysql_global_status_questions[5m]) > 500

  # New connections per second
  - rate(mysql_global_status_connections[5m]) > 200

  # MySQL receive rate in Mbps (bytes/s * 8 / 1024 / 1024)
  - rate(mysql_global_status_bytes_received[3m]) * 8 / 1024 / 1024 > 50

  # MySQL send rate in Mbps (bytes/s * 8 / 1024 / 1024)
  - rate(mysql_global_status_bytes_sent[3m]) * 8 / 1024 / 1024 > 100

  # Slow queries
  - rate(mysql_global_status_slow_queries[30m]) > 3

  # Deadlocks
  - rate(mysql_global_status_innodb_deadlocks[3m]) > 1

  # Fewer than 30% of connected threads are running
  - mysql_global_status_threads_running / mysql_global_status_threads_connected * 100 < 30

  # InnoDB cache usage exceeds 80% of the buffer pool size
  - (mysql_global_status_innodb_page_size * on (instance) mysql_global_status_buffer_pool_pages{state="data"} +  on (instance) mysql_global_variables_innodb_log_buffer_size +  on (instance) mysql_global_variables_innodb_additional_mem_pool_size + on (instance)  mysql_global_status_innodb_mem_dictionary + on (instance)  mysql_global_variables_key_buffer_size + on (instance)  mysql_global_variables_query_cache_size + on (instance) mysql_global_status_innodb_mem_adaptive_hash )  / on (instance) mysql_global_variables_innodb_buffer_pool_size * 100 > 80

# Read/write operation rate, for reference
sum(rate(mysql_global_status_commands_total{command=~"insert|update|delete",job=~".*mysql"}[5m])) without (command)

# Received traffic in KB/s
round(rate(mysql_global_status_bytes_received[5m]) /1024*100)/100 > 102400
# Sent traffic in KB/s
round(rate(mysql_global_status_bytes_sent[5m]) /1024*100)/100 > 102400
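
The throughput expressions convert a byte rate into KB/s or Mbps, and the factors are easy to get backwards. As a sanity check, here is a small sketch of the two conversions. The helper names are hypothetical, and it assumes the same 1024-based units the expressions use (1 KB = 1024 bytes, 1 Mbit = 1024 * 1024 bits).

```shell
#!/usr/bin/env bash
# Hypothetical helpers: convert a byte rate (bytes per second) into the
# units used by the throughput thresholds, assuming 1024-based units.
bytes_to_kb() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1024 }'
}
bytes_to_mbps() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b * 8 / 1024 / 1024 }'
}

bytes_to_kb 1048576       # 1 MiB/s    -> prints 1024.00
bytes_to_mbps 13107200    # 12.5 MiB/s -> prints 100.00
```

Read this way, a KB/s threshold of 102400 corresponds to 100 MB/s, while an Mbps threshold of 100 corresponds to 12.5 MB/s, so pick the unit deliberately when tuning the rules.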

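Install Redis monitoring

Deploy redis-exporter by creating redis-exporter.yaml. I looked at the Helm chart's YAML, but it has too many variables and I could not follow the documentation. A plain helm install works fine, but packaging it is a hassle (rant). As a beginner I prefer hard-coded values. The idea came from the MySQL setup above and this Kafka monitoring post: https://www.tqwba.com/x_d/jishu/359387.html

Create redis-exporter.yaml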

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-exporter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-exporter
  template:
    metadata:
      labels:
        app: redis-exporter
    spec:
      containers:
      - name: redis-exporter
        image: oliver006/redis_exporter:v1.3.4
        # replace ip and your-password with the Redis address and password
        args: ["-redis.addr=redis://ip:6379","-redis.password=your-password"]
        ports:
        - containerPort: 9121
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: redis-exporter
  namespace: monitoring
  labels:
    app: redis-exporter
spec:
  type: ClusterIP
  ports:
  - port: 9121
    protocol: TCP
    name: http-redis
  selector:
    app: redis-exporter

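Notice how much this YAML resembles mysqld-exporter.yaml? Right, I copied its format (a tool that generates Kubernetes YAML would be even better). When I wrote it by hand I accidentally introduced a formatting error.

Deploy the scrape configuration: create redis-exportercaiji.yaml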

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: redis-exporter
  name: redis-exporter
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: http-redis
    relabelings:
    - sourceLabels:
      - __meta_kubernetes_service_name
      targetLabel: service_name
  jobLabel: redis-exporter
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      app: redis-exporter

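curl http://ip:9121/metrics works only from inside the cluster, because the Service does not expose a NodePort.

Import dashboard 11835.

If redis_memory_max_bytes is 0, some panels show N/A. Other people work around this by editing the values in the dashboard.

I recommend setting maxmemory in the Redis configuration instead. The example below uses maxmemory 8096mb; 60% to 80% of host memory is a reasonable value. I have had developers' infinite loops and oversized fields exhaust host memory, and crypto-mining malware can do the same. You can also choose an eviction policy so that the oldest data is removed once the limit is reached, or pick another policy.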

vi /etc/redis.conf

# In short... if you have slaves attached it is suggested that you set a lower
# limit for maxmemory so that there is some free RAM on the system for slave
# output buffers (but this is not needed if the policy is 'noeviction').
#
# maxmemory <bytes>

maxmemory 8096mb
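
To turn the 60% to 80% guideline into a concrete value, the arithmetic can be scripted. This is a sketch under stated assumptions: suggest_maxmemory is a hypothetical helper, the total is given in kB as reported by MemTotal in /proc/meminfo on Linux, and the result is truncated to whole megabytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a redis.conf maxmemory line for a share of
# host memory.
#   $1 = total memory in kB (MemTotal from /proc/meminfo)
#   $2 = percentage of host memory to allow Redis, e.g. 60 or 80
suggest_maxmemory() {
  awk -v kb="$1" -v pct="$2" \
    'BEGIN { printf "maxmemory %dmb\n", kb / 1024 * pct / 100 }'
}

# On a Linux host:
#   suggest_maxmemory "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" 80
suggest_maxmemory 16777216 80   # 16 GiB host -> prints "maxmemory 13107mb"
```

Paste the printed line into /etc/redis.conf and restart Redis; whether 60 or 80 is the better share depends on what else runs on the host.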

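The Redis alert rules can be appended after the MySQL rules above, or put in a new file that follows the format of the host rules; the steps are the same.

I think the first rule is not really needed: once the service has been added, the host rules also alert on it.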

      - alert: RedisDown
        expr: redis_up  == 0
        for: 5m
        labels:
          status: error
        annotations:
          summary: "Redis down (instance {{ $labels.instance }})"
          description: "Redis is down\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
      - alert: ReplicationBroken
        expr: delta(redis_connected_slaves[1m]) < 0
        for: 5m
        labels:
          status: error
        annotations:
          summary: "Replication broken (instance {{ $labels.instance }})"
          description: "Redis instance lost a slave\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
      - alert: TooManyConnections
        expr: redis_connected_clients > 1000
        for: 5m
        labels:
          status: warning
        annotations:
          summary: "Too many connections (instance {{ $labels.instance }})"
          description: "Redis instance has too many connections\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
      - alert: RejectedConnections
        expr: increase(redis_rejected_connections_total[1m]) > 0
        for: 5m
        labels:
          status: error
        annotations:
          summary: "Rejected connections (instance {{ $labels.instance }})"
          description: "Some connections to Redis have been rejected\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

Monitor domains with blackbox

vi blackbox-additional.yaml

kind: Probe
apiVersion: monitoring.coreos.com/v1
metadata:
  name: example-com-website
  namespace: monitoring
spec:
  interval: 60s
  module: http_2xx # blackbox module, i.e. which probe protocol to use
  prober:
    url: blackbox-exporter.monitoring.svc.cluster.local:19115
  targets:
    staticConfig:
      static:
      - https://www.baidu.com
      - https://www.qq.com

kubectl apply -f  blackbox-additional.yaml

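For alert rules, search online.

Dashboards to choose from: 14928, 7587, 14603

Finally, check the Prometheus targets page at http://ip:30090/targets. The Grafana dashboards were already covered above, so they are not shown again.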
