SpringBoot + Prometheus + Grafana + Alertmanager + WebHook DingTalk Alerting

Installing Docker

# Update installed packages
sudo yum update

# Remove old Docker versions
sudo yum remove docker docker-common docker-selinux docker-engine

# Install dependencies
sudo yum install -y yum-utils device-mapper-persistent-data lvm2

# Add the Docker CE repository (Aliyun mirror)
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# Refresh the yum cache
sudo yum makecache fast

# Install Docker
sudo yum -y install docker-ce

# Start Docker
sudo systemctl start docker

Verify the installation

# Check that the Docker daemon is running
docker info

If the command prints both the client and the server details, Docker is installed and running.

# Start Docker on boot
sudo systemctl enable docker

If you see "Cannot connect to the Docker daemon at unix:///var/run/docker.sock.", restarting the service is usually enough:

sudo systemctl restart docker

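Before pulling images, configure a registry mirror in mainland China to speed up downloads.

Create the file: sudo vim /etc/docker/daemon.json

Add the following content:

{
    "registry-mirrors": ["https://d7grpode.mirror.aliyuncs.com"]
}

Finally, restart Docker to apply the change. Docker is now installed.

# Restart Docker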
sudo systemctl restart docker
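
To confirm that the mirror was actually picked up after the restart, the output of docker info can be inspected. The following is a minimal sketch: has_registry_mirror is a hypothetical helper name, and it assumes docker info lists mirrors as URLs under a "Registry Mirrors:" heading.

```shell
# Reads `docker info` output on stdin and succeeds (exit status 0) when the
# given mirror host appears in the URL list under "Registry Mirrors:".
# has_registry_mirror is a hypothetical helper, not part of Docker.
has_registry_mirror() {
  awk -v m="$1" '
    /Registry Mirrors:/     { in_list = 1; next }   # heading starts the list
    in_list && !/:\/\//     { in_list = 0 }         # first non-URL line ends it
    in_list && index($0, m) { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

Usage: docker info | has_registry_mirror d7grpode.mirror.aliyuncs.com && echo "mirror configured"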

Installing Prometheus

Create the Prometheus directory and configuration file

mkdir -p /mydata/prometheus
cd /mydata/prometheus/
vim prometheus.yml

Add the following configuration

global:
  scrape_interval:     60s
  evaluation_interval: 60s
 
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus
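  # node-exporter must be running on port 9100 for this job to report "up".
  # Inside a container, localhost is the container itself: if Prometheus runs
  # in Docker without --network host, use the host's IP address here.
  - job_name: node-exporter
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: localhost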

Installing Grafana

mkdir -p /mydata/grafana-storage
# Grafana writes its own data into this directory, so open up the permissions
chmod 777 -R /mydata/grafana-storage

# Start the service
docker run -d \
  -p 31014:3000 \
  --name=grafana \
  -v /mydata/grafana-storage:/var/lib/grafana \
  grafana/grafana
# Initial login: admin/admin
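
Grafana still needs to be told where Prometheus lives before any dashboard can show data. This can be done in the web UI, or through Grafana's HTTP API as in the sketch below. The function names are hypothetical, the Prometheus URL is an assumption (it must be an address the Grafana container can reach, so usually the host's IP rather than localhost), and the call only works once Prometheus is running.

```shell
# Builds the JSON body for Grafana's "create data source" API call.
# $1 = URL at which Grafana can reach Prometheus (an assumption; adjust it).
grafana_datasource_payload() {
  printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy","isDefault":true}' "$1"
}

# Posts the payload to Grafana with the initial admin/admin credentials.
# $1 = Grafana base URL, $2 = Prometheus URL.
add_prometheus_datasource() {
  curl -s -X POST "$1/api/datasources" \
    -u admin:admin \
    -H 'Content-Type: application/json' \
    -d "$(grafana_datasource_payload "$2")"
}
```

Usage, with a placeholder host IP: add_prometheus_datasource http://localhost:31014 http://192.168.1.10:9090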


Installing Alertmanager

Create the directory

mkdir -p /mydata/alertmanager
cd /mydata/alertmanager

Create the configuration file

vim alertmanager.yml

# alertmanager.yml

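route:
  # The root route must name a default receiver; alerts that match none of
  # the child routes below are sent there. web.hook1 is used as an example.
  receiver: web.hook1
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  routes:
  - receiver: web.hook1
    group_wait: 10s
    match:
      team: webhook1
  - receiver: web.hook2
    group_wait: 10s
    match:
      team: webhook2
receivers:
# These URLs point at the DingTalk webhook service on port 8060.
# If Alertmanager runs in Docker without --network host, replace localhost
# with the host's IP address.
- name: 'web.hook1'
  webhook_configs:
  - url: http://localhost:8060/dingtalk/webhook1/send
    send_resolved: true
- name: 'web.hook2'
  webhook_configs:
  - url: http://localhost:8060/dingtalk/webhook2/send
    send_resolved: true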

Pull the image and start the container

docker run -d \
    --name alertmanager \
    -p 9093:9093 \
    -v /mydata/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
    prom/alertmanager
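
Once the container is up, routing can be exercised without waiting for a real outage by posting a hand-made alert to Alertmanager's v2 API. The sketch below uses hypothetical helper names; the alert name and summary text are made up for illustration, and the team value should be one that the routes match on.

```shell
# Builds a one-alert JSON array in the shape accepted by POST /api/v2/alerts.
# $1 = alert name, $2 = value of the "team" label that the routes match on.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","team":"%s"},"annotations":{"summary":"manual test alert"}}]' "$1" "$2"
}

# Sends the test alert to the Alertmanager published on port 9093.
send_test_alert() {
  curl -s -X POST "http://localhost:9093/api/v2/alerts" \
    -H 'Content-Type: application/json' \
    -d "$(test_alert_payload "$1" "$2")"
}
```

Usage: send_test_alert manual-test webhook1, then check the Alertmanager page on port 9093 for the new alert.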


Configuring alert rules

mkdir -p /mydata/alertmanager/rules
cd /mydata/alertmanager/rules
# Create the rule file
vim node-up.yml

# node-up.yml
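groups:
- name: node-up
  rules:
  - alert: node-up
    expr: up{job="node-exporter"} == 0
    for: 15s
    labels:
      severity: 1
      # Must equal a "team" value matched in alertmanager.yml (webhook1 or webhook2)
      team: webhook1
    annotations:
      summary: "{{ $labels.instance }} has been down for more than 15s!"
# expr: up{job="node-exporter"} == 0 is true when the target is down
# for: 15s means the condition must hold for 15s
# annotations.summary is the message shown in the alert
# In short: if the node-exporter job stays down for 15s, the alert fires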
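
One practical consequence of the settings in this tutorial: rules are only evaluated once per evaluation_interval (60s in prometheus.yml), so a for: 15s alert cannot move from pending to firing until the next evaluation. A rough estimate of that delay, assuming evenly spaced evaluations, is the for duration rounded up to a whole number of intervals. The function name below is hypothetical.

```shell
# Prints the approximate number of seconds between an alert becoming
# "pending" and becoming "firing": the `for` duration rounded up to a whole
# number of evaluation intervals.
# $1 = `for` duration in seconds, $2 = evaluation_interval in seconds.
firing_delay() {
  echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

firing_delay 15 60   # prints 60
```

With these values the alert fires about a minute after the first failed evaluation, not after 15 seconds; lowering evaluation_interval (and scrape_interval) shortens the delay.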

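Now edit the Prometheus configuration file (/mydata/prometheus/prometheus.yml) to specify the Alertmanager address and the rule files

...
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # Use the host's IP instead of localhost if Prometheus runs in Docker
      # without --network host.
      - localhost:9093
rule_files:
  # Path inside the Prometheus container; the rules directory is mounted
  # there when the container is started.
  - "/usr/local/alertmanager/rules/*.yml"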

Starting Prometheus

# Start Prometheus with the config file and the rules directory mounted
docker run  -d \
  --name prometheus \
  -p 9090:9090 \
  -v /mydata/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml  \
  -v /mydata/alertmanager/rules/:/usr/local/alertmanager/rules/ \
  prom/prometheus
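
After startup, the Targets page on port 9090 shows whether each job is being scraped. The same check can be scripted against the /api/v1/targets endpoint. This is a rough sketch with a hypothetical function name; it assumes the API returns compact JSON containing "health":"down" for unhealthy targets and deliberately avoids a JSON parser.

```shell
# Reads the JSON returned by /api/v1/targets on stdin and prints how many
# targets currently report "health":"down".
count_down_targets() {
  grep -o '"health":"down"' | wc -l | tr -d ' '
}
```

Usage: curl -s http://localhost:9090/api/v1/targets | count_down_targets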


DingTalk alerts

Create the directory and configuration file

mkdir -p /mydata/dingtalk
cd /mydata/dingtalk
vim config.yml

Configuration file

## Request timeout
# timeout: 5s

## Customizable templates path
templates:
  - /etc/prometheus-webhook-dingtalk/templates/legacy/template.tmpl

## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
# default_message:
#   title: '{{ template "legacy.title" . }}'
#   text: '{{ template "legacy.content" . }}'

## Targets, previously was known as "profiles"
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=xxx
    # secret for signature
    secret: xxx
  webhook2:
    url: https://oapi.dingtalk.com/robot/send?access_token=xxx
    secret: xxx
    # Customize template content
    message:
      # Use legacy template
      title: '{{ template "legacy.title" . }}'
      text: '{{ template "legacy.content" . }}'
  webhook_mention_all:
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
    mention:
      all: true
  webhook_mention_users:
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
    mention:
      mobiles: ['156xxxx8827', '189xxxx8325']

Template file (save it as /mydata/dingtalk/template.tmpl)

{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] 
{{ .GroupLabels.SortedPairs.Values | join " " }} 
{{ if gt (len .CommonLabels) (len .GroupLabels) }}
({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }}{{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver }}{{ end }}
{{ define "__text_alert_list" }}{{ range . }}

**Labels**

{{ range .Labels.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}

{{ end }}

**Annotations**

{{ range .Annotations.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}

{{ end }}

**Source:** [{{ .GeneratorURL }}]({{ .GeneratorURL }})

{{ end }}{{ end }}
{{ define "default.__text_alert_list" }}{{ range . }}

---

**Severity:** {{ .Labels.severity | upper }}
**Team:** {{ .Labels.team | upper }}
**Started at:** {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
**Details:**

{{ range .Annotations.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}
**Labels:**

{{ range .Labels.SortedPairs }}{{ if and (ne (.Name) "severity") (ne (.Name) "summary") (ne (.Name) "team") }}> - {{ .Name }}: {{ .Value | markdown | html }}

{{ end }}{{ end }}

{{ end }}

{{ end }}

{{ define "default.__text_alertresovle_list" }}{{ range . }}

---

**Severity:** {{ .Labels.severity | upper }}
**Team:** {{ .Labels.team | upper }}
**Started at:** {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
**Ended at:** {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
**Details:**

{{ range .Annotations.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}
**Labels:**

{{ range .Labels.SortedPairs }}{{ if and (ne (.Name) "severity") (ne (.Name) "summary") (ne (.Name) "team") }}> - {{ .Name }}: {{ .Value | markdown | html }}

{{ end }}{{ end }}

{{ end }}

{{ end }}
{{/* Default */}}

{{ define "default.title" }}{{ template "__subject" . }}{{ end }}

{{ define "default.content" }}
 [{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
 **[{{ index .GroupLabels "alertname" }}]({{ template "__alertmanagerURL" . }})**

{{ if gt (len .Alerts.Firing) 0 -}}
![Alert icon](https://ss0.bdstatic.com/70cFuHSh_Q1YnxGkpoWK1HF6hhy/it/u=3626076420,1196179712&fm=15&gp=0.jpg)
**====Fault detected====**

{{ template "default.__text_alert_list" .Alerts.Firing }}
{{- end }}
{{ if gt (len .Alerts.Resolved) 0 -}}

{{ template "default.__text_alertresovle_list" .Alerts.Resolved }}
{{- end }}

{{- end }}
{{/* Legacy */}}

{{ define "legacy.title" }}{{ template "__subject" . }}{{ end }}

{{ define "legacy.content" }} 
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
 **[{{ index .GroupLabels "alertname" }}]({{ template "__alertmanagerURL" . }})**

{{ template "__text_alert_list" .Alerts.Firing }}

{{- end }}
{{/* Following names for compatibility */}}

{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}

{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}

# Start the DingTalk webhook container.
# The first -v mounts the configuration file, the second the alert template.
# --web.enable-ui turns on the built-in web page.
docker run -d \
  --name dingtalk \
  --restart always \
  -p 8060:8060 \
  -v /mydata/dingtalk/config.yml:/etc/prometheus-webhook-dingtalk/config.yml \
  -v /mydata/dingtalk/template.tmpl:/etc/prometheus-webhook-dingtalk/templates/legacy/template.tmpl \
  timonwong/prometheus-webhook-dingtalk:master \
  --web.enable-ui \
  --log.format=logfmt \
  --config.file=/etc/prometheus-webhook-dingtalk/config.yml

The DingTalk webhook is now configured.

Integrating SpringBoot with Prometheus

Add the dependencies

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.8.3</version>
</dependency>

Configure the application's yml file

# Expose all actuator endpoints
management:
  endpoints:
    web:
      exposure:
        include: '*'

  # Show full details on the health endpoint
  endpoint:
    health:
      show-details: always

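Define a controller

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TempController {

    // Simple test endpoint; requests to it show up in the HTTP server metrics
    @RequestMapping("info")
    public String info() {
        return "hello info";
    }
}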

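Update the Prometheus configuration file

vim /mydata/prometheus/prometheus.yml

Add the following job under scrape_configs

  - job_name: springboot-actuator
    # Spring Boot exposes Prometheus metrics at /actuator/prometheus,
    # not at Prometheus' default /metrics path.
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ['localhost:8080']
        labels:
          instance: springboot-actuator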
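
Before restarting Prometheus it is worth checking that the application really serves metrics at /actuator/prometheus. The helper below is a small sketch with a hypothetical name: it reads the Prometheus text format and prints the value of one metric. The metric process_uptime_seconds is an assumption about what Micrometer exports by default, and the match only works for samples without labels.

```shell
# Reads Prometheus text-format metrics on stdin and prints the value of the
# first sample whose name is exactly $1 (samples with labels are not matched).
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}
```

Usage: curl -s http://localhost:8080/actuator/prometheus | metric_value process_uptime_seconds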

Restart Prometheus (docker restart prometheus) so that it loads the new job.
