Integrating Spring Boot with Pushgateway and Alertmanager for Monitoring and Alerting
In this setup the application pushes its metrics to Prometheus through Pushgateway.
Add the dependencies
<!-- Prometheus dependencies -->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_spring_boot</artifactId>
    <version>0.8.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_hotspot</artifactId>
    <version>0.8.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_servlet</artifactId>
    <version>0.8.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_pushgateway</artifactId>
    <version>0.8.0</version>
</dependency>
Spring Boot code
Here, after an exception is caught, its code and the service name are pushed to Pushgateway.
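import io.prometheus.client.Counter;
import io.prometheus.client.exporter.PushGateway;
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.ResponseStatus;
// CommonResult, ResourceConflictException and UserException are this project's own classes.

@ControllerAdvice
public class GlobalExceptionHandler {

    @Value("${spring.application.name}")
    String applicationName;

    // Passed to the PushGateway constructor, which expects an address of the form host:port.
    @Value("${pushgateway.ip}")
    String pushgatewayIp;

    public static final Counter counterDemo = Counter.build()
            .name("push_way_counter")
            .labelNames("code", "instance")
            .help("user-service exception count")
            .register();

    /**
     * Counts an exception by its code and pushes the counter to Pushgateway.
     *
     * @param code the error code of the exception
     */
    private void pushData(Integer code) {
        PushGateway prometheusPush = new PushGateway(pushgatewayIp);
        // Increment the counter for this code and service name.
        counterDemo.labels(code.toString(), applicationName).inc();
        try {
            // "ex-user-service" is the job name the metric is grouped under.
            prometheusPush.push(counterDemo, "ex-user-service");
        } catch (IOException e) {
            // A failed push must not break the error response itself.
            e.printStackTrace();
        }
    }

    /**
     * This is only an example method that handles ResourceConflictException; in practice
     * a single handler can handle several exception types. @ResponseStatus can also be placed
     * on the handler so that several exceptions share the same HTTP status code.
     * <p>
     * Not every exception needs a dedicated handler: annotating the exception class itself
     * with @ResponseStatus is enough to define its HTTP status code.
     *
     * @param err the caught exception
     * @return the error response body
     */
    @ResponseStatus(HttpStatus.CONFLICT)
    @ExceptionHandler(ResourceConflictException.class)
    @ResponseBody
    public CommonResult<Object> handleResourceBusyException(ResourceConflictException err) {
        CommonResult<Object> errorInfo = new CommonResult<>();
        errorInfo.setSuccess(false);
        errorInfo.setCodes(201);
        errorInfo.setMessage(err.getMessage());
        errorInfo.setData(err.getData());
        return errorInfo;
    }

    /**
     * Global exception handling for the user module.
     *
     * @param request the current HTTP request
     * @param err the caught exception
     * @return the error response body
     */
    @ExceptionHandler(UserException.class)
    @ResponseBody
    public CommonResult<Object> handleUserException(HttpServletRequest request, UserException err) {
        pushData(err.getCode());
        CommonResult<Object> errorInfo = new CommonResult<>();
        errorInfo.setSuccess(false);
        errorInfo.setCodes(err.getCode());
        errorInfo.setMessage(err.getMessage());
        errorInfo.setData(err.getData());
        return errorInfo;
    }
}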
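To make it easier to see what the call `prometheusPush.push(counterDemo, "ex-user-service")` sends, here is a minimal sketch that uses only the JDK. It previews the request shape: Pushgateway groups metrics under the URL path `/metrics/job/<job>`, and the body uses the Prometheus text exposition format. The class name `PushPreview`, the address `localhost:9091`, and the sample label values are assumptions for illustration; the real client builds and sends the request itself and also URL-encodes the job name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: previews the URL and body of a Pushgateway push. Not part of the Prometheus client. */
public class PushPreview {

    /** Builds the grouping URL for a job (no URL-encoding here, unlike the real client). */
    public static String pushUrl(String address, String job) {
        return "http://" + address + "/metrics/job/" + job;
    }

    /** Escapes a label value for the text exposition format: backslash, double quote, newline. */
    public static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /** Renders one sample line of the form: name{label="value",...} value */
    public static String sampleLine(String name, Map<String, String> labels, double value) {
        StringJoiner joined = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : labels.entrySet()) {
            joined.add(e.getKey() + "=\"" + escape(e.getValue()) + "\"");
        }
        return name + joined + " " + value;
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("code", "500");              // hypothetical error code
        labels.put("instance", "user-service"); // hypothetical spring.application.name
        System.out.println(pushUrl("localhost:9091", "ex-user-service"));
        System.out.println("# TYPE push_way_counter counter");
        System.out.println(sampleLine("push_way_counter", labels, 1.0));
    }
}
```

In the Java client, `push` replaces every metric previously pushed under the same job, while `pushAdd` replaces only metrics with the same names, so `push` fits the case here where a single counter is the only metric in the group.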
Define a new alerting rule
vim springboot_rules.yml
groups:
  - name: springboot-rules
    rules:
      - alert: interface_status
        expr: sum by (code, exported_job) (increase(push_way_counter[5m])) > 3
        for: 10s
        labels:
          status: critical
        annotations:
          summary: "More than 3 interface errors within 5 minutes!!! --{{$labels.code}}"
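The rule expression can be read in three steps: take each series' increase over the last 5 minutes, sum those increases per `(code, exported_job)` pair, and keep the groups whose sum is greater than 3. The sketch below mirrors those steps in plain Java on made-up samples. It is a simplification under stated assumptions: the real `increase()` also extrapolates to the window boundaries and handles counter resets, and the `for: 10s` clause (the condition must hold for 10 seconds before the alert fires) is not modeled. The class `AlertRuleSketch` and its `Series` type are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a simplified model of the alert expression, not Prometheus itself. */
public class AlertRuleSketch {

    /** One series: its labels plus the counter value at the start and end of the 5m window. */
    public static final class Series {
        final String code;
        final String exportedJob;
        final String instance;
        final double first;
        final double last;

        public Series(String code, String exportedJob, String instance, double first, double last) {
            this.code = code;
            this.exportedJob = exportedJob;
            this.instance = instance;
            this.first = first;
            this.last = last;
        }
    }

    /** Models sum by (code, exported_job) (increase(...)) with a naive increase = last - first. */
    public static Map<String, Double> sumIncrease(List<Series> window) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (Series s : window) {
            // The instance label is dropped because it is not in the "by" clause.
            String key = s.code + "|" + s.exportedJob;
            sums.merge(key, s.last - s.first, Double::sum);
        }
        return sums;
    }

    /** Keeps only the groups whose summed increase is strictly above the threshold (the "> 3" part). */
    public static Map<String, Double> firing(Map<String, Double> sums, double threshold) {
        Map<String, Double> out = new LinkedHashMap<>();
        sums.forEach((key, value) -> {
            if (value > threshold) {
                out.put(key, value);
            }
        });
        return out;
    }

    public static void main(String[] args) {
        List<Series> window = Arrays.asList(
                new Series("500", "ex-user-service", "user-service-a", 2, 4),
                new Series("500", "ex-user-service", "user-service-b", 0, 3),
                new Series("404", "ex-user-service", "user-service-a", 1, 3));
        // Code 500 sums to 5 and exceeds the threshold; code 404 sums to 2 and does not.
        System.out.println(firing(sumIncrease(window), 3));
    }
}
```

Because the sum is grouped only by `code` and `exported_job`, errors with the same code from different instances count toward the same threshold.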
Update prometheus.yml
vim prometheus.yml
rule_files:
  - "alertmanager_rules.yml"
  - "springboot_rules.yml"
Restart Prometheus so that the new rule file is loaded!
The test result is as follows: