[Prometheus & Pushgateway] Pitfalls when pushing data

张同学 ^_^ · published 2019-09-10 15:01:30

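Overview:
Prometheus collects data with a pull model, so the network must let the Prometheus server talk to each exporter directly. When that requirement cannot be met, Pushgateway can act as a relay.
Hosts on the internal network push their monitoring data to the Pushgateway.
The Prometheus server then scrapes the Pushgateway with the same pull mechanism it uses for any other target.
Advantage: as with older monitoring systems, operators can report custom metrics from shell or Python scripts to the Pushgateway, which Prometheus then scrapes. This is simpler than writing an exporter.
Disadvantage: as the number of metrics and monitored servers grows, concurrent pushes may become a problem and slow down reporting.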
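
The push path described above can be sketched in Python with only the standard library. This is a minimal sketch, not a reference client: the gateway address, job name and metric name in the usage note are hypothetical, and the URL layout /metrics/job/<job>/instance/<instance> is taken from the curl examples later in this post.

```python
import urllib.request
from urllib.parse import quote


def build_push_url(gateway, job, instance):
    """Build the Pushgateway URL for one job/instance group."""
    return "{}/metrics/job/{}/instance/{}".format(
        gateway.rstrip("/"), quote(job, safe=""), quote(instance, safe="")
    )


def build_payload(samples):
    """Render {name: value} as text-format lines, one sample per line."""
    lines = ["{} {}".format(name, float(value)) for name, value in samples.items()]
    # The text format expects a newline after the last sample.
    return ("\n".join(lines) + "\n").encode("utf-8")


def push(gateway, job, instance, samples, timeout=5):
    """POST the samples to the gateway and return the HTTP status code."""
    req = urllib.request.Request(
        build_push_url(gateway, job, instance),
        data=build_payload(samples),
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

A call would look like push("http://pushgateway.example:9091", "backup", "db01", {"backup_duration_seconds": 12.5}), where every argument is a placeholder.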

pushed metrics are invalid or inconsistent with existing metrics: collected metric

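This error occurs when a push contains an empty metric, or when the same metric is pushed twice in one request, for example when the payload carries several samples of one metric with identical labels. It can also appear after a Pushgateway restart, in which case the existing metric must be deleted before the same metric can be pushed again.

See: https://github.com/prometheus/pushgateway/blob/master/README.md
The HTTP status code is 400.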
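
One way to catch the duplicate-sample case before pushing is to scan the payload locally. The helper below is a hypothetical pre-flight check, not part of Pushgateway. It assumes the payload has no timestamps and that identical label sets are written in the same order.

```python
from collections import Counter


def duplicate_series(payload):
    """Return series (name plus label set) that appear more than once."""
    seen = Counter()
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Everything before the last whitespace-separated field is the series.
        series = line.rsplit(None, 1)[0]
        seen[series] += 1
    return sorted(s for s, n in seen.items() if n > 1)
```

If the returned list is not empty, the payload would contain the same series twice and is likely to be rejected with the 400 response described above.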

Batch push to Pushgateway with the Python prometheus_client

https://github.com/liyuanjun/prometheus-python-tutorial/blob/master/exporting/export_pushgateway.py

Estimating the theoretical memory Prometheus needs

https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion

Metric values must be numeric

Error: text format parsing error in line 1: expected float as value, got "1.1.1.1"

Cause:
Sample values are meant to be graphed, so only numeric values are accepted.

$ echo "ipaddr 1.1.1.1" | curl --data-binary @- -g http://ip:9090/metrics/job/pushgateway/instance/test

Pushing ipaddr with the value 1.1.1.1 fails with:
text format parsing error in line 1: expected float as value, got "1.1.1.1"

Solution:
Convert 1.1.1.1 to a number

# Convert an IP address into a value that can be pushed as a sample.
checkIP() {
    local ip=$1 a b c d
    if [ "$ip" != "${ip#*[0-9].[0-9]}" ]; then
        # IPv4: pack the four octets into one 32-bit integer
        IFS=. read -r a b c d <<< "$ip"
        echo "$(( (a << 24) + (b << 16) + (c << 8) + d ))"
    elif [ "$ip" != "${ip#*:[0-9a-fA-F]}" ]; then
        # IPv6: returned unchanged, so it is still not a numeric value
        echo "$ip"
    else
        # Neither IPv4 nor IPv6
        echo 0
    fi
}

Reference: https://github.com/prometheus/prometheus/issues/2227
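
For scripts written in Python, the same conversion can be sketched with the standard ipaddress module. The function names are hypothetical. The second helper shows an alternative not used in this post: keep the address as a label and push a constant 1, which avoids the numeric conversion altogether.

```python
import ipaddress


def ip_to_int(text):
    """Return the integer form of an IPv4/IPv6 address, or 0 if invalid."""
    try:
        return int(ipaddress.ip_address(text))
    except ValueError:
        return 0


def ip_as_label(name, text):
    """Alternative: carry the address in a label and push the value 1."""
    return '{}{{ip="{}"}} 1'.format(name, text)
```

ip_to_int("1.1.1.1") gives the same 32-bit integer as the shell function. IPv6 addresses are also converted, but they run into the precision limit described in the next section.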

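Metric values keep only about 16 significant digits; digits after that become 0

"FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF" as an integer: 340282366920938463463374607431768211455

$ echo "ipaddr 340282366920938463463374607431768211455" | \
curl --data-binary @- -g http://ip:9090/metrics/job/pushgateway/instance/test

Actual result:
ipaddr{instance="test"}  340282366920938500000000000000000000000

Sample values are stored as 64-bit floating-point numbers, which is why the low-order digits are lost.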
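
The loss of digits can be reproduced without Prometheus, because a Python float is also a 64-bit IEEE 754 double. The sketch below checks whether an integer survives a round trip through float. The helper name is hypothetical.

```python
def survives_float64(n):
    """True if integer n is unchanged after a round trip through float64."""
    return int(float(n)) == n


ipv6_max = 2**128 - 1  # FFFF:FFFF:...:FFFF as an integer

print(survives_float64(2**53))           # small enough to be stored exactly
print(survives_float64(ipv6_max))        # rounded to the nearest float64
print(int(float(ipv6_max)) - ipv6_max)   # size of the rounding error
```

Integers up to 2**53 are represented exactly. Above that, only the leading 53 bits (roughly 16 decimal digits) are kept, so a full IPv6 address cannot be stored in a sample value without loss.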

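Pushgateway data persistence

To avoid losing data when Pushgateway restarts or crashes, persist it to disk with the -persistence.file and -persistence.interval flags (recent releases use the double-dash forms --persistence.file and --persistence.interval).

Explanation from the official Prometheus site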

exceeded maximum resolution of 11,000 points per timeseries. Try decreasing the query resolution

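This error appears for a request such as:
GET http://xxx/prometheus/api/v1/query_range?query=bps{mac=~'xx:xx:xx:xx:xx:xx'}&start=2019-09-19T09:29:26Z&end=2019-09-20T09:29:26Z&step=15s&timeout=60s

Cause: Prometheus enforces a hard limit of 11,000 data points per time series for each query. Use a larger step or a shorter time range.
References: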
https://github.com/prometheus/prometheus/issues/1968
https://github.com/prometheus/prometheus/issues/2253
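
A client can pick a safe step before sending a range query. The sketch below assumes the limit is applied to (end - start) / step, which is an assumption about how the check is implemented. The function name is hypothetical.

```python
import math

MAX_POINTS = 11000  # limit quoted in the error message


def min_step_seconds(start_ts, end_ts, max_points=MAX_POINTS):
    """Smallest whole-second step that keeps (end - start) / step within the limit."""
    span = end_ts - start_ts
    if span <= 0:
        return 1
    return max(1, math.ceil(span / max_points))
```

For a seven-day range (604800 seconds) this returns 55, so step=55s or larger would stay under the limit under the stated assumption.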

docker-compose restart does not apply changes to docker-compose.yml

You must run docker-compose down

and then docker-compose up

Enabling hot reload

Since version 2.0, hot reload over HTTP is disabled by default.
To enable it, start Prometheus with the --web.enable-lifecycle flag.

There are two ways to trigger a reload:
kill -HUP pid
curl -X POST http://IP/-/reload  [recommended]
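
The curl call can also be issued from a Python deployment script. This is a minimal sketch using only the standard library. The base URL in the usage note is a placeholder, and the endpoint only responds if Prometheus was started with --web.enable-lifecycle.

```python
import urllib.request


def build_reload_request(base_url):
    """Build the POST request for the /-/reload endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/-/reload", data=b"", method="POST"
    )


def reload_prometheus(base_url, timeout=10):
    """Send the reload request and return the HTTP status code."""
    req = build_reload_request(base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

reload_prometheus("http://prometheus.example:9090") would send the request. urlopen raises an HTTPError if the server answers with an error status, so a failed reload does not pass silently.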

Blackbox_exporter error: Timeout reading from socket

Solution:
Restart the blackbox container

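Pushgateway Delete Group error: Deleting metric group failed: Bad Request

If a grouping key has an empty value (key=""), deleting the group fails with Deleting metric group failed: Bad Request
Solution:
Give every key a default value so that no key is ever empty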
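
The fix can be applied on the pushing side by filling in defaults before the grouping path is built. In this sketch the default value "unknown" and both function names are hypothetical, and label values are assumed not to contain slashes.

```python
def fill_grouping_defaults(labels, default="unknown"):
    """Replace empty label values so that no grouping key is blank."""
    return {key: (value if value else default) for key, value in labels.items()}


def grouping_path(job, labels):
    """Build the /metrics/job/... path from a job name and grouping labels."""
    parts = ["metrics", "job", job]
    for key, value in fill_grouping_defaults(labels).items():
        parts += [key, value]
    return "/" + "/".join(parts)
```

With this in place, a group pushed with an empty instance ends up under instance/unknown, so it has a non-empty key when it is deleted later.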

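Pushgateway push interval and Prometheus scrape interval

A scrape does not return everything that was pushed during the scrape interval. It returns only the data from the most recent push to the Pushgateway.
It is therefore recommended to set the push interval to be less than or equal to the Prometheus scrape interval, so that every scrape picks up freshly pushed data.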
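
The last-push-wins behaviour can be illustrated with a toy model. FakeGateway is a hypothetical stand-in written for this example, not a Pushgateway client. It only mimics the fact that each push overwrites the previously stored value.

```python
class FakeGateway:
    """Toy stand-in for a Pushgateway: each push overwrites the stored value."""

    def __init__(self):
        self.store = {}

    def push(self, name, value):
        self.store[name] = value

    def scrape(self):
        return dict(self.store)


gw = FakeGateway()
for value in (10, 20, 30):  # three pushes within one scrape interval
    gw.push("queue_length", value)
scraped = gw.scrape()       # only the last pushed value is visible
```

The values 10 and 20 never reach the scraper. Pushing more often than Prometheus scrapes does not add data points; it only keeps the stored value fresh.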
