k8s: Fixing Prometheus node targets that show as Down
After configuring Prometheus to scrape node_exporter, the node_exporter targets show up as Down.
# The prometheus-server configuration is as follows
[root@k8s-master-1 prometheus-server]# cat prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
    - job_name: 'prometheus-server'
      static_configs:
      - targets: ['localhost:9090']
      relabel_configs:
      - source_labels: [instance]
        regex: '(.*)'
        replacement: 'prometheus-server:9090'
        target_label: instance
        action: replace
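- As the configuration above shows, once node_exporter is deployed, Prometheus by default pulls data from http://IP:10250/metrics, because the node role of kubernetes_sd_configs uses the kubelet port as the target address. node_exporter, however, is usually set up to share the host's network and listens on port 9100, so the data should be pulled from http://IP:9100/metrics instead. The address label therefore has to be rewritten with relabel_configs.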
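The port rewrite described above can be sketched outside Prometheus. The snippet below is only a rough Python imitation of a replace rule, not Prometheus's own code; the helper name relabel_replace is hypothetical. It assumes the documented behaviour that the relabel regex is anchored at both ends and that ${1} refers to the first capture group.

```python
import re


def relabel_replace(value, regex, replacement):
    """Rough imitation of a Prometheus `replace` rule on one label value.

    `relabel_replace` is a hypothetical helper used for illustration only.
    """
    # Prometheus anchors the regex at both ends, so fullmatch is used here.
    match = re.fullmatch(regex, value)
    if match is None:
        # No match: the label value is left unchanged.
        return value
    # Convert Prometheus-style ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{(\w+)\}", r"\\g<\1>", replacement)
    return match.expand(template)


if __name__ == "__main__":
    # Kubelet address discovered by the node role, rewritten to the node_exporter port.
    print(relabel_replace("192.168.0.10:10250", r"(.*):10250", "${1}:9100"))
```

An address that does not end in :10250 fails the anchored match and passes through untouched, which is why the rule is safe to apply to every discovered node.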
# Prometheus configuration
[root@k8s-master-1 prometheus-server]# cat prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__] # Rewrite __address__=192.168.0.10:10250 -> 192.168.0.10:9100; node_exporter listens on 9100 by default, and it already shares the host's network namespace here
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.*) # Keep the part after __meta_kubernetes_node_label_ as the name of a new label
    - job_name: 'prometheus-server'
      static_configs:
      - targets: ['localhost:9090']
      relabel_configs:
      - source_labels: [instance] # Change instance=localhost:9090 -> instance=prometheus-server:9090
        regex: '(.*)'
        replacement: 'prometheus-server:9090'
        target_label: instance
        action: replace
After __address__ is changed to IP:9100, the metrics are scraped normally.
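The labelmap rule in the kubernetes-node job copies node labels onto the scraped series. Note that the meta labels Prometheus attaches to discovered nodes use the prefix __meta_kubernetes_node_label_ (a single underscore after __meta), so the regex must use that exact prefix to match anything. The following is a minimal Python sketch of the idea, not Prometheus's implementation; the helper name labelmap and the hostname k8s-node-1 are made up for illustration.

```python
import re


def labelmap(labels, regex, replacement=r"\1"):
    """Rough imitation of a Prometheus `labelmap` rule (hypothetical helper).

    Every label whose name fully matches `regex` is copied to a new label
    named after the first capture group, keeping the same value.
    """
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match:
            result[match.expand(replacement)] = value
    return result


if __name__ == "__main__":
    # Example label set; the node name is an assumed value.
    discovered = {
        "__address__": "192.168.0.10:9100",
        "__meta_kubernetes_node_label_kubernetes_io_hostname": "k8s-node-1",
    }
    mapped = labelmap(discovered, r"__meta_kubernetes_node_label_(.*)")
    print(mapped["kubernetes_io_hostname"])
```

With a doubled underscore in the prefix, no label name would match and no node labels would be copied.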