Pitfalls encountered with prometheus + k8s + jmx_exporter: notes
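I. A brief overview of the steps
The steps below assume prometheus-operator is already installed in the k8s cluster; kube-prometheus can be used to install it.
II. Auto-discovering the integrated jmx_exporter
- 1. Prepare the jmx_exporter jar that will be integrated with the application.
- 2. Write the k8s Service and Deployment resources, for example as below. I baked jmx_exporter into the Docker image, but you can do this however you like, and the namespace name is arbitrary. Nothing here needs special care: it simply starts an ordinary pod.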
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: jmx-exporter
  name: example-app
  namespace: qiaofeng-namespace
spec:
  selector:
    k8s-app: prometheus-app
  ports:
    - name: http
      port: 8080
      targetPort: http
    - name: jmx-metrics
      port: 8081
      targetPort: jmx-metrics
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-app
  name: example-app
  namespace: qiaofeng-namespace
spec:
  selector:
    matchLabels:
      k8s-app: prometheus-app
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus-app
    spec:
      containers:
        - name: example-app
          image: docker.qingtui.cn:6000/jmx_prometheus_test:1.0.0
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: jmx-metrics
              containerPort: 8081
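If the agent is not baked into the image's start command, one possible way to attach it is through the JVM's JAVA_TOOL_OPTIONS environment variable. The fragment below is only a sketch: the jar and config paths under /opt/jmx_exporter are hypothetical and must match wherever those files actually live in your image. The port after the "=" is the one the Service exposes as jmx-metrics.

```yaml
# Fragment of spec.template.spec in the Deployment above (not a complete manifest).
containers:
  - name: example-app
    image: docker.qingtui.cn:6000/jmx_prometheus_test:1.0.0
    env:
      - name: JAVA_TOOL_OPTIONS
        # Assumed paths: adjust to where the agent jar and its config live in your image.
        # Format: -javaagent:<agent jar>=<port>:<config file>
        value: "-javaagent:/opt/jmx_exporter/jmx_prometheus_javaagent.jar=8081:/opt/jmx_exporter/config.yaml"
    ports:
      - name: jmx-metrics
        containerPort: 8081   # must equal the port given to the agent
```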
The pitfalls start here
- 3. Configure the ServiceMonitor used for auto-discovery
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: jmx-exporter
    app: prometheus-app
  name: jxm-exproter
  namespace: monitoring
spec:
  endpoints:
    - interval: 10s
      port: jmx-metrics
      tlsConfig:
        insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      k8s-app: jmx-exporter
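Note the labels under spec.selector.matchLabels: they must match the labels of the Service, not those of the pod. This ServiceMonitor watches every Service that carries the label k8s-app: jmx-exporter, and any: true means it watches all namespaces. To restrict it to specific namespaces, use the following configuration instead:
  namespaceSelector:
    matchNames:
      - <name of the namespace to monitor>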
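Filled in with the namespace used by the example Service in this post, the restricted selector might look like the sketch below. It is a fragment of the ServiceMonitor spec, not a complete manifest.

```yaml
# Fragment of the ServiceMonitor spec; replaces "any: true".
spec:
  namespaceSelector:
    matchNames:
      - qiaofeng-namespace    # namespace of the example Service above
  selector:
    matchLabels:
      k8s-app: jmx-exporter   # label on the Service, not on the pod
```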
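- 4. This one cost me a whole day. ServiceMonitor is a resource that belongs to prometheus-operator, so discovery effectively runs with the service account and permissions that prometheus-operator sets up (officially, the default permissions are supposed to be sufficient). Yet after you apply the ServiceMonitor, the target does not show up on the Prometheus page. Why?
The problem is that the ClusterRole defined in prometheus-clusterRole.yaml does not grant enough permissions. Check the logs of the Prometheus pod with:
kubectl logs -f prometheus-k8s-0 -c prometheus -n monitoring
You should see something like: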
level=error ts=2022-02-23T02:20:07.690Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth
msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"qiaofeng-namespace\""
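To confirm the missing permission independently of the Prometheus log, one option is to ask the API server directly with a SubjectAccessReview. The sketch below takes the user, namespace, verb and resource from the log line above. Assuming it is saved as a file and submitted with kubectl create -f <file> -o yaml, the response's status.allowed field should report whether the request is permitted. Your own account needs permission to create SubjectAccessReviews for this to work.

```yaml
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  # Service account name exactly as it appears in the error message.
  user: system:serviceaccount:monitoring:prometheus-k8s
  resourceAttributes:
    namespace: qiaofeng-namespace
    group: ""          # core API group
    verb: list
    resource: services
```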
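Solution: grant sufficient permissions by editing prometheus-clusterRole.yaml. If you are still using the official default file, you can simply overwrite it with the following: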
# Grant additional permissions to the prometheus-k8s service account
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/metrics
      - services
      - nodes
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
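A ClusterRole only takes effect once it is bound to a subject. kube-prometheus normally ships an equivalent binding already, so the sketch below is only needed if that binding is missing from your cluster. The service account name and namespace are taken from the error message above; the binding name is an assumption chosen to match the ClusterRole.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s          # assumed name, chosen to match the ClusterRole
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-k8s
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s        # from system:serviceaccount:monitoring:prometheus-k8s
    namespace: monitoring
```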
Re-apply the manifest and reload; the target then appears on the Prometheus page.
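If cluster-wide read access is broader than you want, a narrower alternative is to grant the same read permissions only in the application's namespace. This is a sketch, not what the post above does, and it assumes the ServiceMonitor limits namespaceSelector.matchNames to that namespace: with any: true, Prometheus lists Services across the whole cluster and a namespaced Role is not enough. The Role and RoleBinding names are assumptions.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s              # assumed name
  namespace: qiaofeng-namespace
rules:
  - apiGroups: [""]
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s              # assumed name
  namespace: qiaofeng-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s            # the Prometheus service account from the log
    namespace: monitoring
```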
Useful blog references:
K8s 集群监控之Kube Prometheus(Prometheus Operator) - 简书
Prometheus监控k8s(12)-PrometheusOperator服务自动发现-监控redis样例 - 光阴8023 - 博客园