Prometheus Operator (Part 2): Monitoring Kubernetes Components
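By default, Prometheus Operator can already monitor the cluster, but it cannot monitor kube-controller-manager or kube-scheduler. Here we add monitoring for these two components and expose Prometheus and Grafana through traefik, so that they are reached via Ingress.
Organizing the files
First, sort the operator manifests into one directory per component.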
wget -P /root/ http://down.i4t.com/abcdocker-prometheus-operator.yaml.zip
cd /root/
unzip abcdocker-prometheus-operator.yaml.zip
mkdir kube-prom
cp -a kube-prometheus-master/manifests/* kube-prom/
cd kube-prom/
mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter operator
mv *-serviceMonitor* serviceMonitor/
mv setup operator/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
mv 0prometheus-operator-* operator/
mv 00namespace-namespace.yaml operator/
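After a run of glob-based mv commands it is easy to leave a manifest behind. The helper below is a minimal sketch (the function name list_unsorted is my own, not part of kube-prometheus) that lists any YAML file still sitting at the top level of the directory. An empty result means every manifest landed in a subdirectory.

```shell
# List manifest files still at the top level of a directory.
# Prints nothing when every file has been moved into a subdirectory.
list_unsorted() {
    dir="${1:-.}"
    find "$dir" -maxdepth 1 -type f \( -name '*.yaml' -o -name '*.yml' \) | sort
}

# Usage, from inside /root/kube-prom:
#   list_unsorted .
```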
## The installation order changes as well (skip this if everything is already installed)
[root@k8s-01 kube-prom]# kubectl apply -f operator/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
Once the operator Pod is running, the remaining manifests can be applied
[root@k8s-01 kube-prom]# kubectl -n monitoring get pod
NAME READY STATUS RESTARTS AGE
prometheus-operator-69bd579bf9-7kpd7 1/1 Running 0 7s
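Rather than polling kubectl get pod by hand, the wait can be scripted. This is a small sketch that assumes the Deployment is named prometheus-operator in the monitoring namespace, as in the apply output above. The wrapper name wait_for_operator and the 120s default are my own choices.

```shell
# Block until the prometheus-operator Deployment has rolled out,
# or fail once the timeout (default 120s) expires.
wait_for_operator() {
    timeout="${1:-120s}"
    kubectl -n monitoring rollout status deployment/prometheus-operator --timeout="$timeout"
}

# Usage:
#   wait_for_operator 60s && kubectl apply -f adapter/
```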
# Remaining steps
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/
When everything has been applied, check that all resources are healthy and the installation is done
[root@k8s-01 kube-prom]# kubectl get -n monitoring all
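Configuring Ingress
traefik has to be installed first. Exposing the services through NodePort performs poorly, so traefik is recommended.
Preparing the environment
All Services created by prometheus operator must be of type ClusterIP. If you have not changed them, ClusterIP is already the default.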
[root@k8s-01 ingress]# kubectl get pod,svc -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 88s
pod/alertmanager-main-1 2/2 Running 0 77s
pod/alertmanager-main-2 2/2 Running 0 69s
pod/grafana-558647b59-mj85j 1/1 Running 0 96s
pod/kube-state-metrics-5bfc7db74d-kpgh2 4/4 Running 0 96s
pod/node-exporter-5kz8x 2/2 Running 0 94s
pod/node-exporter-jnmr7 2/2 Running 0 94s
pod/node-exporter-pztln 2/2 Running 0 93s
pod/node-exporter-ts455 2/2 Running 0 94s
pod/prometheus-adapter-57c497c557-6tscz 1/1 Running 0 91s
pod/prometheus-k8s-0 3/3 Running 1 78s
pod/prometheus-k8s-1 3/3 Running 1 78s
pod/prometheus-operator-69bd579bf9-rrf96 1/1 Running 1 98s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.254.201.109 9093/TCP 99s
service/alertmanager-operated ClusterIP None 9093/TCP,6783/TCP 89s
service/grafana ClusterIP 10.254.19.174 3000/TCP 97s
service/kube-state-metrics ClusterIP None 8443/TCP,9443/TCP 96s
service/node-exporter ClusterIP None 9100/TCP 95s
service/prometheus-adapter ClusterIP 10.254.197.151 443/TCP 93s
service/prometheus-k8s ClusterIP 10.254.120.188 9090/TCP 89s
service/prometheus-operated ClusterIP None 9090/TCP 78s
service/prometheus-operator ClusterIP None 8080/TCP 99s
Next, create Ingress resources for the Prometheus UI, Grafana, and Alertmanager
(they can also be split into separate files instead of one)
vim ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.i4t.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ing
  namespace: monitoring
spec:
  rules:
  - host: grafana.i4t.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager-ing
  namespace: monitoring
spec:
  rules:
  - host: alertmanager.i4t.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093
## host is the domain name; serviceName and servicePort are the name and port of the target Service
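The extensions/v1beta1 Ingress API used here was removed in Kubernetes 1.22. On a newer cluster the same rule would need to be written against networking.k8s.io/v1. The sketch below prints such a manifest; the function name ingress_v1 is hypothetical, and the choice of path / with pathType Prefix is my assumption about the intended routing. Depending on how traefik is installed, an ingressClassName may also be required, which is left out here.

```shell
# Print a networking.k8s.io/v1 Ingress for one host/Service pair
# in the monitoring namespace. Arguments: name host service port.
ingress_v1() {
    name="$1"; host="$2"; svc="$3"; port="$4"
    cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${name}
  namespace: monitoring
spec:
  rules:
  - host: ${host}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ${svc}
            port:
              number: ${port}
EOF
}

# Usage:
#   ingress_v1 prometheus-ing prometheus.i4t.com prometheus-k8s 9090 | kubectl apply -f -
```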
[root@k8s-01 ingress]# kubectl apply -f ingress.yaml
ingress.extensions/prometheus-operator created
[root@k8s-01 ingress]# kubectl get ingress -n monitoring
NAME HOSTS ADDRESS PORTS AGE
alertmanager-ing alertmanager.i4t.com 80 13s
grafana-ing grafana.i4t.com 80 13s
prometheus-ing prometheus.i4t.com 80 13s
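Before touching DNS or the hosts file, the routing can be checked by sending the Host header directly to the traefik entry point. A minimal sketch: the function name check_ingress is my own, and the address in the usage line assumes traefik is reachable on a node IP through the NodePort shown later for traefik-ingress-service, which may differ in your cluster.

```shell
# Request "/" through the ingress controller with an explicit Host header
# and print only the HTTP status code. Arguments: host address[:port].
check_ingress() {
    host="$1"; addr="$2"
    curl -s -o /dev/null -w '%{http_code}\n' -H "Host: ${host}" "http://${addr}/"
}

# Usage (node IP and NodePort are assumptions for this cluster):
#   check_ingress prometheus.i4t.com 192.168.0.10:23633
```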
The new routes also show up in the traefik UI
Next, set up name resolution (here I demonstrate by editing the hosts file)
#mac
➜ ~ sudo vim /etc/hosts
Password:
#windows
C:\Windows\System32\drivers\etc
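The hosts entry itself is a single line that maps the three hostnames to an address where traefik is reachable. A small sketch, with a hypothetical helper name hosts_line; the IP you pass in is whatever node or load balancer fronts traefik in your setup.

```shell
# Print one hosts-file line mapping the three Ingress hostnames to an IP.
hosts_line() {
    ip="$1"
    printf '%s prometheus.i4t.com grafana.i4t.com alertmanager.i4t.com\n' "$ip"
}

# Usage on macOS or Linux (the IP is an assumption):
#   hosts_line 192.168.0.10 | sudo tee -a /etc/hosts
```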
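Monitoring Kubernetes components
At this point prometheus operator is not monitoring kube-controller-manager or kube-scheduler.
Because my cluster was installed from binaries, no information about them is collected.
The reason is that a ServiceMonitor picks its Services by label. The corresponding ServiceMonitors select within the kube-system namespace: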
[root@k8s-01 manifests]# grep -2 selector prometheus-serviceMonitorKube*
prometheus-serviceMonitorKubeControllerManager.yaml- matchNames:
prometheus-serviceMonitorKubeControllerManager.yaml- - kube-system
prometheus-serviceMonitorKubeControllerManager.yaml: selector:
prometheus-serviceMonitorKubeControllerManager.yaml- matchLabels:
prometheus-serviceMonitorKubeControllerManager.yaml- k8s-app: kube-controller-manager
--
prometheus-serviceMonitorKubelet.yaml- matchNames:
prometheus-serviceMonitorKubelet.yaml- - kube-system
prometheus-serviceMonitorKubelet.yaml: selector:
prometheus-serviceMonitorKubelet.yaml- matchLabels:
prometheus-serviceMonitorKubelet.yaml- k8s-app: kubelet
--
prometheus-serviceMonitorKubeScheduler.yaml- matchNames:
prometheus-serviceMonitorKubeScheduler.yaml- - kube-system
prometheus-serviceMonitorKubeScheduler.yaml: selector:
prometheus-serviceMonitorKubeScheduler.yaml- matchLabels:
prometheus-serviceMonitorKubeScheduler.yaml- k8s-app: kube-scheduler
By default, however, kube-system has no Service carrying a matching label
[root@k8s-01 manifests]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.254.0.2 53/UDP,53/TCP,9153/TCP 31d
kubelet ClusterIP None 10250/TCP 2d8h
kubernetes-dashboard NodePort 10.254.194.101 80:30000/TCP 31d
traefik-ingress-service NodePort 10.254.160.25 80:23633/TCP,8080:15301/TCP 38m
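The same check can be made directly with a label selector, which mirrors what the ServiceMonitor does. A minimal sketch; the wrapper name svc_for_label is my own. An empty result for kube-scheduler or kube-controller-manager confirms that there is nothing for the ServiceMonitor to select.

```shell
# List Services in kube-system whose k8s-app label equals the argument.
svc_for_label() {
    kubectl -n kube-system get svc -l "k8s-app=$1" -o name
}

# Usage:
#   svc_for_label kube-scheduler
#   svc_for_label kube-controller-manager
```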
The Endpoints objects do exist, though (at least in my binary installation)
[root@k8s-01 manifests]# kubectl get ep -n kube-system
NAME ENDPOINTS AGE
kube-controller-manager 31d
kube-dns 172.30.248.2:53,172.30.72.4:53,172.30.248.2:53 + 3 more... 31d
kube-scheduler 31d
kubelet 192.168.0.10:10255,192.168.0.11:10255,192.168.0.12:10255 + 9 more... 2d8h
kubernetes-dashboard 172.30.232.2:8443 31d
traefik-ingress-service 172.30.232.5:80,172.30.232.5:8080 39m
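Solution
Create one Service for each of the two control-plane components and label it k8s-app: kube-controller-manager or k8s-app: kube-scheduler respectively, so that the ServiceMonitors can select it.
Create the Services to bind to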
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
Fill in the Endpoints for each Service by hand. The Endpoints name, namespace, and port name must match those of the Service
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
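With a different number of masters, writing the address list by hand becomes repetitive. The following is only a sketch of how an Endpoints manifest of the same shape could be generated from a list of master IPs; gen_endpoints is a hypothetical helper, not something shipped with kube-prometheus.

```shell
# Print an Endpoints manifest in kube-system for a control-plane component.
# Arguments: component-name port ip [ip ...]
gen_endpoints() {
    name="$1"; port="$2"; shift 2
    cat <<EOF
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: ${name}
  name: ${name}
  namespace: kube-system
subsets:
- addresses:
EOF
    # One address entry per master IP passed on the command line.
    for ip in "$@"; do
        printf '  - ip: %s\n' "$ip"
    done
    cat <<EOF
  ports:
  - name: http-metrics
    port: ${port}
    protocol: TCP
EOF
}

# Usage:
#   gen_endpoints kube-scheduler 10251 192.168.0.10 192.168.0.11 192.168.0.12 | kubectl apply -f -
```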
Check the Services: they are now bound to the Endpoints we created
[root@k8s-01 test]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-controller-manager ClusterIP None 10252/TCP 64s
kube-dns ClusterIP 10.254.0.2 53/UDP,53/TCP,9153/TCP 31d
kube-scheduler ClusterIP None 10251/TCP 64s
kubelet ClusterIP None 10250/TCP 2d9h
kubernetes-dashboard NodePort 10.254.194.101 80:30000/TCP 31d
traefik-ingress-service NodePort 10.254.160.25 80:23633/TCP,8080:15301/TCP 126m
[root@k8s-01 test]# kubectl describe svc -n kube-system kube-scheduler
Name: kube-scheduler
Namespace: kube-system
Labels: k8s-app=kube-scheduler
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kube-scheduler"},"name":"kube-scheduler","namespace"...
Selector: component=kube-scheduler
Type: ClusterIP
IP: None
Port: http-metrics 10251/TCP
TargetPort: 10251/TCP
Endpoints: 192.168.0.10:10251,192.168.0.11:10251,192.168.0.12:10251
Session Affinity: None
Events:
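I have only 3 masters, so there are only 3 kube-scheduler and 3 kube-controller-manager instances
For kubeadm clusters the approach below can serve as a reference. I have no such environment, so I do not demonstrate it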
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.0.14
    targetRef:
      kind: Node
      name: k8s-n2
  - ip: 172.16.0.18
    targetRef:
      kind: Node
      name: k8s-n3
  - ip: 172.16.0.2
    targetRef:
      kind: Node
      name: k8s-m1
  - ip: 172.16.0.20
    targetRef:
      kind: Node
      name: k8s-n4
  - ip: 172.16.0.21
    targetRef:
      kind: Node
      name: k8s-n5
  ports:
  - name: http-metrics
    port: 10255
    protocol: TCP
  - name: cadvisor
    port: 4194
    protocol: TCP
  - name: https-metrics
    port: 10250
    protocol: TCP
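If the target reports ip:10251 Connection refused after the monitoring has been added:
- Binary installation
Edit the scheduler configuration and add --bind-address=0.0.0.0 to its startup file, so that the metrics port listens on more than the loopback address
- kubeadm installation
The flag has to be added by editing the Pod definition. I am not familiar with kubeadm, so I will not go into detail here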
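To tell a bind-address problem apart from a Prometheus configuration problem, the metrics endpoint can be probed directly from another node. A minimal sketch, assuming the plain-HTTP ports 10251 and 10252 used in this article; probe_metrics is a name of my own choosing.

```shell
# Fetch the first few lines of a component's /metrics endpoint.
# "Connection refused" here points at the component's bind address
# rather than at the ServiceMonitor or Endpoints.
probe_metrics() {
    ip="$1"; port="$2"
    curl -s --max-time 5 "http://${ip}:${port}/metrics" | head -n 5
}

# Usage:
#   probe_metrics 192.168.0.10 10251   # kube-scheduler
#   probe_metrics 192.168.0.10 10252   # kube-controller-manager
# On the master itself, "ss -lnt | grep 10251" shows which address the port is bound to.
```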