Create the ServiceAccount (run on the master node of the k8s cluster)


# Create a ServiceAccount

Grant the ServiceAccount the permissions Prometheus needs to query the Kubernetes cluster and collect information from the other nodes.

[root@master ~]# kubectl create serviceaccount monitor -n monitor
serviceaccount/monitor created

 # Bind the ServiceAccount monitor to a ClusterRole via a ClusterRoleBinding

[root@master prometheus]# kubectl create clusterrolebinding monitor-clusterrolebinding -n monitor --clusterrole=cluster-admin --serviceaccount=monitor:monitor
clusterrolebinding.rbac.authorization.k8s.io/monitor-clusterrolebinding created

This ClusterRole (cluster-admin) carries administrator privileges, so the ServiceAccount can now access every resource in the cluster.

 

Tip: Relabeler is an online playground for testing Prometheus relabeling rules.

Kubernetes service discovery with the node role: node_exporter and cAdvisor


      scrape_interval: 15s  # metric scrape interval
      scrape_timeout: 10s  # scrape timeout, default 10s
      evaluation_interval: 1m   # rule evaluation interval

scrape_configs: defines the data sources, called targets; each target is named by its job_name. Targets can be configured statically or via service discovery.

Kubernetes service discovery supports many roles; with the node role, the HTTP port exposed by the kubelet is used to discover every node in the cluster.

Relabeling: the targets discovered by default need to be relabeled, because the metrics come from node_exporter; the kubelet's default port 10250 must be rewritten to 9100 before node-level data can be scraped.

      relabel_configs:
# relabeling
      - source_labels: [__address__] # original label to match: the target address
        regex: '(.*):10250'   # match addresses of the form ip:10250
        replacement: '${1}:9100'  # keep the captured ip (${1}) and replace the port with 9100
        target_label: __address__ # label that receives the rewritten address
        action: replace
[root@master ~]# netstat -tpln | grep 10250
tcp6       0      0 :::10250                :::*                    LISTEN      482/kubelet 
[root@master prometheus]# netstat -tpln | grep 9100
tcp6       0      0 :::9100                 :::*                    LISTEN      22132/node_exporter 
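The port rewrite above can be sketched in Python as a rough analogy (not Prometheus's actual implementation; the example address is hypothetical, and note that Python's `re` writes a capture group as `\1` where Prometheus writes `${1}`):

```python
import re

def relabel_address(address: str) -> str:
    """Mimic the relabel rule: rewrite port 10250 to 9100.
    Prometheus fully anchors the regex, hence fullmatch."""
    m = re.fullmatch(r"(.*):10250", address)
    if m is None:
        return address  # no match: the replace action leaves the address unchanged
    return f"{m.group(1)}:9100"

print(relabel_address("192.168.40.180:10250"))  # -> 192.168.40.180:9100
print(relabel_address("192.168.40.180:9100"))   # unchanged
```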

  labelmap # label names matching the regex below are copied onto the target, renamed to the capture group

    scrape_configs:
# scrape_configs defines the data sources (targets); each target is named by its job_name, configured either statically or via service discovery
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
# use Kubernetes service discovery
      - role: node
# the node role uses the default HTTP port exposed by the kubelet to discover every node in the cluster
      relabel_configs:
# relabeling
      - source_labels: [__address__] # original label to match: the target address
        regex: '(.*):10250'   # match addresses of the form ip:10250
        replacement: '${1}:9100'  # keep the captured ip (${1}) and replace the port with 9100
        target_label: __address__ # label that receives the rewritten address
        action: replace
      - action: labelmap # label names matching the regex below are copied onto the target
        regex: __meta_kubernetes_node_label_(.+)
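The labelmap action can be sketched in Python as well (a rough analogy under the assumption of a plain dict of labels; the sample node label is hypothetical):

```python
import re

def labelmap(labels: dict, pattern: str = r"__meta_kubernetes_node_label_(.+)") -> dict:
    """Mimic the labelmap action: every label whose NAME matches the
    regex is copied to a new label named by the capture group."""
    out = dict(labels)
    for name, value in labels.items():
        m = re.fullmatch(pattern, name)
        if m:
            out[m.group(1)] = value
    return out

meta = {
    "__meta_kubernetes_node_label_kubernetes_io_hostname": "node1",
    "__address__": "192.168.40.181:9100",
}
print(labelmap(meta)["kubernetes_io_hostname"])  # -> node1
```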

node

The node role discovers one target per cluster node, with the address defaulting to the kubelet's HTTP port. The target address defaults to the first existing address of the Kubernetes node object, in the address-type order NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, NodeHostName.

Available meta labels:

  • __meta_kubernetes_node_name: the name of the node object.
  • __meta_kubernetes_node_label_<labelname>: each label of the node object.
  • __meta_kubernetes_node_labelpresent_<labelname>: true for each label of the node object.
  • __meta_kubernetes_node_annotation_<annotationname>: each annotation of the node object.
  • __meta_kubernetes_node_annotationpresent_<annotationname>: true for each annotation of the node object.
  • __meta_kubernetes_node_address_<address_type>: the first address for each node address type, if it exists.

In addition, the node's instance label is set to the node name as retrieved from the API server.

    - job_name: 'kubernetes-node-cadvisor'
# scrape cAdvisor data: container resource usage is read from the kubelet's /metrics/cadvisor endpoint
      kubernetes_sd_configs:
      - role:  node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
[root@master ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1   <none>        443/TCP   45h

 

With the node-role service discovery above, the final scrape URL is scheme + __address__ + __metrics_path__:

  • node_exporter: ${1}:9100 + /metrics

  • cadvisor: kubernetes.default.svc:443 + /api/v1/nodes/${1}/proxy/metrics/cadvisor
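How those pieces combine can be shown with a tiny sketch (the node name `node1` and example IP are hypothetical):

```python
def scrape_url(scheme: str, address: str, metrics_path: str) -> str:
    """Compose the final scrape URL from a target's labels."""
    return f"{scheme}://{address}{metrics_path}"

# node_exporter target after the port relabeling
print(scrape_url("http", "192.168.40.180:9100", "/metrics"))
# cAdvisor target proxied through the apiserver
print(scrape_url("https", "kubernetes.default.svc:443",
                 "/api/v1/nodes/node1/proxy/metrics/cadvisor"))
```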

Kubernetes service discovery with the endpoints role: the apiserver


Different service discovery roles expose different source labels.

Here Kubernetes service discovery is used with the endpoints role.

    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        # namespace of the endpoint object; name of the endpoint's service; port name of the endpoint
        action: keep
        regex: default;kubernetes;https

# The regex keeps only endpoints in the default namespace whose service name is kubernetes and whose port name is https.
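The keep action can be sketched in Python (a rough analogy: Prometheus joins the source label values with `;` and keeps the target only on a full regex match):

```python
import re

def keep_target(source_values, regex):
    """Mimic the keep action: join the source label values with ';'
    and keep the target only when the regex matches the whole string."""
    joined = ";".join(source_values)
    return re.fullmatch(regex, joined) is not None

# the apiserver endpoint in the default namespace is kept
print(keep_target(["default", "kubernetes", "https"],
                  r"default;kubernetes;https"))        # True  -> kept
print(keep_target(["kube-system", "coredns", "metrics"],
                  r"default;kubernetes;https"))        # False -> dropped
```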


    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
# Only scrape endpoints whose Service carries the annotation "prometheus.io/scrape: true"; that is, if a Service declares prometheus.io/scrape = true, it is scraped. An annotation is itself a key/value pair, so the source label is set to the key and the regex to the value true; when the value matches, the keep action retains the target and everything else is dropped.



      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
# Reset the scheme: match the source label __meta_kubernetes_service_annotation_prometheus_io_scheme (the prometheus.io/scheme annotation); if its value matches the regex, write that value into __scheme__.
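A sketch of the replace action with its default replacement `$1` (a rough analogy over a plain dict of labels, not Prometheus's implementation):

```python
import re

def replace_label(labels, source, regex, target):
    """Mimic the replace action with the default replacement '$1':
    when the source value fully matches the regex, the first capture
    group is written into the target label."""
    out = dict(labels)
    m = re.fullmatch(regex, out.get(source, ""))
    if m:
        out[target] = m.group(1)
    return out

labels = {"__meta_kubernetes_service_annotation_prometheus_io_scheme": "https"}
out = replace_label(labels,
                    "__meta_kubernetes_service_annotation_prometheus_io_scheme",
                    r"(https?)", "__scheme__")
print(out["__scheme__"])  # -> https
```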



      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

(Some applications expose their metrics on a custom endpoint rather than the default /metrics; if so, that other path can be declared here.)

# For custom metrics endpoints: if your application does not expose its metrics at /metrics, you can add an annotation such as "prometheus.io/path = /mymetrics" on the Pod's Service. The rule above copies that declared path into __metrics_path__, which is how Prometheus learns the actual path of your application's metrics. The label name must match the annotation key agreed on in the Service: if the Service declares prometheus.io/app-metrics-path: '/metrics', the source label here must be written as __meta_kubernetes_service_annotation_prometheus_io_app_metrics_path.



      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
# Expose the application's custom port: concatenate the host part of the address with the "prometheus.io/port = <port>" declared in the Service and assign the result to __address__. Prometheus then combines this port with __metrics_path__ to fetch the metrics; if __metrics_path__ is not the default /metrics, the path relabeling above supplies the real path.
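The address/port splice can be sketched in Python (a rough analogy; the CoreDNS IP and port are taken from the example output later in this post):

```python
import re

def rewrite_address(address: str, annotated_port: str) -> str:
    """Mimic the __address__ rewrite: the two source values are joined
    with ';', the regex strips any existing port from the host, and
    host + annotated port become the new address."""
    joined = f"{address};{annotated_port}"
    m = re.fullmatch(r"([^:]+)(?::\d+)?;(\d+)", joined)
    if m is None:
        return address
    return f"{m.group(1)}:{m.group(2)}"

print(rewrite_address("10.233.0.3:53", "9153"))  # -> 10.233.0.3:9153
print(rewrite_address("10.233.0.3", "9153"))     # -> 10.233.0.3:9153
```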



      - action: labelmap  # carry the Service's own labels over to the target as extra information
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name 

[root@master ~]# kubectl get pod -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS   AGE   IP              NODE     NOMINATED NODE   READINESS GATES
coredns-867b49865c-f6qbh                   1/1     Running   2          45h   10.233.96.13    node2    <none>           <none>
coredns-867b49865c-m9hx4                   1/1     Running   2          45h   10.233.90.9     node1    <none>           <none>
[root@master ~]# kubectl get svc  -n kube-system 
NAME      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
coredns   ClusterIP   10.233.0.3   <none>        53/UDP,53/TCP,9153/TCP   45h
[root@master ~]# kubectl get ep  -n kube-system 
NAME                      ENDPOINTS                                                   AGE
coredns                   10.233.90.9:53,10.233.96.13:53,10.233.90.9:53 + 3 more...   45h
[root@master ~]# kubectl get svc -n kube-system
NAME                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
coredns                               ClusterIP   10.233.0.3      <none>        53/UDP,53/TCP,9153/TCP   52d
kube-scheduler-prometheus-discovery   ClusterIP   None            <none>        10259/TCP                36d
kube-state-metrics                    ClusterIP   10.233.15.152   <none>        8080/TCP                 47d

[root@master ~]# kubectl get svc coredns -o yaml  -n kube-system
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"prometheus.io/port":"9153","prometheus.io/scrape":"true"},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","k8s-app":"kube-dns","kubernetes.io/cluster-service":"true","kubernetes.io/name":"coredns"},"name":"coredns","namespace":"kube-system"},"spec":{"clusterIP":"10.233.0.3","ports":[{"name":"dns","port":53,"protocol":"UDP"},{"name":"dns-tcp","port":53,"protocol":"TCP"},{"name":"metrics","port":9153,"protocol":"TCP"}],"selector":{"k8s-app":"kube-dns"}}}
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"


[root@master prometheus]# curl  10.233.90.9:9153/metrics | head -n 10
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.14.1",revision="1766568",version="1.6.9"} 1
# HELP coredns_cache_hits_total The count of cache hits.
# TYPE coredns_cache_hits_total counter
coredns_cache_hits_total{server="dns://:53",type="denial"} 1
# HELP coredns_cache_misses_total The count of cache misses.
# TYPE coredns_cache_misses_total counter
coredns_cache_misses_total{server="dns://:53"} 6
# HELP coredns_cache_size The number of elements in the cache.
100 12115    0 12115    0     0  4249k      0 --:--:-- --:--:-- --:--:-- 5915k
curl: (23) Failed writing body (123 != 2048)

As you can see, the endpoints role of service discovery can also scrape the metrics exposed by CoreDNS.

Complete Prometheus YAML files


Prometheus configuration file

[root@master prometheus]# cat prometheus-cfg.yaml 
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role:  node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name 

Prometheus Deployment file

[root@master prometheus]# cat prometheus-deploy.yaml 
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.2.1
        imagePullPolicy: IfNotPresent
        command:
          - prometheus
          - --config.file=/etc/prometheus/prometheus.yml
          - --storage.tsdb.path=/prometheus
          - --storage.tsdb.retention=720h
          - --web.enable-lifecycle
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-config
            items:
              - key: prometheus.yml
                path: prometheus.yml
                mode: 0644
        - name: prometheus-storage-volume
          persistentVolumeClaim:
            claimName: prometheus
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus 
  namespace: monitor
spec:
  storageClassName: "managed-nfs-storage"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

The Prometheus server itself is excluded from scraping:

        prometheus.io/scrape: 'false'
