Deploying Alertmanager

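I deployed Prometheus and its surrounding components with Helm. Everything else deployed normally, but once the Alertmanager chart was deployed, the Prometheus server would no longer start.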
The error reported is: field alertmanagers not found in type config.ScrapeConfig

# kubectl logs prometheus-prometheus-server-59b8b67dfc-c6cbl -c prometheus-server
level=error ts=2023-01-11T02:32:51.178Z caller=main.go:290 msg="Error loading config (--config.file=/etc/config/prometheus.yml)" err="parsing YAML file /etc/config/prometheus.yml: yaml: unmarshal errors:\n  line 194: cannot unmarshal !!map into string\n  line 195: field alertmanagers not found in type config.ScrapeConfig"
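
The message suggests that an `alertmanagers` key ended up inside one of the `scrape_configs` entries instead of under a top-level `alerting` block. The rendered prometheus.yml is not shown here, so the following is only an illustrative sketch of the two shapes. The job name and the scrape target are hypothetical.

```yaml
# Illustrative only: example-job and localhost:9090 are hypothetical.

# Shape that produces this kind of error: alertmanagers nested in a scrape job.
# scrape_configs:
#   - job_name: example-job
#     alertmanagers:              # not a field of a scrape config
#       - static_configs:
#           - targets: ['prometheus-alertmanager-service:9093']

# Shape Prometheus 2.x expects: alerting is a top-level key,
# a sibling of scrape_configs.
scrape_configs:
  - job_name: example-job
    static_configs:
      - targets: ['localhost:9090']
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['prometheus-alertmanager-service:9093']
```

Comparing lines 194 and 195 of the rendered /etc/config/prometheus.yml against this layout should show where the indentation went wrong.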

So I decided not to release Alertmanager through Helm and tried deploying the Alertmanager pod on its own instead.
The manifest below is based on alertmanager-deployment.yaml from the Helm chart's templates.

  • prometheus-alertmanager-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alertmanager
  labels:
    k8s-app: alertmanager
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: alertmanager
  template:
    metadata:
      labels:
        k8s-app: alertmanager
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      priorityClassName: system-cluster-critical
      containers:
        - name: prometheus-alertmanager
          image: "k8s-docker-registry-node:5000/alertmanager:v0.13.0"
          imagePullPolicy: "IfNotPresent"
          args:
            - --config.file=/etc/config/alertmanager.yml
            - --storage.path=/data
            - --web.external-url=/
          ports:
            - containerPort: 9093
          readinessProbe:
            httpGet:
              path: /#/status
              port: 9093
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
            - name: storage-volume
              mountPath: "/data"
          resources:
            limits:
              cpu: 10m
              memory: 50Mi
            requests:
              cpu: 10m
              memory: 50Mi
        - name: prometheus-alertmanager-configmap-reload
          image: "k8s-docker-registry-node:5000/configmap-reload:v0.1"
          imagePullPolicy: "IfNotPresent"
          args:
            - --volume-dir=/etc/config
            - --webhook-url=http://localhost:9093/-/reload
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
              readOnly: true
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
            requests:
              cpu: 10m
              memory: 10Mi
      volumes:
        - name: config-volume
          configMap:
            name: alertmanager-configmap
        - name: storage-volume
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-alertmanager-service
  labels:
    app: prometheus-alertmanager-service
spec:
  ports:
    - port: 9093
      nodePort: 9093   # outside the default 30000-32767 range; kube-apiserver needs a matching --service-node-port-range
      targetPort: 9093
      name: prometheus-alertmanager-port
  selector:
    k8s-app: alertmanager
  type: NodePort
  • alertmanager-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-configmap
  labels:
    app: alertmanager-configmap
data:
  alertmanager.yml: |-
    global:
      resolve_timeout: 1m
    receivers:
    - name: default-receiver
      webhook_configs:
      # Kafka webhook configuration
      - url: 'http://alertmanager-kafka-forwarder-service:9792/alert'
        send_resolved: true
    route:
      group_wait: 10s
      group_interval: 5m
      receiver: default-receiver
      repeat_interval: 3h
  • rules-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rules-configmap
  labels:
    app: rules-configmap
data:
  node_rule.yml: |-
    groups:
    - name: node-rules
      rules:
      - alert: node-up
        expr: up == 0
        for: 15s
        labels:
          severity: "1"
          team: node
        annotations:
          summary: Summary
          description: description
      - alert: NodeMemoryUsage
        expr: 100 - (node_memory_MemFree + node_memory_Cached + node_memory_Buffers) / node_memory_MemTotal * 100 > 60
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }} memory usage is too high"
          description: "{{ $labels.instance }} memory usage is above 60% (current value: {{ $value }})"
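
The NodeMemoryUsage expression above uses the metric names exported by older node_exporter releases. From node_exporter 0.16 onward these memory metrics carry a `_bytes` suffix, so on a newer exporter the rule would need something like the sketch below. Which exporter version runs in this cluster is not stated, so treat this as an alternative rather than a required change.

```yaml
groups:
  - name: node-rules
    rules:
      - alert: NodeMemoryUsage
        # Same calculation with the *_bytes names used by node_exporter 0.16+.
        expr: 100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100 > 60
        for: 1m
        labels:
          severity: warning
```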

Configuring Alertmanager in prometheus.yml

prometheus.yml:
    ...
    rule_files:
      - "/prometheus/rules/node_rule.yml"

    alerting:   # Alertmanager connection settings
      alertmanagers:
      - static_configs:
        - targets: ['prometheus-alertmanager-service:9093']
    ...
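
Because rule_files points at /prometheus/rules/node_rule.yml, the rules-configmap ConfigMap has to be mounted into the Prometheus server pod at that directory. How the server Deployment is defined is not shown here, so the fragment below is a sketch under that assumption. The volume name rules-volume is hypothetical, and the container name is taken from the kubectl logs command earlier.

```yaml
# Fragment of the Prometheus server pod spec; rules-volume is a hypothetical name.
spec:
  containers:
    - name: prometheus-server
      volumeMounts:
        - name: rules-volume
          mountPath: /prometheus/rules   # directory referenced by rule_files
          readOnly: true
  volumes:
    - name: rules-volume
      configMap:
        name: rules-configmap            # provides node_rule.yml
```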


Writing Alertmanager alerts to Kafka

This uses the open-source project https://github.com/insani4c/alertmanager-kafka-forwarder

apiVersion: v1
kind: ReplicationController
metadata:
  name: alertmanager-kafka-forwarder-deployment
  labels:
    name: alertmanager-kafka-forwarder-deployment
spec:
  replicas: 1
  selector:
    name: alertmanager-kafka-forwarder-deployment
  template:
    metadata:
      labels:
        name: alertmanager-kafka-forwarder-deployment
    spec:
      containers:
      - name: alertmanager-kafka-forwarder-deployment
        image: k8s-docker-registry-node:5000/alertmanager-kafka-forwarder:main
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9792
        env:
        - name: "TZ"
          value: "Asia/Shanghai"
        - name: "BOOTSTRAP_SERVERS"
          value: "kafka-headless:9092"
        - name: "FLASK_SECRET_KEY"
          value: "123456"
        - name: "KAFKA_TOPIC"
          value: "alertmanager-events"
        #command: ["/bin/bash", "-c", " sleep infinity"]
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: alertmanager-kafka-forwarder-service
  name: alertmanager-kafka-forwarder-service
spec:
  type: NodePort
  ports:
  - port: 9792
    targetPort: 9792
    nodePort: 9792   # outside the default 30000-32767 range; kube-apiserver needs a matching --service-node-port-range
    name: alertmanager-kafka-forwarder-port
  selector:
    name: alertmanager-kafka-forwarder-deployment
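
ReplicationController is a legacy workload type, and the Kubernetes documentation recommends a Deployment instead. A minimal sketch of the same forwarder as an apps/v1 Deployment might look like the following. It keeps the same names, labels, image, and environment values as above so that the existing Service selector still matches. The only structural difference is that apps/v1 requires selector.matchLabels.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alertmanager-kafka-forwarder-deployment
  labels:
    name: alertmanager-kafka-forwarder-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      # Must match the pod template labels below.
      name: alertmanager-kafka-forwarder-deployment
  template:
    metadata:
      labels:
        name: alertmanager-kafka-forwarder-deployment
    spec:
      containers:
        - name: alertmanager-kafka-forwarder-deployment
          image: k8s-docker-registry-node:5000/alertmanager-kafka-forwarder:main
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 9792
          env:
            - name: "TZ"
              value: "Asia/Shanghai"
            - name: "BOOTSTRAP_SERVERS"
              value: "kafka-headless:9092"
            - name: "FLASK_SECRET_KEY"
              value: "123456"
            - name: "KAFKA_TOPIC"
              value: "alertmanager-events"
```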
