ES8生产实践——Fleet部署与常见日志采集(ECK方式)
本篇为ELK Stack生产实践系列专题第十五篇,本篇主要内容是介绍如何在k8s环境中通过eck方式部署fleet server和elastic agent,以及通过配置集成策略完成对Linux、kubernetes、elasticsearch服务的日志与指标的采集与查询展示。
Fleet简介
Fleet概述
为了更加方便的实现系统和应用程序日志接入ES,官方推出了Elastic Agent采集方案。相较于之前使用beats采集数据,Elastic Agent可以实现通过更少的配置和安装来简化数据采集配置,仅需一个Elastic Agent代理即可实现日志、指标、APM 跟踪信息的采集,通过Fleet可以轻松的管理整个Elastic Agent队列。Kibana为我们内置了大多数场景下日志的采集与可视化分析配置,我们仅需要在kibanaUI中点击操作便可完成复杂的日志采集。
参考文档
更多fleet详细内容介绍,可参考文档:https://www.cuiliangblog.cn/detail/section/133432981
使用rpm方式部署fleet请参考文档:https://www.cuiliangblog.cn/detail/article/61
fleet方式采集日志架构
我们只需要在k8s集群中部署一个fleet server服务,在每个k8s节点使用DaemonSet方式部署elastic agent,并将k8s节点的/var/log目录挂载到elastic agent容器即可。后续日志与指标采集均可在kibana图形界面中配置与管理,大大简化了应用程序日志接入ES的工作。k8s中使用fleet采集日志架构如下所示:
环境说明
由于演示环境资源有限,原本master1和master2节点运行各运行两个pod后,再运行elastic agent时内存资源紧张,因此改为单pod方式演示,现集群节点与角色如下:
主机名 | IP | 主机配置 | k8s用途 | ELK节点角色 |
---|---|---|---|---|
master1 | 192.168.10.151 | 2C8G | control-plane | hot1、Elastic Agent |
master2 | 192.168.10.152 | 2C8G | control-plane | hot2、Elastic Agent |
master3 | 192.168.10.153 | 2C8G | control-plane | warm1、Elastic Agent |
work1 | 192.168.10.154 | 2C8G | work | warm2、master1、Elastic Agent |
work2 | 192.168.10.155 | 2C8G | work | cold、master2、Elastic Agent |
work3 | 192.168.10.156 | 2C8G | work | master3、Elastic Agent |
kibana和fleet-server为无状态服务,由kube-scheduler自动调度至合适的节点运行。
资源清单介绍
rbac
此处创建了两个ServiceAccount,分别是fleet-server和elastic-agent,并创建相应权限的ClusterRole与其绑定,保障可以有足够的权限采集k8s相关指标信息。
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fleet-server
rules:
- apiGroups: [""]
resources: ["*"]
verbs:
- get
- watch
- list
- apiGroups: ["coordination.k8s.io"]
resources:
- leases
verbs:
- get
- create
- update
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: fleet-server
namespace: elk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: fleet-server
subjects:
- kind: ServiceAccount
name: fleet-server
namespace: elk
roleRef:
kind: ClusterRole
name: fleet-server
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: elastic-agent
rules:
- apiGroups: [""]
resources: ["*"]
verbs:
- get
- watch
- list
- apiGroups: ["coordination.k8s.io"]
resources:
- leases
verbs:
- get
- create
- update
- nonResourceURLs:
- "/metrics"
verbs:
- get
- apiGroups: ["extensions"]
resources:
- replicasets
verbs:
- "get"
- "list"
- "watch"
- apiGroups:
- "apps"
resources:
- statefulsets
- deployments
- replicasets
verbs:
- "get"
- "list"
- "watch"
- apiGroups:
- ""
resources:
- nodes/stats
verbs:
- get
- apiGroups:
- "batch"
resources:
- jobs
verbs:
- "get"
- "list"
- "watch"
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: elastic-agent
namespace: elk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: elastic-agent
subjects:
- kind: ServiceAccount
name: elastic-agent
namespace: elk
roleRef:
kind: ClusterRole
name: elastic-agent
apiGroup: rbac.authorization.k8s.io
fleet-server
fleet-server使用eck的CRD资源部署,关联kibana和elasticsearch资源。指定serviceAccount为前面创建的fleet-server,并将sa的token资源挂载到 Pod 中,用于连接验证k8s资源使用。
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
name: fleet-server
namespace: elk
spec:
version: 8.9.1
image: harbor.local.com/elk/elastic-agent:8.9.1
kibanaRef:
name: kibana
elasticsearchRefs:
- name: elasticsearch
mode: fleet
fleetServerEnabled: true
policyID: eck-fleet-server
deployment:
replicas: 1
podTemplate:
spec:
serviceAccountName: fleet-server
automountServiceAccountToken: true
securityContext:
runAsUser: 0
elastic-agent
elastic-agent使用eck的CRD资源部署,并关联kibana和elasticsearch资源。指定serviceAccount为前面创建的elastic-agent,并将sa的token资源挂载到 Pod 中,用于连接验证k8s资源使用。并挂载宿主机的/var/log目录用于采集日志,挂载es的ca.crt资源用户连接ES服务验证,网络模式使用hostNetwork以便于访问kube-proxy服务,以及fleet管理菜单能正确显示主机名。
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
name: elastic-agent
namespace: elk
spec:
version: 8.9.1
image: harbor.local.com/elk/elastic-agent:8.9.1
kibanaRef:
name: kibana
fleetServerRef:
name: fleet-server
mode: fleet
policyID: eck-agent
daemonSet:
podTemplate:
spec:
serviceAccountName: elastic-agent
automountServiceAccountToken: true
securityContext:
runAsUser: 0
containers:
- name: agent
volumeMounts:
- mountPath: /var/log
name: log-dir
- mountPath: /etc/es-http-certs
name: es-http-certs
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
volumes:
- name: log-dir
hostPath:
path: /var/log
- name: es-http-certs
secret:
secretName: elasticsearch-es-http-certs-public
es-log4j2
docker方式运行的elasticsearch日志默认使用的console输出, 不会记录到日志文件中, logs目录下面只有gc.log,因此我们可以需要修改配置log4j2设置,将日志写入对应文件中。方便elastic-agent挂载日志目录收集elasticsearch服务日志。
apiVersion: v1
kind: ConfigMap
metadata:
name: es-log4j2
namespace: elk
data:
log4j2.properties : |-
status = error
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%consoleException%n
######## Server JSON ############################
appender.rolling.type = RollingFile
appender.rolling.name = rolling
appender.rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_server.json
appender.rolling.layout.type = ECSJsonLayout
appender.rolling.layout.dataset = elasticsearch.server
appender.rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}-%i.json.gz
appender.rolling.policies.type = Policies
appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.rolling.policies.time.interval = 1
appender.rolling.policies.time.modulate = true
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.policies.size.size = 128MB
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.fileIndex = nomax
appender.rolling.strategy.action.type = Delete
appender.rolling.strategy.action.basepath = ${sys:es.logs.base_path}
appender.rolling.strategy.action.condition.type = IfFileName
appender.rolling.strategy.action.condition.glob = ${sys:es.logs.cluster_name}-*
appender.rolling.strategy.action.condition.nested_condition.type = IfAccumulatedFileSize
appender.rolling.strategy.action.condition.nested_condition.exceeds = 2GB
################################################
######## Server - old style pattern ###########
appender.rolling_old.type = RollingFile
appender.rolling_old.name = rolling_old
appender.rolling_old.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}.log
appender.rolling_old.layout.type = PatternLayout
appender.rolling_old.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n
appender.rolling_old.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}-%i.log.gz
appender.rolling_old.policies.type = Policies
appender.rolling_old.policies.time.type = TimeBasedTriggeringPolicy
appender.rolling_old.policies.time.interval = 1
appender.rolling_old.policies.time.modulate = true
appender.rolling_old.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling_old.policies.size.size = 128MB
appender.rolling_old.strategy.type = DefaultRolloverStrategy
appender.rolling_old.strategy.fileIndex = nomax
appender.rolling_old.strategy.action.type = Delete
appender.rolling_old.strategy.action.basepath = ${sys:es.logs.base_path}
appender.rolling_old.strategy.action.condition.type = IfFileName
appender.rolling_old.strategy.action.condition.glob = ${sys:es.logs.cluster_name}-*
appender.rolling_old.strategy.action.condition.nested_condition.type = IfAccumulatedFileSize
appender.rolling_old.strategy.action.condition.nested_condition.exceeds = 2GB
################################################
rootLogger.level = info
rootLogger.appenderRef.console.ref = console
rootLogger.appenderRef.rolling.ref = rolling
rootLogger.appenderRef.rolling_old.ref = rolling_old
######## Deprecation JSON #######################
appender.deprecation_rolling.type = RollingFile
appender.deprecation_rolling.name = deprecation_rolling
appender.deprecation_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation.json
appender.deprecation_rolling.layout.type = ECSJsonLayout
# Intentionally follows a different pattern to above
appender.deprecation_rolling.layout.dataset = deprecation.elasticsearch
appender.deprecation_rolling.filter.rate_limit.type = RateLimitingFilter
appender.deprecation_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation-%i.json.gz
appender.deprecation_rolling.policies.type = Policies
appender.deprecation_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.deprecation_rolling.policies.size.size = 1GB
appender.deprecation_rolling.strategy.type = DefaultRolloverStrategy
appender.deprecation_rolling.strategy.max = 4
appender.header_warning.type = HeaderWarningAppender
appender.header_warning.name = header_warning
#################################################
logger.deprecation.name = org.elasticsearch.deprecation
logger.deprecation.level = WARN
logger.deprecation.appenderRef.deprecation_rolling.ref = deprecation_rolling
logger.deprecation.appenderRef.header_warning.ref = header_warning
logger.deprecation.additivity = false
######## Search slowlog JSON ####################
appender.index_search_slowlog_rolling.type = RollingFile
appender.index_search_slowlog_rolling.name = index_search_slowlog_rolling
appender.index_search_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs\
.cluster_name}_index_search_slowlog.json
appender.index_search_slowlog_rolling.layout.type = ECSJsonLayout
appender.index_search_slowlog_rolling.layout.dataset = elasticsearch.index_search_slowlog
appender.index_search_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs\
.cluster_name}_index_search_slowlog-%i.json.gz
appender.index_search_slowlog_rolling.policies.type = Policies
appender.index_search_slowlog_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.index_search_slowlog_rolling.policies.size.size = 1GB
appender.index_search_slowlog_rolling.strategy.type = DefaultRolloverStrategy
appender.index_search_slowlog_rolling.strategy.max = 4
#################################################
#################################################
logger.index_search_slowlog_rolling.name = index.search.slowlog
logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog_rolling.additivity = false
######## Indexing slowlog JSON ##################
appender.index_indexing_slowlog_rolling.type = RollingFile
appender.index_indexing_slowlog_rolling.name = index_indexing_slowlog_rolling
appender.index_indexing_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}\
_index_indexing_slowlog.json
appender.index_indexing_slowlog_rolling.layout.type = ECSJsonLayout
appender.index_indexing_slowlog_rolling.layout.dataset = elasticsearch.index_indexing_slowlog
appender.index_indexing_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}\
_index_indexing_slowlog-%i.json.gz
appender.index_indexing_slowlog_rolling.policies.type = Policies
appender.index_indexing_slowlog_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.index_indexing_slowlog_rolling.policies.size.size = 1GB
appender.index_indexing_slowlog_rolling.strategy.type = DefaultRolloverStrategy
appender.index_indexing_slowlog_rolling.strategy.max = 4
#################################################
logger.index_indexing_slowlog.name = index.indexing.slowlog.index
logger.index_indexing_slowlog.level = trace
logger.index_indexing_slowlog.appenderRef.index_indexing_slowlog_rolling.ref = index_indexing_slowlog_rolling
logger.index_indexing_slowlog.additivity = false
logger.org_apache_pdfbox.name = org.apache.pdfbox
logger.org_apache_pdfbox.level = off
logger.org_apache_poi.name = org.apache.poi
logger.org_apache_poi.level = off
logger.org_apache_fontbox.name = org.apache.fontbox
logger.org_apache_fontbox.level = off
logger.org_apache_xmlbeans.name = org.apache.xmlbeans
logger.org_apache_xmlbeans.level = off
logger.com_amazonaws.name = com.amazonaws
logger.com_amazonaws.level = warn
logger.com_amazonaws_jmx_SdkMBeanRegistrySupport.name = com.amazonaws.jmx.SdkMBeanRegistrySupport
logger.com_amazonaws_jmx_SdkMBeanRegistrySupport.level = error
logger.com_amazonaws_metrics_AwsSdkMetrics.name = com.amazonaws.metrics.AwsSdkMetrics
logger.com_amazonaws_metrics_AwsSdkMetrics.level = error
logger.com_amazonaws_auth_profile_internal_BasicProfileConfigFileLoader.name = com.amazonaws.auth.profile.internal.BasicProfileConfigFileLoader
logger.com_amazonaws_auth_profile_internal_BasicProfileConfigFileLoader.level = error
logger.com_amazonaws_services_s3_internal_UseArnRegionResolver.name = com.amazonaws.services.s3.internal.UseArnRegionResolver
logger.com_amazonaws_services_s3_internal_UseArnRegionResolver.level = error
appender.audit_rolling.type = RollingFile
appender.audit_rolling.name = audit_rolling
appender.audit_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_audit.json
appender.audit_rolling.layout.type = PatternLayout
appender.audit_rolling.layout.pattern = {\
"type":"audit", \
"timestamp":"%d{yyyy-MM-dd'T'HH:mm:ss,SSSZ}"\
%varsNotEmpty{, "cluster.name":"%enc{%map{cluster.name}}{JSON}"}\
%varsNotEmpty{, "cluster.uuid":"%enc{%map{cluster.uuid}}{JSON}"}\
%varsNotEmpty{, "node.name":"%enc{%map{node.name}}{JSON}"}\
%varsNotEmpty{, "node.id":"%enc{%map{node.id}}{JSON}"}\
%varsNotEmpty{, "host.name":"%enc{%map{host.name}}{JSON}"}\
%varsNotEmpty{, "host.ip":"%enc{%map{host.ip}}{JSON}"}\
%varsNotEmpty{, "event.type":"%enc{%map{event.type}}{JSON}"}\
%varsNotEmpty{, "event.action":"%enc{%map{event.action}}{JSON}"}\
%varsNotEmpty{, "authentication.type":"%enc{%map{authentication.type}}{JSON}"}\
%varsNotEmpty{, "user.name":"%enc{%map{user.name}}{JSON}"}\
%varsNotEmpty{, "user.run_by.name":"%enc{%map{user.run_by.name}}{JSON}"}\
%varsNotEmpty{, "user.run_as.name":"%enc{%map{user.run_as.name}}{JSON}"}\
%varsNotEmpty{, "user.realm":"%enc{%map{user.realm}}{JSON}"}\
%varsNotEmpty{, "user.realm_domain":"%enc{%map{user.realm_domain}}{JSON}"}\
%varsNotEmpty{, "user.run_by.realm":"%enc{%map{user.run_by.realm}}{JSON}"}\
%varsNotEmpty{, "user.run_by.realm_domain":"%enc{%map{user.run_by.realm_domain}}{JSON}"}\
%varsNotEmpty{, "user.run_as.realm":"%enc{%map{user.run_as.realm}}{JSON}"}\
%varsNotEmpty{, "user.run_as.realm_domain":"%enc{%map{user.run_as.realm_domain}}{JSON}"}\
%varsNotEmpty{, "user.roles":%map{user.roles}}\
%varsNotEmpty{, "apikey.id":"%enc{%map{apikey.id}}{JSON}"}\
%varsNotEmpty{, "apikey.name":"%enc{%map{apikey.name}}{JSON}"}\
%varsNotEmpty{, "authentication.token.name":"%enc{%map{authentication.token.name}}{JSON}"}\
%varsNotEmpty{, "authentication.token.type":"%enc{%map{authentication.token.type}}{JSON}"}\
%varsNotEmpty{, "cross_cluster_access":%map{cross_cluster_access}}\
%varsNotEmpty{, "origin.type":"%enc{%map{origin.type}}{JSON}"}\
%varsNotEmpty{, "origin.address":"%enc{%map{origin.address}}{JSON}"}\
%varsNotEmpty{, "realm":"%enc{%map{realm}}{JSON}"}\
%varsNotEmpty{, "realm_domain":"%enc{%map{realm_domain}}{JSON}"}\
%varsNotEmpty{, "url.path":"%enc{%map{url.path}}{JSON}"}\
%varsNotEmpty{, "url.query":"%enc{%map{url.query}}{JSON}"}\
%varsNotEmpty{, "request.method":"%enc{%map{request.method}}{JSON}"}\
%varsNotEmpty{, "request.body":"%enc{%map{request.body}}{JSON}"}\
%varsNotEmpty{, "request.id":"%enc{%map{request.id}}{JSON}"}\
%varsNotEmpty{, "action":"%enc{%map{action}}{JSON}"}\
%varsNotEmpty{, "request.name":"%enc{%map{request.name}}{JSON}"}\
%varsNotEmpty{, "indices":%map{indices}}\
%varsNotEmpty{, "opaque_id":"%enc{%map{opaque_id}}{JSON}"}\
%varsNotEmpty{, "trace.id":"%enc{%map{trace.id}}{JSON}"}\
%varsNotEmpty{, "x_forwarded_for":"%enc{%map{x_forwarded_for}}{JSON}"}\
%varsNotEmpty{, "transport.profile":"%enc{%map{transport.profile}}{JSON}"}\
%varsNotEmpty{, "rule":"%enc{%map{rule}}{JSON}"}\
%varsNotEmpty{, "put":%map{put}}\
%varsNotEmpty{, "delete":%map{delete}}\
%varsNotEmpty{, "change":%map{change}}\
%varsNotEmpty{, "create":%map{create}}\
%varsNotEmpty{, "invalidate":%map{invalidate}}\
}%n
appender.audit_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_audit-%d{yyyy-MM-dd}-%i.json.gz
appender.audit_rolling.policies.type = Policies
appender.audit_rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.audit_rolling.policies.time.interval = 1
appender.audit_rolling.policies.time.modulate = true
appender.audit_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.audit_rolling.policies.size.size = 1GB
appender.audit_rolling.strategy.type = DefaultRolloverStrategy
appender.audit_rolling.strategy.fileIndex = nomax
logger.xpack_security_audit_logfile.name = org.elasticsearch.xpack.security.audit.logfile.LoggingAuditTrail
logger.xpack_security_audit_logfile.level = info
logger.xpack_security_audit_logfile.appenderRef.audit_rolling.ref = audit_rolling
logger.xpack_security_audit_logfile.additivity = false
logger.xmlsig.name = org.apache.xml.security.signature.XMLSignature
logger.xmlsig.level = error
logger.samlxml_decrypt.name = org.opensaml.xmlsec.encryption.support.Decrypter
logger.samlxml_decrypt.level = fatal
logger.saml2_decrypt.name = org.opensaml.saml.saml2.encryption.Decrypter
logger.saml2_decrypt.level = fatal
elasticsearch
在elasticsearch资源的每个节点都配置名为elasticsearch-logs的存储资源(名称必须为elasticsearch-logs,否则eck无法正常识别),类型为hostPath,指定路径为宿主机的/var/log/elasticsearch目录下。并挂载log4j2配置文件,替换默认的日志配置。
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
name: elasticsearch
namespace: elk
spec:
version: 8.9.1
image: harbor.local.com/elk/elasticsearch:8.9.1
secureSettings:
- secretName: snapshot-settings
nodeSets:
- name: master
count: 3
config:
node.roles: ["master", "ingest", "remote_cluster_client", "transform"]
node.store.allow_mmap: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
storageClassName: local-storage
podTemplate:
spec:
containers:
- name: elasticsearch
env:
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
- name: ES_SETTING_REINDEX_REMOTE_WHITELIST
value: "192.168.10.100:9200"
- name: ES_SETTING_REINDEX_SSL_VERIFICATION__MODE
value: "none"
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 500m
memory: 512Mi
volumeMounts:
- name: es-log4j2
mountPath: /usr/share/elasticsearch/config/log4j2.properties
subPath: log4j2.properties
volumes:
- name: elasticsearch-logs
hostPath:
path: /var/log/elasticsearch
type: DirectoryOrCreate
- name: es-log4j2
configMap:
name: es-log4j2
- name: hot
count: 2
config:
node.roles: [ "data_content", "data_hot", "remote_cluster_client"]
node.store.allow_mmap: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: local-storage
podTemplate:
spec:
containers:
- name: elasticsearch
env:
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
- name: ES_SETTING_REINDEX_REMOTE_WHITELIST
value: "192.168.10.100:9200"
- name: ES_SETTING_REINDEX_SSL_VERIFICATION__MODE
value: "none"
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 500m
memory: 512Mi
volumeMounts:
- name: es-log4j2
mountPath: /usr/share/elasticsearch/config/log4j2.properties
subPath: log4j2.properties
volumes:
- name: elasticsearch-logs
hostPath:
path: /var/log/elasticsearch
type: DirectoryOrCreate
- name: es-log4j2
configMap:
name: es-log4j2
- name: warm
count: 2
config:
node.roles: [ "data_content", "data_warm", "remote_cluster_client"]
node.store.allow_mmap: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 600Gi
storageClassName: local-storage
podTemplate:
spec:
containers:
- name: elasticsearch
env:
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
- name: ES_SETTING_REINDEX_REMOTE_WHITELIST
value: "192.168.10.100:9200"
- name: ES_SETTING_REINDEX_SSL_VERIFICATION__MODE
value: "none"
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 500m
memory: 512Mi
volumeMounts:
- name: es-log4j2
mountPath: /usr/share/elasticsearch/config/log4j2.properties
subPath: log4j2.properties
volumes:
- name: elasticsearch-logs
hostPath:
path: /var/log/elasticsearch
type: DirectoryOrCreate
- name: es-log4j2
configMap:
name: es-log4j2
- name: cold
count: 1
config:
node.roles: [ "data_content", "data_cold", "remote_cluster_client"]
node.store.allow_mmap: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 800Gi
storageClassName: local-storage
podTemplate:
spec:
containers:
- name: elasticsearch
env:
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
- name: ES_SETTING_REINDEX_REMOTE_WHITELIST
value: "192.168.10.100:9200"
- name: ES_SETTING_REINDEX_SSL_VERIFICATION__MODE
value: "none"
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 500m
memory: 512Mi
volumeMounts:
- name: es-log4j2
mountPath: /usr/share/elasticsearch/config/log4j2.properties
subPath: log4j2.properties
volumes:
- name: elasticsearch-logs
hostPath:
path: /var/log/elasticsearch
type: DirectoryOrCreate
- name: es-log4j2
configMap:
name: es-log4j2
kibana
在kibana中关联elasticsearch资源,并指定elasticsearch和fleet_server的svc地址。声明在kibana中默认安装system、elastic_agent、fleet_server、elasticsearch、kubernetes这些集成策略。
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
name: kibana
namespace: elk
spec:
version: 8.9.1
image: harbor.local.com/elk/kibana:8.9.1
count: 1
elasticsearchRef:
name: elasticsearch
config:
xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.elk.svc:9200"]
xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.elk.svc:8220"]
xpack.fleet.packages:
- name: system
version: latest
- name: elastic_agent
version: latest
- name: fleet_server
version: latest
- name: elasticsearch
version: latest
- name: kubernetes
version: latest
xpack.fleet.agentPolicies:
- name: Fleet Server on ECK policy
id: eck-fleet-server
namespace: elk
monitoring_enabled:
- logs
- metrics
unenroll_timeout: 900
package_policies:
- package:
name: fleet_server
name: fleet_server
- name: Elastic Agent on ECK policy
id: eck-agent
namespace: elk
monitoring_enabled:
- logs
- metrics
unenroll_timeout: 900
package_policies:
- package:
name: system
name: system
- package:
name: elastic_agent
name: elastic_agent
- package:
name: elasticsearch
name: elasticsearch
- package:
name: kubernetes
name: kubernetes
podTemplate:
spec:
containers:
- name: kibana
env:
- name: NODE_OPTIONS
value: "--max-old-space-size=2048"
- name: SERVER_PUBLICBASEURL
value: "https://kibana.local.com"
- name: I18N_LOCALE
value: "zh-CN"
resources:
requests:
memory: 1Gi
cpu: 0.5
limits:
memory: 2Gi
cpu: 2
部署资源并验证
elasticsearch日志目录权限配置
由于elasticsearch容器运行用户为elasticsearch,uid为1000,hostPath挂载后会出现权限不足的问题,需要提前修改宿主机目录权限。更多k8s安全上下文信息参考文档:https://www.cuiliangblog.cn/detail/section/126523774
# mkdir /var/log/elasticsearch
# useradd -u 1000 elasticsearch
# chown elasticsearch:elasticsearch /var/log/elasticsearch
创建资源
[root@tiaoban fleet]# kubectl apply -f .
agent.agent.k8s.elastic.co/elastic-agent created
elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch created
configmap/es-log4j2 created
agent.agent.k8s.elastic.co/fleet-server created
serverstransport.traefik.containo.us/elasticsearch-transport unchanged
ingressroute.traefik.containo.us/elasticsearch unchanged
serverstransport.traefik.containo.us/kibana-transport unchanged
ingressroute.traefik.containo.us/kibana unchanged
kibana.kibana.k8s.elastic.co/kibana created
persistentvolume/es-master-pv1 created
persistentvolume/es-master-pv2 created
persistentvolume/es-master-pv3 created
persistentvolume/es-hot-pv1 created
persistentvolume/es-hot-pv2 created
persistentvolume/es-warm-pv1 created
persistentvolume/es-warm-pv2 created
persistentvolume/es-cold-pv1 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-master-0 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-master-1 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-master-2 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-hot-0 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-hot-1 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-warm-0 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-warm-1 created
persistentvolumeclaim/elasticsearch-data-elasticsearch-es-cold-0 created
clusterrole.rbac.authorization.k8s.io/fleet-server created
serviceaccount/fleet-server created
clusterrolebinding.rbac.authorization.k8s.io/fleet-server created
clusterrole.rbac.authorization.k8s.io/elastic-agent created
serviceaccount/elastic-agent created
clusterrolebinding.rbac.authorization.k8s.io/elastic-agent created
storageclass.storage.k8s.io/local-storage created
service/elasticsearch-nodeport created
service/kibana-nodeport created
查看资源信息
通过观察可知,pod创建顺序依次是elasticsearch和kibana,待服务正常后才会继续创建fleet-server资源,最后在每个节点创建elastic-agent资源
[root@tiaoban fleet]# kubectl get pod -n elk
NAME READY STATUS RESTARTS AGE
elastic-agent-agent-66hnd 1/1 Running 0 16s
elastic-agent-agent-b685g 1/1 Running 0 16s
elastic-agent-agent-jdxx9 1/1 Running 0 16s
elastic-agent-agent-m6stj 1/1 Running 0 16s
elastic-agent-agent-nlp2t 1/1 Running 0 16s
elastic-agent-agent-wxskg 1/1 Running 0 16s
elasticsearch-es-cold-0 1/1 Running 0 7m34s
elasticsearch-es-hot-0 1/1 Running 0 7m35s
elasticsearch-es-hot-1 1/1 Running 0 7m35s
elasticsearch-es-master-0 1/1 Running 0 7m35s
elasticsearch-es-master-1 1/1 Running 0 7m35s
elasticsearch-es-master-2 1/1 Running 0 7m35s
elasticsearch-es-warm-0 1/1 Running 0 7m34s
elasticsearch-es-warm-1 1/1 Running 0 7m34s
fleet-server-agent-65756b65f8-dmj5s 1/1 Running 0 94s
kibana-kb-5f4c67c676-mpt7d 1/1 Running 0 7m28s
[root@tiaoban fleet]# kubectl get elasticsearch -n elk
NAME HEALTH NODES VERSION PHASE AGE
elasticsearch green 8 8.9.1 Ready 8m5s
[root@tiaoban fleet]# kubectl get kibana -n elk
NAME HEALTH NODES VERSION AGE
kibana green 1 8.9.1 8m13s
[root@tiaoban fleet]# kubectl get agent -n elk
NAME HEALTH AVAILABLE EXPECTED VERSION AGE
elastic-agent green 6 6 8.9.1 8m39s
fleet-server green 1 1 8.9.1 8m39s
待所有资源都正常创建且状态为green后,访问kibana。
验证
查看fleet信息,发现fleet-server和elastic-agent均已正常运行。
查看已安装的集成,已为我们默认安装了system、elastic_agent、fleet_server、elasticsearch、kubernetes这些集成资源。
集成策略配置
当我们发现集成某些集成策略虽然已安装,但是数据无法正常采集时,就需要修改默认的集成策略配置,重新配置采集的指标项和日志路径。
配置完成后,我们点击资产标签即可查看为我们内置的各种dashboard和查询视图
系统日志
查看system集成配置可知系统日志默认采集的是/var/log/secure和/var/log/messages等日志内容,但RHEL8之后默认不安装rsyslog服务,导致/var/log/secure日志不存在,此时我们需要安装rsyslog服务并设置开机自启。
dnf -y install rsyslog
systemctl enable rsyslog --now
接下来点击集成菜单的资产,查看kibana为我们内置的dashboard和discover
elasticsearch指标日志
elasticsearch指标地址默认为http://127.0.0.1:9200,需要改为集群内的https地址,并指定账号密码和证书
elasticsearch日志查看:我们依次点击kibana的堆栈监测——>logs即可查看。
elasticsearch指标查看,以ingest pipeline dashboard为例,内容如下:
kubernetes指标日志
kubernetes指标采集依赖kube-state-metrics组件,可单独部署metrics组件,参考文档:https://www.cuiliangblog.cn/detail/section/15189166。也可以使用kube-prometheus部署,参考文档:https://www.cuiliangblog.cn/detail/section/15189202,部署后修改启动参数hots为0.0.0.0
spec:
template:
spec:
containers:
- args:
- --host=0.0.0.0
- --port=8081
将默认的metrics地址改为kube-state-metrics..svc格式端口为8081。
kubernetes日志
kubernetes指标
完整资源清单
本实验案例所有yaml文件已上传至git仓库。访问地址如下:
github
https://github.com/cuiliang0302/blog-demo
gitee
https://gitee.com/cuiliang0302/blog_demo
参考文档
在eck中部署fleet:https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-elastic-agent-fleet.html
log4j2配置:https://www.elastic.co/guide/en/elasticsearch/reference/8.9/logging.html
yaml配置fleet policy:https://www.elastic.co/guide/en/kibana/current/fleet-settings-kb.html
查看更多
微信公众号
微信公众号同步更新,欢迎关注微信公众号《崔亮的博客》第一时间获取最近文章。
博客网站
崔亮的博客-专注devops自动化运维,传播优秀it运维技术文章。更多原创运维开发相关文章,欢迎访问https://www.cuiliangblog.cn
更多推荐
所有评论(0)