[K8S Part 7] Troubleshooting the Metrics Server Deployment
When I deployed K8S with kubeadm earlier, setting up Metrics Server was straightforward. After switching to a binary deployment of K8S in production, however, installing the Metrics add-on hit one pitfall after another, so this post records the fixes. For the deployment steps, see "[K8S Part 3] Deploying the metrics-server Add-on".
To make the problems easier to follow, here is a topology diagram first (the flanneld network plugin can be swapped for Calico):
Troubleshooting
Problem 1: metrics-server fails on startup with a certificate error: x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs
E0725 05:27:26.638019 1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.11.191:10250/metrics/resource\": x509: cannot validate certificate for 192.168.11.191 because it doesn't contain any IP SANs" node="k8s-testing-02-191"
I0725 05:27:33.495998 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
Solution:
Add the following argument to the metrics-server container:
- --kubelet-insecure-tls
or instead pass it certificates:
- --tls-cert-file=/etc/ssl/pki/ca.pem
- --tls-private-key-file=/etc/ssl/pki/ca-key.pem
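To confirm the diagnosis before picking a fix, you can inspect a certificate's SANs with openssl. The snippet below is a local sketch: it generates a throwaway self-signed certificate that does carry an IP SAN (the IP and file paths are placeholders), then prints its subjectAltName extension; a kubelet certificate that triggers the error above would show no IP Address entry at all.

```shell
# Create a throwaway self-signed cert with an IP SAN (requires OpenSSL >= 1.1.1).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/demo-key.pem -out /tmp/demo-cert.pem \
  -days 1 -subj "/CN=k8s-testing-02-191" \
  -addext "subjectAltName = IP:192.168.11.191"

# Print the SAN extension; look for an "IP Address:" entry.
openssl x509 -in /tmp/demo-cert.pem -noout -ext subjectAltName
```

Against a live node, the certificate the kubelet actually serves on port 10250 can be checked the same way: `echo | openssl s_client -connect 192.168.11.191:10250 2>/dev/null | openssl x509 -noout -ext subjectAltName`.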
Problem 2: metrics-server never becomes ready, and its log reports: Failed to scrape node" err="Get \"https://x.x.x.x:10250/metrics/resource\": context deadline exceeded"
scraper.go:140] "Failed to scrape node" err="Get \"https://linshi-k8s-54:10250/metrics/resource\": context deadline exceeded" node="linshi-k8s-54"
server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
Solution:
Keep the metrics-server --kubelet-preferred-address-types list consistent with the one configured on kube-apiserver; otherwise metrics-server may try to reach kubelets via an address type (here, a hostname) that is not resolvable or routable from its pod.
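For example, if kube-apiserver runs with `--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname`, the metrics-server args should carry the same list. This is a sketch mirroring the manifest in the appendix; adjust the values to your own cluster:

```yaml
containers:
- name: metrics-server
  args:
  # Must match the order configured on kube-apiserver, so both
  # components resolve nodes to the same (reachable) address type.
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
```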
Problem 3: metrics-server starts successfully, but kubectl top node fails: Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
kubectl top node
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
Diagnosis:
#-- Check the events on the metrics APIService:
Message: failing or missing response from https://10.254.156.1:443/apis/metrics.k8s.io/v1beta1: Get "https://10.254.156.1:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.254.156.1:443: i/o timeout
Reason: FailedDiscoveryCheck
#-- So the request to the metrics Service's clusterIP times out. After configuring --enable-aggregator-routing=true on kube-apiserver, the error becomes:
Message: failing or missing response from https://172.254.247.87:4443/apis/metrics.k8s.io/v1beta1: Get "https://172.254.247.87:4443/apis/metrics.k8s.io/v1beta1": dial tcp 172.254.247.87:4443: i/o timeout
Reason: FailedDiscoveryCheck
#-- Dialing the endpoint directly times out as well.
#-- Note: the metrics Service may only listen on port 443; setting it to 4443 by hand fails with:
Message: service/metrics-server in "kube-system" is not listening on port 443
Reason: ServicePortError
The root cause is that this master cannot reach metrics-server at all. The master in this deployment runs neither kubelet nor kube-proxy. With --enable-aggregator-routing=true set on kube-apiserver, requests for kubectl go straight to the metrics endpoint, but the master cannot reach the pod network on the nodes (no kubelet, hence no pod networking). Without --enable-aggregator-routing=true, requests go through the metrics Service's clusterIP instead, which is equally unreachable because there is no kube-proxy to program it (see the topology diagram above).
Solution:
# Edit the metrics-server deployment YAML:
deployment.spec.template.spec.hostNetwork: true
# or
# pin the metrics Service clusterIP and add routing rules to it by hand.
In addition, configure --enable-aggregator-routing=true on kube-apiserver (in /etc/kubernetes/manifests/kube-apiserver.yaml for a static-pod apiserver).
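The two changes can be sketched together as follows; the file path assumes a static-pod apiserver, and the Deployment excerpt is abbreviated:

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml -- have the aggregator
# dial the metrics-server endpoint IP directly instead of the clusterIP:
#   - --enable-aggregator-routing=true
#
# metrics-server Deployment -- run the pod on the host network so its
# endpoint IP is a node IP that a master without kubelet/kube-proxy
# can still route to:
spec:
  template:
    spec:
      hostNetwork: true
```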
metrics-server startup parameters
#--- The metrics-server startup parameters can be listed with:
docker run --rm 192.168.11.101/library/metrics-server:v0.6.1 --help
--cert-dir=/tmp
#-- Directory where the TLS certs are stored. Ignored if --tls-cert-file and --tls-private-key-file are provided.
--secure-port=4443
#-- Port to serve HTTPS on, with authentication and authorization. Set to 0 to disable HTTPS. (default 443)
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
#-- Ordered list of preferred NodeAddressTypes for connecting to kubelets; keep this consistent with the kube-apiserver configuration. (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
--kubelet-use-node-status-port
#-- Use the port reported in node status; takes precedence over --kubelet-port.
--metric-resolution=30s
#-- Interval at which metrics-server scrapes kubelets; must be at least 10s. (default 1m0s)
--kubelet-insecure-tls
#-- Do not verify the CA or the serving certificates presented by kubelets. For testing only. Without this flag, --tls-cert-file and --tls-private-key-file must be passed to metrics-server.
--tls-cert-file
#-- File containing the default x509 certificate for HTTPS. If HTTPS serving is enabled and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory given by --cert-dir.
--tls-private-key-file
#-- File containing the default x509 private key matching --tls-cert-file.
--kubelet-port
#-- The port to use to connect to kubelets. (default 10250)
Appendix: kube-metric-server.yaml manifest
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=30s
        - --kubelet-insecure-tls
        # - --tls-cert-file=/etc/ssl/pki/ca.pem
        # - --tls-private-key-file=/etc/ssl/pki/ca-key.pem
        image: HARBOR_HOST_NAME/library/metrics-server:v0.6.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
        # - mountPath: /etc/ssl/pki
        #   name: cert-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      hostNetwork: true
      volumes:
      - emptyDir: {}
        name: tmp-dir
      # - name: cert-dir
      #   hostPath:
      #     path: /etc/ssl/certs/ca-certs/
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
Reposted from https://blog.csdn.net/avatar_2009/article/details/126016679