Troubleshooting k8s HPA: unable to fetch metrics data
Problem: the HPA reports "failed to get cpu utilization: missing request for cpu", and the current field of the HPA's metrics shows <unknown>.
# kubectl describe hpa activator -n knative-serving
Name:              activator
Namespace:         knative-serving
Labels:            serving.knative.dev/release=v0.18.3
Annotations:       <none>
CreationTimestamp: Sun, 05 Dec 2021 14:35:18 +0000
Reference:         Deployment/activator
Metrics:           ( current / target )
  resource cpu on pods (as a percentage of request):  <unknown> / 100%
Min replicas:      1
Max replicas:      20
Deployment pods:   0 current / 0 desired
Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: deployments/scale.apps "activator" not found
Events:
  Type     Reason          Age                        From                       Message
  ----     ------          ----                       ----                       -------
  Warning  FailedGetScale  4m27s (x13697 over 2d11h)  horizontal-pod-autoscaler  deployments/scale.apps "activator" not found
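The <unknown> current value above comes from the utilization calculation the HPA performs: average usage divided by the Pod's resource request, scaled against the target. A minimal illustrative sketch of that rule (hypothetical Python, not Kubernetes source code; values are in millicores):

```python
import math

def hpa_desired_replicas(current_replicas, pod_cpu_usage_m,
                         pod_cpu_request_m, target_utilization_pct):
    """Sketch of the HPA core rule:
    utilization = usage / request, desired = ceil(current * utilization / target).
    A missing or zero request makes the ratio undefined, which is why
    describe shows <unknown> and the event says "missing request for cpu"."""
    if not pod_cpu_request_m:
        raise ValueError("missing request for cpu")
    current_utilization_pct = 100.0 * pod_cpu_usage_m / pod_cpu_request_m
    return math.ceil(current_replicas * current_utilization_pct
                     / target_utilization_pct)

# e.g. 3 replicas averaging 200m against a 100m request, target 100% -> scale to 6
```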
Cause:
The HPA depends on the metrics component. Run kubectl top to check whether it returns data normally; if no Pod returns any data, run kubectl get apiservice to inspect which data source is currently serving the resource metrics API.
# kubectl top pod
NAME CPU(cores) MEMORY(bytes)
alpine-g4kjl-deployment-5c7c9cccb6-nk5vk 10m 23Mi
alpine-pmsjq-deployment-8665bbff7b-l7t5s 8m 16Mi
lizheng-tools-57f98f5454-pj7ls 0m 0Mi
my-nginx-546db9c8f7-f2gnw 0m 1Mi
my-nginx-546db9c8f7-stggz 0m 1Mi
nas-0 0m 4Mi
nas-1 0m 3Mi
nginx-mb9t8-deployment-7fc45fdf4b-gxzgp 4m 22Mi
svc-8468754b86-lwpds 0m 3Mi
svc-8468754b86-nvg2f 0m 3Mi
tea-7d76f4d959-psvw5 0m 1Mi
web-nginx-67dbdf84cb-lq6tq 0m 3Mi
web-nginx-67dbdf84cb-q8wrf 0m 3Mi
# kubectl get apiservice
NAME SERVICE AVAILABLE AGE
v1. Local True 8d
v1.admissionregistration.k8s.io Local True 8d
v1.alibabacloud.com Local True 6d17h
v1.apiextensions.k8s.io Local True 8d
v1.apps Local True 8d
v1.authentication.k8s.io Local True 8d
v1.authorization.k8s.io Local True 8d
v1.autoscaling Local True 8d
v1.batch Local True 8d
v1.coordination.k8s.io Local True 8d
v1.crd.projectcalico.org Local True 8d
v1.kuboard.cn Local True 3d4h
v1.monitoring.coreos.com Local True 8d
v1.networking.k8s.io Local True 8d
v1.rbac.authorization.k8s.io Local True 8d
v1.scheduling.k8s.io Local True 8d
v1.serving.knative.dev Local True 2d11h
v1.snapshot.storage.k8s.io Local True 8d
v1.storage.k8s.io Local True 8d
v1.velero.io Local True 43h
v1alpha1.autoscaling.internal.knative.dev Local True 2d11h
v1alpha1.caching.internal.knative.dev Local True 2d11h
v1alpha1.chaosblade.io Local True 6d14h
v1alpha1.jobs.aliyun.com Local True 4d13h
v1alpha1.log.alibabacloud.com Local True 8d
v1alpha1.monitoring.coreos.com Local True 5d15h
v1alpha1.networking.internal.knative.dev Local True 2d11h
v1alpha1.serving.knative.dev Local True 2d11h
v1alpha1.storage.alibabacloud.com Local True 8d
v1beta1.admissionregistration.k8s.io Local True 8d
v1beta1.alert.alibabacloud.com Local True 8d
v1beta1.apiextensions.k8s.io Local True 8d
v1beta1.authentication.k8s.io Local True 8d
v1beta1.authorization.k8s.io Local True 8d
v1beta1.batch Local True 8d
v1beta1.certificates.k8s.io Local True 8d
v1beta1.coordination.k8s.io Local True 8d
v1beta1.csdr.alibabacloud.com Local True 43h
v1beta1.discovery.k8s.io Local True 8d
v1beta1.events.k8s.io Local True 8d
v1beta1.extensions Local True 8d
v1beta1.metrics.k8s.io kube-system/metrics-server True 8d
v1beta1.networking.k8s.io Local True 8d
v1beta1.node.k8s.io Local True 8d
v1beta1.policy Local True 8d
v1beta1.rbac.authorization.k8s.io Local True 8d
v1beta1.scheduling.k8s.io Local True 8d
v1beta1.serving.knative.dev Local True 2d11h
v1beta1.snapshot.storage.k8s.io Local True 8d
v1beta1.storage.alibabacloud.com Local True 8d
v1beta1.storage.k8s.io Local True 8d
v2beta1.autoscaling Local True 8d
v2beta2.autoscaling Local True 8d
If the apiservice backing v1beta1.metrics.k8s.io is not kube-system/metrics-server, check whether it was overwritten by a Prometheus Operator installation. If that is the case, restore it by applying the following YAML manifest:
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
If metrics are unavailable during a rolling update or scale-out:
By default the metrics-server collection interval is 1 second. Right after a scale-out or update completes, metrics-server cannot return monitoring data for the new Pods for a short period. Check again roughly 2 seconds after the scale-out or update finishes.
By default the HPA computes utilization as actual usage divided by the Pod's request, so check that the Pods' resource spec actually contains a requests field for the metric in question and that its value is valid. Also check whether labels or Pod names conflict with another workload, since the HPA's selector could be matching Pods that lack requests.
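As a minimal illustration of the fix, every container in the scale target should declare the request the HPA divides by. The snippet below is a hypothetical Deployment fragment (container and image names are placeholders, not from this cluster):

```yaml
# Hypothetical fragment of a Deployment's Pod template:
spec:
  template:
    spec:
      containers:
      - name: app            # placeholder name
        image: nginx:1.21    # placeholder image
        resources:
          requests:
            cpu: 100m        # HPA cpu utilization = actual usage / this value
            memory: 128Mi
```

Note that if even one container in the Pod omits the cpu request, the HPA reports "missing request for cpu" for the whole Pod.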
References:
https://help.aliyun.com/document_detail/181491.html
https://stackoverflow.com/questions/62800892/kubernetes-hpa-on-aks-is-failing-with-error-missing-request-for-cpu