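Health Checks
As an application-oriented cluster manager, Kubernetes needs to make sure that containers are actually running correctly after they are deployed.

Container probes check whether the application instance inside a container is working properly, and are a long-established mechanism for keeping a service available. If a probe shows that an instance is not in the expected state, Kubernetes takes that instance out of rotation so that it no longer receives traffic.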

This article covers two kinds of probes that a Pod uses to check the health of its containers. Each of them can use the exec, tcpSocket, or httpGet method:

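LivenessProbe

The liveness probe determines whether a container is healthy; it tells the kubelet when a container is in an unhealthy state.

If the liveness probe finds the container unhealthy, the kubelet kills the container and then handles it according to the Pod's restart policy. If a container defines no liveness probe, the kubelet treats its liveness result as always "Success". The kubelet runs the liveness probe periodically to diagnose the container's health.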

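ReadinessProbe

The readiness probe determines whether a container has finished starting and is ready to accept requests.

If the readiness probe fails, the Pod's status is updated: by default, after 3 consecutive failures the Ready condition becomes false and READY shows 0/1, while STATUS remains Running. The Endpoint Controller then removes the entry containing that Pod's IP address from the Endpoints of the Service.

In short, livenessProbe decides whether to restart the container, and readinessProbe decides whether requests are forwarded to it.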

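A probe is a periodic diagnostic that the kubelet performs on a container. To run it, the kubelet calls a handler implemented by the container. There are three handler types:

exec: runs a command inside the container; an exit status of 0 means the container is healthy.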

……
  livenessProbe:  # readinessProbe uses the same syntax
    exec:
      command:
      - cat
      - /tmp/healthy
……
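
The exec rule above (exit status 0 means healthy) can be sketched in a few lines of Python. This is only an illustration of the rule, not how the kubelet is implemented: the function name `exec_probe` and its `timeout_seconds` parameter are made up for this example, and the command runs on the local machine rather than inside a container.

```python
import subprocess


def exec_probe(command, timeout_seconds=1.0):
    """Return True if `command` exits with status 0 before the timeout.

    Hypothetical helper: it mimics the exec probe rule locally and does not
    talk to Kubernetes or to a container runtime.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a command that cannot be started counts as a failure.
        return False
    return result.returncode == 0
```

With the manifest above, the equivalent local check would be `exec_probe(["cat", "/tmp/healthy"])`, which succeeds only while the file exists.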

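tcpSocket: performs a TCP check against the container's IP address and port; if a connection to the port can be opened, the container is considered healthy.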

……
  livenessProbe:  # readinessProbe uses the same syntax
    tcpSocket:
      port: 8080
……
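
As a rough illustration of the tcpSocket rule, the sketch below treats a successful TCP connection as "healthy" and any connection error or timeout as "unhealthy". The helper name `tcp_probe` and its parameters are assumptions made for this example; it is not the kubelet's own code.

```python
import socket


def tcp_probe(host, port, timeout_seconds=1.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper that mirrors the tcpSocket probe rule: the probe
    only checks that the port accepts connections and sends no data.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False
```

For the manifest above, the comparable check would be `tcp_probe(pod_ip, 8080)`, where `pod_ip` stands for the Pod's IP address.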

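httpGet: sends an HTTP GET request to the container's IP address, port, and path; if the response status code is greater than or equal to 200 and less than 400, the container is considered healthy.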

……
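  livenessProbe:  # readinessProbe uses the same syntax
    httpGet:
      path: /  # request path
      port: 80  # port number
      host: 127.0.0.1  # host to connect to; defaults to the Pod IP when omitted
      scheme: HTTP  # protocol, HTTP or HTTPS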
……
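
The httpGet success rule (200 <= status < 400) can be sketched as follows. Both function names are hypothetical and exist only for this illustration; the request is made with Python's standard `http.client`, which does not follow redirects, so a 3xx answer is reported as it is received.

```python
import http.client


def status_is_success(status_code):
    """Apply the httpGet rule: 200 <= status < 400 counts as healthy."""
    return 200 <= status_code < 400


def http_get_probe(host, port, path="/", timeout_seconds=1.0):
    """Send one plain-HTTP GET and judge the response status.

    Hypothetical helper for illustration; it is not the kubelet's prober.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", path)
        return status_is_success(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts, and malformed responses are failures.
        return False
    finally:
        conn.close()
```

Under this rule a 404 for a missing page is a failure, which is what the readiness example later in this article relies on.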

LivenessProbe and ReadinessProbe are defined per container, under spec.containers[] in the Pod definition. Besides the three handler fields above, both probes share the following fields:

[root@k8s-master ~]# kubectl explain pod.spec.containers.livenessProbe/readinessProbe
KIND:     Pod
VERSION:  v1
RESOURCE: livenessProbe <Object> / readinessProbe <Object>
FIELDS:   
   exec	<Object>
   httpGet	<Object>
   tcpSocket	<Object>
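   initialDelaySeconds	<integer>   # seconds to wait after the container starts before the first probe
   timeoutSeconds	<integer>       # probe timeout; default 1 second, minimum 1
   periodSeconds	<integer>       # how often to probe; default 10 seconds, minimum 1
   failureThreshold	<integer>       # consecutive failures needed to count as failed; default 3, minimum 1
   successThreshold	<integer>       # consecutive successes needed to count as successful; default 1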
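
To make the interaction of failureThreshold and successThreshold concrete, here is a small toy model in Python. It is a simplified sketch written for this article, not the kubelet's code; the class name `ProbeTracker` and its attributes are invented for the example.

```python
class ProbeTracker:
    """Toy model: consecutive probe results flip the reported state."""

    def __init__(self, failure_threshold=3, success_threshold=1, healthy=True):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.healthy = healthy  # state currently reported for the container
        self._last = None       # most recent raw probe result
        self._run = 0           # how many times in a row that result occurred

    def record(self, success):
        """Record one probe result and return the reported state."""
        if success == self._last:
            self._run += 1
        else:
            self._last = success
            self._run = 1
        threshold = self.success_threshold if success else self.failure_threshold
        if self._run >= threshold:
            # Enough identical results in a row: the state follows them.
            self.healthy = success
        return self.healthy


tracker = ProbeTracker()  # defaults: failureThreshold=3, successThreshold=1
states = [tracker.record(ok) for ok in (False, False, False)]
# states is [True, True, False]: only the third consecutive failure flips it.
```

With periodSeconds set to 3 and the default failureThreshold of 3, a broken container is therefore detected only after three probe periods in a row have failed, not on the first failed probe.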

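Each probe yields one of three results:

Success: the container passed the diagnostic.
Failure: the container failed the diagnostic.
Unknown: the diagnostic itself failed, so no action is taken.

Comparing liveness and readiness probes: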

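Liveness and readiness probes are two health check mechanisms. If neither is configured, Kubernetes applies the same default behavior to both: it judges the container only by its main process, treating it as healthy while that process runs and as failed when the process exits with a non-zero return code.

The two probes are configured in exactly the same way and accept the same parameters. They differ in what happens after a failure: a failed liveness probe restarts the container, while a failed readiness probe marks the container as unavailable so that it no longer receives requests forwarded by a Service.

Liveness and readiness probes run independently and do not depend on each other, so they can be used separately or together.

Use a liveness probe to decide whether a container needs a restart (self-healing); use a readiness probe to decide whether a container is ready to serve traffic.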
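
The difference in failure handling can be summarized with a deliberately tiny model. Everything here (`PodModel`, `handle_probe_failure`, the `endpoints` set) is hypothetical and only mirrors the description above; restart policy and back-off are left out.

```python
from dataclasses import dataclass


@dataclass
class PodModel:
    """Minimal stand-in for the Pod fields shown by `kubectl get pods`."""
    ip: str
    ready: bool = True
    restarts: int = 0


def handle_probe_failure(pod, probe_kind, endpoints):
    """Apply the consequence of a probe that has reached its failure threshold."""
    if probe_kind == "liveness":
        # Liveness failure: the container is restarted; the Pod object stays.
        pod.restarts += 1
    elif probe_kind == "readiness":
        # Readiness failure: no restart, but the Pod stops receiving traffic.
        pod.ready = False
        endpoints.discard(pod.ip)
    else:
        raise ValueError(f"unknown probe kind: {probe_kind}")
    return pod


pod = PodModel(ip="10.244.2.12")
endpoints = {"10.244.2.12", "10.244.2.13"}
handle_probe_failure(pod, "readiness", endpoints)
# The Pod is now not ready and its IP is gone from the endpoint set,
# while its restart count is still 0.
```

Because the two consequences are independent, a Pod can be running but not ready (READY 0/1, RESTARTS 0), which is the state the readiness example below starts in.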

[Example]

ReadinessProbe (readiness check)

[root@k8s-master ~]# vim readiness.yml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-httpget-pod
  namespace: default
spec:
  containers:
  - name: readiness-httpget-container
    image: nginx
    imagePullPolicy: IfNotPresent
    readinessProbe:    # readiness probe
      httpGet:    # use the httpGet method
        port: 80
        path: /index1.html
      initialDelaySeconds: 1    # start probing 1 second after the container starts
      periodSeconds: 3    # probe every 3 seconds
[root@k8s-master ~]# kubectl create -f readiness.yml



# The image is being pulled
[root@k8s-master ~]# kubectl get pods
NAME                    READY   STATUS              RESTARTS   AGE
readiness-httpget-pod   0/1     ContainerCreating   0          9s


# The image has been pulled; STATUS is Running, but the Pod is not READY


[root@k8s-master ~]# kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
readiness-httpget-pod   0/1     Running   0          108s
[root@k8s-master ~]# kubectl describe pod readiness-httpget-pod
……省略……
  Normal   Pulled     2m49s                  kubelet, k8s-node2  Successfully pulled image "nginx"
  Normal   Created    2m49s                  kubelet, k8s-node2  Created container readiness-httpget-container
  Normal   Started    2m49s                  kubelet, k8s-node2  Started container readiness-httpget-container


# The probe fails because /index1.html does not exist


  Warning  Unhealthy  107s (x21 over 2m47s)  kubelet, k8s-node2  Readiness probe failed: HTTP probe failed with statuscode: 404

# Enter the container and create index1.html
[root@k8s-master ~]# kubectl exec -it readiness-httpget-pod -- /bin/bash
root@readiness-httpget-pod:/# ls
bin   docker-entrypoint.d   home   media  proc	sbin  tmp
boot  docker-entrypoint.sh  lib    mnt	  root	srv   usr
dev   etc		    lib64  opt	  run	sys   var
root@readiness-httpget-pod:/# cd usr/share/nginx/html/
root@readiness-httpget-pod:/usr/share/nginx/html# ls
50x.html  index.html
root@readiness-httpget-pod:/usr/share/nginx/html# echo "test" > index1.html
root@readiness-httpget-pod:/usr/share/nginx/html# exit
exit


# The container is now READY


[root@k8s-master ~]# kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
readiness-httpget-pod   1/1     Running   0          9m11s

LivenessProbe (liveness check)

exec method

[root@k8s-master ~]# vim liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-pod
  namespace: default
spec:
  containers:
  - name: liveness-exec-container
    image: nginx
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","touch /tmp/live && sleep 60 && rm -rf /tmp/live"]
    livenessProbe:
      exec:
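        command: ["ls","/tmp/live"]  # probe command; exit status 0 means healthy, non-zero means unhealthy
      initialDelaySeconds: 1  # start probing 1 second after the container starts
      periodSeconds: 3  # probe every 3 seconds; after 3 consecutive failures (the default failureThreshold) the container is restarted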

# Create the Pod
[root@k8s-master ~]# kubectl create -f liveness.yaml 
pod/liveness-exec-pod created


# Check its status


[root@k8s-master ~]# kubectl get pods
NAME                READY   STATUS              RESTARTS   AGE
liveness-exec-pod   0/1     ContainerCreating   0          2s
[root@k8s-master ~]# kubectl get pods -o wide -w  # RESTARTS keeps growing: the liveness condition stops being met, so the container is restarted again and again
NAME                READY   STATUS    RESTARTS   AGE   IP           NODE        NOMINATED NODE   READINESS GATES
liveness-exec-pod   1/1     Running   0          44s   10.244.2.7   k8s-node1   <none>           <none>
liveness-exec-pod   1/1     Running   1          101s   10.244.2.7   k8s-node1   <none>           <none>


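# RESTARTS is the number of times the container has been restarted. 60s after start, the container deletes /tmp/live,
# which is exactly the file the liveness probe checks. Once the file is gone the probe fails, so the kubelet kills
# and restarts the container (the Pod itself is not deleted), and the cycle repeats.
# Note: the command above has no trailing sleep (such as sleep 3600) after the rm, so the main process exits as soon
# as the file is removed; that is why the events below report "container not running" and BackOff.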

[root@k8s-master ~]# kubectl describe pod liveness-exec-pod 
……省略……
Events:
  Type     Reason     Age                  From                Message
  ----     ------     ----                 ----                -------
  Normal   Scheduled  <unknown>            default-scheduler   Successfully assigned default/liveness-exec-pod to k8s-node2
  Normal   Pulled     76s (x3 over 3m29s)  kubelet, k8s-node2  Container image "nginx:1.8" already present on machine
  Normal   Created    76s (x3 over 3m29s)  kubelet, k8s-node2  Created container liveness-exec-container
  Normal   Started    76s (x3 over 3m29s)  kubelet, k8s-node2  Started container liveness-exec-container
  Warning  Unhealthy  15s                  kubelet, k8s-node2  Liveness probe errored: rpc error: code = Unknown desc = container not running (180426712bd129c122f4df1432ef573cf1f56627a92ae76bf6db42fc4054df95)
  Warning  BackOff    12s (x4 over 89s)    kubelet, k8s-node2  Back-off restarting failed container

httpGet method

[root@k8s-master ~]# vim liveness_httpget.yml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-httpget-pod
  namespace: default
spec:
  containers:
  - name: liveness-httpget-container
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      httpGet:
        port: 80
        path: /index.html
        scheme: HTTP    # host is not set, so it defaults to the Pod IP; the probe requests http://<pod-ip>:80/index.html
      initialDelaySeconds: 1
      periodSeconds: 3
      timeoutSeconds: 10
      
[root@k8s-master ~]# kubectl create -f liveness_httpget.yml 
pod/liveness-httpget-pod created
[root@k8s-master ~]# kubectl get pods 
NAME                   READY   STATUS    RESTARTS   AGE
liveness-httpget-pod   1/1     Running   0          15s

# Delete index.html to simulate a failing liveness probe
[root@k8s-master ~]# kubectl exec liveness-httpget-pod -it -- rm -rf /usr/share/nginx/html/index.html

# The liveness probe no longer finds the file, so the kubelet kills and restarts the container
[root@k8s-master ~]# kubectl get pods 
NAME                   READY   STATUS    RESTARTS   AGE
liveness-httpget-pod   1/1     Running   1          104s

tcpSocket method

[root@k8s-master ~]# vim liveness_tcp.yml 
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp-pod
  namespace: default
spec:
  containers:
  - name: liveness-tcp-container
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      tcpSocket:
        port: 8080  
      initialDelaySeconds: 1    
      periodSeconds: 3    
      timeoutSeconds: 10
      
[root@k8s-master ~]# kubectl create -f liveness_tcp.yml
pod/liveness-tcp-pod created

[root@k8s-master ~]# kubectl get pods -o wide
NAME               READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
liveness-tcp-pod   1/1     Running   0          7s    10.244.2.10   k8s-node1   <none>           <none>
# Nothing listens on port 8080, so the probe can never connect; the kubelet keeps killing and restarting the container
[root@k8s-master ~]# kubectl get pods -o wide -w
NAME               READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
liveness-tcp-pod   1/1     Running   1          13s   10.244.2.10   k8s-node1   <none>           <none>
liveness-tcp-pod   1/1     Running   2          21s   10.244.2.10   k8s-node1   <none>           <none>
liveness-tcp-pod   0/1     CrashLoopBackOff   2          29s   10.244.2.10   k8s-node1   <none>           <none>
liveness-tcp-pod   1/1     Running            3          42s   10.244.2.10   k8s-node1   <none>           <none>
liveness-tcp-pod   1/1     Running            4          50s   10.244.2.10   k8s-node1   <none>           <none>
liveness-tcp-pod   0/1     CrashLoopBackOff   4          59s   10.244.2.10   k8s-node1   <none>           <none>

Readiness check + liveness check

[root@k8s-master ~]# cat live+read.yml 
apiVersion: v1
kind: Pod
metadata:
  name: liveness-readiness-pod
  namespace: default
spec:
  containers:
  - name: liveness-readiness-container
    image: nginx
    imagePullPolicy: IfNotPresent
    readinessProbe:
      httpGet:    
        port: 80
        path: /index1.html
      initialDelaySeconds: 1 
      periodSeconds: 3    
    livenessProbe:   
      httpGet:    
        port: 80
        path: /index.html
      initialDelaySeconds: 1    
      periodSeconds: 3    
      timeoutSeconds: 10
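# Readiness: while the readiness check fails the Pod stays not ready; the probe keeps running, and the Pod becomes ready once the check passes.
# Liveness: while the liveness check fails the kubelet kills and restarts the container; the probe keeps running after each restart, and the container runs normally once the check passes.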
[root@k8s-master ~]# kubectl create -f live+read.yml
pod/liveness-readiness-pod created

[root@k8s-master ~]# kubectl get pods -o wide -w
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
liveness-readiness-pod   0/1     Running   0          7s    10.244.2.12   k8s-node1   <none>           <none>
# The readiness check is not satisfied yet because index1.html does not exist, so the Pod stays not ready

# Create the file
[root@k8s-master ~]# kubectl exec liveness-readiness-pod -it --  touch /usr/share/nginx/html/index1.html

# The readiness check now passes and the Pod is ready
[root@k8s-master ~]# kubectl get pods -o wide -w
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
liveness-readiness-pod   1/1     Running   0          5m17s   10.244.2.12   k8s-node1   <none>           <none>
# The Pod keeps running normally as long as both the readiness and liveness checks pass.
# Now delete index.html to simulate a liveness failure
[root@k8s-master ~]# kubectl exec liveness-readiness-pod -it --  rm -rf /usr/share/nginx/html/index.html
[root@k8s-master ~]# kubectl get pods -o wide -w
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
liveness-readiness-pod   1/1     Running   0          7m27s   10.244.2.12   k8s-node1   <none>           <none>
liveness-readiness-pod   0/1     Running   1          7m33s   10.244.2.12   k8s-node1   <none>           <none>
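# The container has been restarted. The new container is created from the image, so index.html is back and the liveness check passes again;
# index1.html, which was created by hand, is gone, so READY shows 0/1 until the file is created again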