How to achieve Automatic Rollback in Kubernetes?

Let's say I have a Deployment. For some reason, it stops responding after some time. Is there any way to tell Kubernetes to roll back to the previous version automatically on failure?

Answers
You mentioned that:
I have a Deployment. For some reason, it stops responding after some time.
In this case, you can use liveness and readiness probes:
The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
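As an illustration, here is a minimal sketch of both probes on a single nginx container. The Pod name, port, health-check path, and probe timings are assumptions for this example; adjust them to your application:

apiVersion: v1
kind: Pod
metadata:
  name: probes-demo
spec:
  containers:
    - name: nginx
      image: nginx
      livenessProbe:            # restart the container when this check fails
        httpGet:
          path: /               # assumed health endpoint
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
      readinessProbe:           # remove the Pod from Service endpoints when this check fails
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 5

With this in place, the kubelet restarts the container when the liveness probe fails, and the Pod is taken out of Service load balancing while the readiness probe fails.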
The above probes may prevent you from deploying a corrupted version; however, liveness and readiness probes aren't able to roll back your Deployment to the previous version. There was a similar issue on GitHub, but I am not sure there will be any progress on this matter in the near future.
If you really want to automate the rollback process, below I will describe a solution that you may find helpful.
This solution requires running kubectl commands from within the Pod. In short, you can use a script to continuously monitor your Deployments, and when errors occur you can run kubectl rollout undo deployment DEPLOYMENT_NAME.
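For reference, kubectl rollout undo reverts to the previously deployed revision by default; you can also target a specific revision from the rollout history (DEPLOYMENT_NAME is a placeholder):

# Show the revision history of a Deployment
kubectl rollout history deployment DEPLOYMENT_NAME

# Revert to the previous revision (the default)
kubectl rollout undo deployment DEPLOYMENT_NAME

# Revert to a specific revision from the history
kubectl rollout undo deployment DEPLOYMENT_NAME --to-revision=2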
First, you need to decide how to find failed Deployments. As an example, I'll treat a Deployment as failed if its rollout has not completed within 10 seconds, checked with the following command:
NOTE: You can use a different command depending on your needs.
kubectl rollout status deployment ${deployment} --timeout=10s
To constantly monitor all Deployments in the default Namespace (the grep -v below excludes the checker's own deployment-checker Deployment), we can create a Bash script:
#!/bin/bash
while true; do
  sleep 60
  deployments=$(kubectl get deployments --no-headers -o custom-columns=":metadata.name" | grep -v "deployment-checker")
  echo "====== $(date) ======"
  for deployment in ${deployments}; do
    if ! kubectl rollout status deployment ${deployment} --timeout=10s 1>/dev/null 2>&1; then
      echo "Error: ${deployment} - rolling back!"
      kubectl rollout undo deployment ${deployment}
    else
      echo "Ok: ${deployment}"
    fi
  done
done
We want to run this script from inside the Pod, so I converted it to a ConfigMap, which allows us to mount the script in a volume (see: Using ConfigMaps as files from a Pod):
$ cat check-script-configmap.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    #!/bin/bash
    while true; do
      sleep 60
      deployments=$(kubectl get deployments --no-headers -o custom-columns=":metadata.name" | grep -v "deployment-checker")
      echo "====== $(date) ======"
      for deployment in ${deployments}; do
        if ! kubectl rollout status deployment ${deployment} --timeout=10s 1>/dev/null 2>&1; then
          echo "Error: ${deployment} - rolling back!"
          kubectl rollout undo deployment ${deployment}
        else
          echo "Ok: ${deployment}"
        fi
      done
    done
$ kubectl apply -f check-script-configmap.yml
configmap/check-script created
I've created a separate deployment-checker ServiceAccount and bound it to the built-in edit ClusterRole with a ClusterRoleBinding; our Pod will run under this ServiceAccount:
NOTE: I've created a Deployment instead of a single Pod.
$ cat all-in-one.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deployment-checker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: deployment-checker-binding
subjects:
  - kind: ServiceAccount
    name: deployment-checker
    namespace: default
roleRef:
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: deployment-checker
  name: deployment-checker
spec:
  selector:
    matchLabels:
      app: deployment-checker
  template:
    metadata:
      labels:
        app: deployment-checker
    spec:
      serviceAccountName: deployment-checker
      volumes:
        - name: check-script
          configMap:
            name: check-script
      containers:
        - image: bitnami/kubectl
          name: test
          command: ["bash", "/mnt/checkScript.sh"]
          volumeMounts:
            - name: check-script
              mountPath: /mnt
After applying the above manifest, the deployment-checker Deployment was created and started monitoring Deployment resources in the default Namespace:
$ kubectl apply -f all-in-one.yaml
serviceaccount/deployment-checker created
clusterrolebinding.rbac.authorization.k8s.io/deployment-checker-binding created
deployment.apps/deployment-checker created
$ kubectl get deploy,pod | grep "deployment-checker"
deployment.apps/deployment-checker 1/1 1
pod/deployment-checker-69c8896676-pqg9h 1/1 Running
Finally, we can check how it works. I've created three Deployments (app-1, app-2, app-3):
$ kubectl create deploy app-1 --image=nginx
deployment.apps/app-1 created
$ kubectl create deploy app-2 --image=nginx
deployment.apps/app-2 created
$ kubectl create deploy app-3 --image=nginx
deployment.apps/app-3 created
Then I changed the image for the app-1 to the incorrect one (nnnginx):
$ kubectl set image deployment/app-1 nginx=nnnginx
deployment.apps/app-1 image updated
In the deployment-checker logs, we can see that app-1 has been rolled back to the previous version:
$ kubectl logs -f deployment-checker-69c8896676-pqg9h
...
====== Thu Oct 7 09:20:15 UTC 2021 ======
Ok: app-1
Ok: app-2
Ok: app-3
====== Thu Oct 7 09:21:16 UTC 2021 ======
Error: app-1 - rolling back!
deployment.apps/app-1 rolled back
Ok: app-2
Ok: app-3
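To double-check the result, you can inspect which image app-1 now uses (a quick verification sketch; the jsonpath expression assumes a single container in the Pod template):

# Print the container image currently set in the app-1 Deployment;
# after the rollback it should be nginx again instead of nnnginx
kubectl get deployment app-1 -o jsonpath='{.spec.template.spec.containers[0].image}'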