Challenge 7
Use the following command to break something in the cluster
curl -s https://learn.exoscale.dev/cka/ch/set7.sh | bashThe pod demo does not seem to be running fine and restarts after a few tens of seconds. Try to understand what happens and make sure it is running fine for at least 120 seconds once it is restarted.
Solution
Observe the behavior of the pod you will notice the pod is automatically restarted after a few tens of seconds:
$ kubectl get po -w
NAME READY STATUS RESTARTS AGE
demo 1/1 Running 0 2s
demo 1/1 Running 1 (1s ago) 76sGoing a little deeper in the details, we understand this is due to the liveness probe failing:
$ kubectl describe po demo
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m8s default-scheduler Successfully assigned default/demo to worker1
Normal Pulled 2m7s kubelet Successfully pulled image "k8s.gcr.io/busybox" in 578.053377ms
Normal Pulling 53s (x2 over 2m7s) kubelet Pulling image "k8s.gcr.io/busybox"
Normal Created 52s (x2 over 2m6s) kubelet Created container busybox
Normal Started 52s (x2 over 2m6s) kubelet Started container busybox
Normal Pulled 52s kubelet Successfully pulled image "k8s.gcr.io/busybox" in 512.064103ms
Warning Unhealthy 8s (x6 over 93s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal Killing 8s (x2 over 83s) kubelet Container busybox failed liveness probe, will be restartedIf we check the pod’s specification, we can see the liveness probe checks the existence of a file:
$ kubectl get po demo -o yaml
...
apiVersion: v1
kind: Pod
metadata:
name: demo
...
spec:
containers:
- args:
- /bin/sh
- -c
- touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
image: k8s.gcr.io/busybox
imagePullPolicy: Always
livenessProbe:
exec:
command:
- cat
- /tmp/healthy
failureThreshold: 3
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1The /tmp/healthy file is removed after 30 seconds. According to the liveness probe’s specification, 15 seconds after the file is removed the container is considered unhealthy and is restarted.
We can change the command sleep 30 into sleep 120 to have an healthy container for a longer period.
First we get the pod specification
kubectl get po demo -o yaml > demo-pod.yamlNext we change the command:
apiVersion: v1
kind: Pod
metadata:
name: demo
...
spec:
containers:
- args:
- /bin/sh
- -c
- touch /tmp/healthy; sleep 120; rm -rf /tmp/healthy; sleep 600 # <- change done here
image: k8s.gcr.io/busybox
imagePullPolicy: Always
livenessProbe:
exec:
command:
- cat
- /tmp/healthy
failureThreshold: 3
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
...Then we replace the existing pod:
cat demo-pod.yaml | kubectl replace --force -f -
pod "demo" deleted
pod/demo replacedThe pod should remain healthy for a longer period (more than 2 minutes):
kubectl get po demo -w
NAME READY STATUS RESTARTS AGE
demo 1/1 Running 0 6s
demo 1/1 Running 1 (1s ago) 2m46s