1
00:00:02,990 --> 00:00:07,940
In this lecture we will see different ways of troubleshooting control plane failures.

2
00:00:07,940 --> 00:00:12,880
We start by checking the status of the nodes in the cluster to see whether they are all healthy.

3
00:00:12,950 --> 00:00:16,570
Then check the status of the pods running on the cluster.

4
00:00:16,700 --> 00:00:22,210
If the control plane components are deployed as pods, as in a cluster deployed with the kubeadm

5
00:00:22,210 --> 00:00:29,240
tool, then check that the pods in the kube-system namespace are running. Otherwise,

6
00:00:29,360 --> 00:00:35,330
if the control plane components are deployed as services, as in our case, check the status of the

7
00:00:35,330 --> 00:00:41,510
services such as the kube-apiserver, controller-manager and scheduler on the master nodes, and the

8
00:00:41,510 --> 00:00:44,630
kubelet and kube-proxy services on the worker nodes.
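
The status checks described so far could be collected into a small script. This is only a sketch: it assumes kubectl is configured on the machine, that the hosts use systemd, and that the unit names match the component names (kube-controller-manager and kube-scheduler are assumed unit names and may differ on your cluster). The helper names are made up for illustration.

```python
import shutil
import subprocess

# Cluster-level checks: node health, pods, and (for kubeadm clusters)
# the control plane pods in the kube-system namespace.
KUBECTL_CHECKS = [
    ["kubectl", "get", "nodes"],
    ["kubectl", "get", "pods"],
    ["kubectl", "get", "pods", "-n", "kube-system"],
]

# Assumed systemd unit names; adjust to match your installation.
MASTER_SERVICES = ["kube-apiserver", "kube-controller-manager", "kube-scheduler"]
WORKER_SERVICES = ["kubelet", "kube-proxy"]


def service_status_commands(services):
    """Return one `systemctl status` command per service name."""
    return [["systemctl", "status", name, "--no-pager"] for name in services]


def run_checks(commands):
    """Run each command and return (command, exit code) pairs.

    A command whose executable is not installed gets an exit code of None.
    """
    results = []
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            results.append((cmd, None))
            continue
        completed = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, completed.returncode))
    return results
```

On a master node you might call run_checks(KUBECTL_CHECKS + service_status_commands(MASTER_SERVICES)) and look for any non-zero exit code.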

9
00:00:48,670 --> 00:00:49,210
Next,

10
00:00:49,240 --> 00:00:56,350
check the logs of the control plane components. Again, in the case of kubeadm, use the kubectl logs command

11
00:00:56,410 --> 00:01:02,560
to view the logs of the pods hosting the control plane components. In the case of services configured natively

12
00:01:02,560 --> 00:01:09,670
on the master nodes, view the service logs using the host’s logging solution. In our case we could use

13
00:01:09,670 --> 00:01:14,390
the journalctl utility to view the kube-apiserver logs.
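
The two ways of viewing logs mentioned here can be sketched as command builders. The pod name kube-apiserver-master in the usage note is hypothetical (kubeadm typically names these pods after the node), and the journalctl variant assumes the component runs as a systemd unit called kube-apiserver.

```python
import subprocess


def pod_logs_command(pod_name, namespace="kube-system"):
    """Build a `kubectl logs` command for a control plane pod."""
    return ["kubectl", "logs", pod_name, "-n", namespace]


def service_logs_command(unit, lines=50):
    """Build a `journalctl` command showing the last `lines` entries of a unit."""
    return ["journalctl", "-u", unit, "-n", str(lines), "--no-pager"]


def fetch_logs(command):
    """Run a log command and return its standard output as text."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.stdout
```

For example, fetch_logs(pod_logs_command("kube-apiserver-master")) would apply to a kubeadm cluster, and fetch_logs(service_logs_command("kube-apiserver")) to natively configured services.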

14
00:01:14,590 --> 00:01:16,400
This should be a good start.

15
00:01:16,750 --> 00:01:22,060
For more tips, refer to the Kubernetes documentation page for Troubleshooting Clusters.

16
00:01:22,060 --> 00:01:25,770
This will help in the upcoming practice test as well as in the exam.