Day 39/40 - Troubleshooting Worker Nodes Failures in Kubernetes
About this video
### Summary of the Video Content

1. **Introduction:**
   - The video is part of the CKA 2024 series (Video #39) by Pyush.
   - It focuses on troubleshooting worker node failure scenarios, particularly from an exam perspective.
2. **Objective:**
   - To demonstrate how to troubleshoot and resolve worker node failures in a Kubernetes cluster.
   - The video covers common issues, their diagnosis, and resolution methods.
3. **Scenario Setup:**
   - The cluster has a master node that is `Ready`, but the worker nodes are in `NotReady` status.
   - Possible causes include network add-on issues or problems with the kubelet service.
4. **Network Add-on Troubleshooting:**
   - Network add-ons like Calico, Flannel, or Weave Net are essential for node-to-node communication.
   - Steps to verify that the network add-on is installed and running:
     - Use `kubectl get pods -A` to list pods in all namespaces, including `kube-system`.
     - Look for specific namespaces like `calico-system` or `flannel`.
     - Check the configuration files in `/etc/cni/net.d/` to confirm the installed plugin (e.g., `10-calico.conflist`).
5. **Kubelet Service Troubleshooting:**
   - The kubelet is a critical node-level agent responsible for node health reporting and communication with the control plane.
   - Common issues:
     - The kubelet service is inactive or not running.
     - Misconfigured kubelet settings (e.g., an incorrect client CA file path in the config).
   - Commands to diagnose and fix:
     - Check kubelet status: `service kubelet status`.
     - Start kubelet: `sudo service kubelet start`.
     - View logs: `journalctl -u kubelet`.
     - Edit the kubelet config: update paths in `/var/lib/kubelet/config.yaml`.
6. **Worker Node SSH Access:**
   - In exams, you may need to SSH into worker nodes to troubleshoot issues.
   - Example command: `ssh username@worker-node-ip`.
7. **Fixing Worker Nodes:**
   - **Worker Node 1:** The kubelet was stopped; starting the service resolved the issue.
   - **Worker Node 2:** An incorrect client CA file path in the kubelet config caused errors; correcting the path and restarting the service fixed it.
8. **Exam Tips:**
   - Be cautious when using `exit` in terminal sessions to avoid losing progress.
   - Refer to the kubelet service details (`systemctl cat kubelet`) to locate the relevant configuration files.
9. **Next Video Preview:**
   - The next video will involve a real-time project: hosting a private Docker registry on Kubernetes.
   - It will integrate concepts learned throughout the series into an end-to-end implementation.
10. **Call to Action:**
    - Encourages viewers to like, comment, and share their progress.
    - Sets a target of 300 likes and 100 comments within 24 hours.
11. **Conclusion:**
    - Thanks viewers for their patience and support.
    - Promises more advanced troubleshooting topics after completing the series.
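The two checks described in the summary (finding worker nodes that are not `Ready`, and confirming that the client CA path in the kubelet config points at a real file) can be sketched as small shell helpers. This is a minimal sketch, not the method shown in the video: the function names `not_ready_nodes`, `client_ca_path`, and `check_client_ca` are hypothetical, and the parsing assumes a kubeadm-style config where `clientCAFile:` sits on its own line with an unquoted path.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not from the video.

# Print the names of nodes whose STATUS column does not start with "Ready".
# Expects `kubectl get nodes --no-headers` output on stdin.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Print the clientCAFile path found in a kubelet config file ($1).
# Assumes the key is on its own line and the path is unquoted.
client_ca_path() {
  awk '$1 == "clientCAFile:" { print $2 }' "$1"
}

# Report whether the clientCAFile referenced by a kubelet config ($1) exists.
# Returns non-zero if the entry is absent or the file is missing.
check_client_ca() {
  ca_file="$(client_ca_path "$1")"
  if [ -z "$ca_file" ]; then
    echo "no clientCAFile entry in $1"
    return 1
  fi
  if [ -f "$ca_file" ]; then
    echo "ok: $ca_file"
  else
    echo "missing: $ca_file"
    return 1
  fi
}

# Example usage (run the first on the control plane, the second on a worker):
#   kubectl get nodes --no-headers | not_ready_nodes
#   check_client_ca /var/lib/kubelet/config.yaml
```

If `check_client_ca` reports a missing file, the fix described above still applies: correct the path in `/var/lib/kubelet/config.yaml`, restart the kubelet, and confirm with `journalctl -u kubelet` that the errors have stopped.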
Course: Certified Kubernetes Administrator Full Course For beginners | CKA 2025
This playlist contains the complete CKA series for beginners, based on the latest 2025 curriculum. It includes 40+ videos with hands-on demos, assignments, and exam-based scenarios. It covers everything from the basics to advanced topics, including fundamental concepts such as Docker, containers, Docker storage and networking, and DNS.