Common Kubernetes (K8s) Deployment Failures on Debian and Troubleshooting Steps
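1. Base system environment

Steps:
- Update the system to avoid package conflicts: sudo apt update && sudo apt upgrade -y
- Load the required kernel modules (sudo modprobe overlay && sudo modprobe br_netfilter) and set net.bridge.bridge-nf-call-iptables=1 and net.ipv4.ip_forward=1 through sysctl. Write both settings to /etc/sysctl.d/99-kubernetes.conf and apply them with sudo sysctl --system. Plain sysctl -p reads only /etc/sysctl.conf unless it is given a file path.
- Disable swap: sudo swapoff -a turns it off until the next reboot. To disable it permanently, comment out the lines containing swap in /etc/fstab.

2. kubelet fails to start

Symptom: the kubelet service does not start, its status shows inactive (dead), or the log reports errors such as failed to start container runtime.

Steps:
- systemctl status kubelet (check whether it is running and read the error message)
- journalctl -u kubelet -f (follow the live log to locate the specific error, such as an image pull failure or a certificate problem)
- systemctl restart kubelet (clears transient failures)

3. Node fails to join the cluster

Symptom: kubeadm join reports errors such as token expired, unable to fetch the ConfigMap, or connection refused.

Steps:
- On the Master node, run kubeadm token create --print-join-command to obtain a fresh join command. Bootstrap tokens expire after 24 hours by default.
- Confirm that the Master node is healthy: kubectl get nodes shows Ready and the API server is running. On kubeadm clusters kube-apiserver runs as a static Pod, so check it with kubectl get pods -n kube-system; systemctl status kube-apiserver applies only to installations that run it as a systemd service.
- Confirm that the Worker node can reach the Master node's IP with ping, and that TCP port 6443 (used for API Server communication) is reachable, for example with nc -zv <master-ip> 6443. ping alone cannot test a port.

4. Pod fails to start or keeps restarting

Steps:
- Check that the image field in the Pod spec is correct (for example nginx:latest).
- If a private registry is used and Docker is the container runtime, add the registry certificate under /etc/docker/certs.d/<registry-domain>/ on the Worker node and restart Docker (systemctl restart docker).
- Read the container log (kubectl logs <pod-name> -n <namespace>) and check for application errors.
- Adjust the resource requests and limits (resources.requests and resources.limits).

5. Network plugin problems

Symptom: network failures such as a ClusterIP that cannot be reached from inside the cluster or a NodePort that cannot be reached from outside.

Steps:
- kubectl get pods -n kube-system (check that the Flannel, Calico, or other plugin Pods are Running)
- Reinstall the plugin: delete it (for example kubectl delete -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml), then apply the same manifest again with kubectl apply -f. The Flannel project has since moved to the flannel-io organization on GitHub, so use the manifest URL you originally installed from.
- Confirm that br_netfilter is loaded (lsmod | grep br_netfilter).
- Check sysctl net.bridge.bridge-nf-call-iptables (the value must be 1).

6. Service unreachable

Symptom: kubectl get svc shows the Service and its ports as normal, but the Service cannot be reached through NodeIP:NodePort or the ClusterIP.

Steps:
- Check the Service type. A ClusterIP Service is reachable only from inside the cluster; for external access change it to NodePort (type: NodePort) or LoadBalancer (requires cloud provider support).
- Check the ports field: targetPort must match the port the application container listens on, port is the Service port, and nodePort is the optional external port (range 30000-32767).
- Open the port in the firewall (sudo ufw allow <node-port>).

7. Certificate problems

Symptom: kubectl commands fail with errors such as x509: certificate signed by unknown authority, or the API Server cannot be reached.

Steps:
- If the certificates have expired, run kubeadm certs renew all, then restart the affected components (systemctl restart kubelet).
- As a temporary workaround, pass --insecure-skip-tls-verify=true to kubectl, or set insecure-skip-tls-verify to true in ~/.kube/config. This disables certificate verification and is not recommended in production.

8. Insufficient resources

Symptom: Pods stay in Pending (cannot be scheduled), or the system frequently triggers the OOM Killer (which kills processes to free memory).

Steps:
- kubectl describe node <node-name> (view how the node's resources are allocated)
- kubectl top pod (view Pod resource usage; this requires metrics-server to be installed in the cluster)
- Free disk space (for example by removing old logs under /var/log/) or expand the disk.
- Set reasonable resources.requests (for example memory: "512Mi", cpu: "500m") to avoid requesting more than is needed.

9. Kernel modules not loaded

Symptom: the kubelet log reports errors such as sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables, or the network plugin does not work.

Steps:
- Load the modules for the current boot: sudo modprobe br_netfilter && sudo modprobe overlay
- To load them on every boot, add the module names to /etc/modules-load.d/kubernetes.conf (one module per line), then run sudo systemctl restart systemd-modules-load.

10. Version incompatibility

Symptom: kubeadm init or kubeadm join reports errors such as unsupported Docker version, or the kubelet version is incompatible with kube-apiserver.

Steps:
- Remove the incompatible version (for example sudo apt remove docker-ce) and install a version that meets the requirements.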
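
The node prerequisites described above (the overlay and br_netfilter modules, the two sysctl values, and swap being off) can be checked in one pass. The following is a minimal read-only sketch, not an official tool: the function names and the PROC_ROOT variable are hypothetical and exist only for this illustration. It assumes bash and reads values straight from /proc, so it changes nothing on the node.

```shell
#!/usr/bin/env bash
# Read-only preflight sketch for a Debian node: it reports problems, it does not fix them.
# PROC_ROOT is a hypothetical override (not a Kubernetes setting) so the checks can be
# pointed at a fake directory tree when testing.
PROC_ROOT="${PROC_ROOT:-/proc}"

# check_module <name>: succeed if the module is listed in $PROC_ROOT/modules.
# Modules compiled into the kernel are not listed there and are reported as not loaded.
check_module() {
  local name="$1"
  if grep -q "^${name} " "$PROC_ROOT/modules" 2>/dev/null; then
    echo "OK   module ${name} loaded"
  else
    echo "FAIL module ${name} not loaded (try: sudo modprobe ${name})"
    return 1
  fi
}

# check_sysctl <dotted.key> <expected>: compare the live value with the expected one.
check_sysctl() {
  local key="$1" expected="$2"
  local path="$PROC_ROOT/sys/${key//.//}"   # net.ipv4.ip_forward -> net/ipv4/ip_forward
  local actual
  actual="$(cat "$path" 2>/dev/null)" || actual="unreadable"
  if [ "$actual" = "$expected" ]; then
    echo "OK   ${key}=${actual}"
  else
    echo "FAIL ${key}=${actual} (expected ${expected})"
    return 1
  fi
}

# check_swap: $PROC_ROOT/swaps holds one header line plus one line per active swap area.
check_swap() {
  local active
  active="$(tail -n +2 "$PROC_ROOT/swaps" 2>/dev/null | grep -c . || true)"
  if [ "${active:-0}" -eq 0 ]; then
    echo "OK   swap is off"
  else
    echo "FAIL ${active} swap area(s) active (try: sudo swapoff -a)"
    return 1
  fi
}

# preflight: run every check, print one line each, return non-zero if any failed.
preflight() {
  local failed=0
  check_module overlay      || failed=1
  check_module br_netfilter || failed=1
  check_sysctl net.bridge.bridge-nf-call-iptables 1 || failed=1
  check_sysctl net.ipv4.ip_forward 1                || failed=1
  check_swap || failed=1
  return "$failed"
}
```

After sourcing the file on a node, calling preflight prints one OK or FAIL line per check and returns non-zero if anything needs attention. Reading from /proc rather than calling lsmod or sysctl keeps the sketch free of root requirements.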