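Kubernetes log management has to cover the whole chain: collection → storage → viewing/analysis → rotation/cleanup → monitoring and alerting. The concrete steps are as follows:
Log collection is the foundation. Choose among the common options according to cluster size and resource budget:
EFK consists of Elasticsearch (storage/indexing), Fluentd (collection/forwarding), and Kibana (visualization). It suits scenarios that need full-text search and complex analysis.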
Fluentd, deployed as a DaemonSet, reads files such as /var/log/containers/*.log (container logs) and /var/log/kubelet.log (kubelet logs) on each node and forwards them to Elasticsearch.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-logging
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.16
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.kube-system.svc.cluster.local"  # Elasticsearch service address
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
          resources:
            limits:
              memory: 500Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
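Once this configuration is applied, Fluentd automatically collects the logs of every container on the node and sends them to Elasticsearch. If cluster resources are limited, Filebeat (a lightweight log shipper) can be used in place of Fluentd. Filebeat is deployed as a sidecar container that shares a volume with the application Pod, collects the container's logs, and forwards them to Elasticsearch.
Example Pod configuration: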
apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
    - name: app
      image: payment:v1.2
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: filebeat
      image: docker.elastic.co/beats/filebeat:8.9
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
        - name: filebeat-config
          mountPath: /usr/share/filebeat/filebeat.yml
          subPath: filebeat.yml
  volumes:
    - name: logs
      emptyDir: {}
    - name: filebeat-config
      configMap:
        name: filebeat-config
The ConfigMap that configures Filebeat (pointing it at the Elasticsearch address) must be created beforehand.
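As a rough illustration of that ConfigMap, the sketch below assumes the application writes *.log files into the shared /var/log/app volume and that Elasticsearch is reachable over plain HTTP without authentication at the service address used in the Fluentd example. The input id app-logs is a placeholder chosen for this example; paths, scheme, and credentials would need adjusting to the actual cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Must match the configMap name referenced in the Pod's volumes section.
  name: filebeat-config
data:
  # The key must match the subPath mounted into the filebeat container.
  filebeat.yml: |
    filebeat.inputs:
      - type: filestream
        # Hypothetical id; any string unique among the inputs works.
        id: app-logs
        paths:
          # Assumed file pattern inside the shared emptyDir volume.
          - /var/log/app/*.log
    output.elasticsearch:
      # Assumption: HTTP endpoint without authentication or TLS.
      hosts: ["http://elasticsearch.kube-system.svc.cluster.local:9200"]
```

The ConfigMap has to live in the same namespace as the Pod and exist before the Pod is created, otherwise the sidecar's volume mount cannot be resolved.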
Logs need to be retained long term; common storage options:
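In Kibana, create an index pattern (for example k8s-logs-*), then use Discover to view logs in real time and Dashboard to build visual panels (such as error-log trends or a ranking of Pods by log volume).
Filter logs by field (for example namespace=prod, pod_name=payment-service).
kubectl logs: view a Pod's logs (example: kubectl logs -f payment-service-abcde -n prod follows the log in real time);
kubectl logs --previous: view the logs from before a container restarted;
kubectl logs -c <container-name>: view the logs of a specific container in a multi-container Pod.
Container logs are exposed by default under /var/log/containers on each node (symlinks that, with the Docker runtime, resolve to files under /var/lib/docker/containers). Configure rotation rules with logrotate so the log files do not grow too large.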
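Example /etc/logrotate.d/kubernetes-containers configuration:
/var/lib/docker/containers/*/*.log {
    # rotate once a day
    daily
    # keep 7 rotated files (7 days at daily rotation)
    rotate 7
    # compress rotated logs
    compress
    # postpone compressing the newest rotated file until the next rotation
    delaycompress
    # do not report an error if the file is missing
    missingok
    # do not rotate empty files
    notifempty
    # copy the log, then truncate the original in place, so the writer keeps its open file
    copytruncate
}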
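This configuration rotates Docker container logs daily and keeps the seven most recent rotated logs in compressed form, which saves disk space.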
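The logrotate rule above targets the Docker json-file path. On nodes that use a CRI runtime such as containerd, the kubelet itself rotates container logs, controlled by two KubeletConfiguration fields. A minimal sketch, with the values 50Mi and 7 chosen purely for illustration (the defaults are 10Mi and 5):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate a container's log file once it reaches this size (illustrative value).
containerLogMaxSize: 50Mi
# Maximum number of log files kept per container, including the active one
# (illustrative value).
containerLogMaxFiles: 7
```

Rotation here is driven by size rather than by time. The fields belong in the kubelet configuration file (on kubeadm clusters commonly /var/lib/kubelet/config.yaml), and the kubelet has to be restarted for a change to take effect.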
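Combine Prometheus and Alertmanager to monitor logs and raise alerts:
groups:
  - name: k8s-log-alerts
    rules:
      - alert: HighErrorLogs
        # increase() counts events over the 5m window, matching the description below.
        # The metric name depends on the exporter in use; substitute one that counts error-level log entries.
        expr: increase(elasticsearch_indices_indexing_slowlog_total[5m]) > 100
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Too many error logs in the K8s cluster (instance {{ $labels.instance }})"
          description: "More than 100 error logs within 5 minutes; investigate immediately"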
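The rule above only raises the alert; Alertmanager decides where it is delivered. A minimal routing sketch follows, assuming a hypothetical webhook endpoint (alert-webhook.example.internal) and hypothetical receiver names default and oncall. It sends alerts labeled severity: critical, such as HighErrorLogs, to the on-call receiver and leaves everything else on the default one.

```yaml
route:
  # Fallback receiver for alerts that match no child route.
  receiver: default
  # Batch alerts that share an alert name into one notification.
  group_by: ['alertname']
  routes:
    # Alerts carrying severity=critical go to the on-call receiver.
    - matchers:
        - 'severity="critical"'
      receiver: oncall
receivers:
  # Receiver with no integrations: alerts are accepted but not forwarded.
  - name: default
  - name: oncall
    webhook_configs:
      # Hypothetical endpoint; replace with the real notification target.
      - url: http://alert-webhook.example.internal/hook
```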
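Structure logs with fields such as timestamp, level, message, and pod_name to make later search and analysis easier.
Classify logs by level, DEBUG (debugging), INFO (routine), WARN (warning), ERROR (error), and set a different retention policy for each (for example, keep ERROR logs for 30 days and DEBUG logs for 7 days).
Use filter plugins (such as grok and mutate) to mask sensitive information in logs (such as bank card numbers and passwords).
Set resources.limits on the log collector (for example a 500Mi memory limit) so that heavy log volume does not cause an OOM on the node.
With the steps above, efficient log management can be achieved for a Kubernetes cluster in a CentOS environment, covering needs such as troubleshooting, performance tuning, and security auditing.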