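Debian Nginx Monitoring and Alerting Guide
Nginx's built-in stub_status module exposes real-time performance data and is the basis for lightweight monitoring.
Edit the Nginx configuration file (/etc/nginx/nginx.conf or a site configuration file) and add the following:
server {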
    listen 80;
    server_name localhost;
    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;  # allow local access only
        deny all;
    }
}
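Save the file and restart Nginx: sudo systemctl restart nginx. Then request http://localhost/nginx_status; the output contains:
- Active connections: the number of currently active client connections (including those in the Reading, Writing, and Waiting states);
- server accepts handled requests: total accepted connections, total handled connections, and total client requests;
- Reading/Writing/Waiting: connections where Nginx is reading the request header, writing the response, or holding an idle keep-alive connection.
A simple script can check the active connection count and send an email alert when it exceeds a threshold:
#!/bin/bash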
STATUS=$(curl -s http://localhost/nginx_status)
ACTIVE=$(echo "$STATUS" | awk '/Active/ {print $3}')
MAX_CONN=500  # maximum active connections threshold
# Skip the comparison if the status page could not be read.
if [ -n "$ACTIVE" ] && [ "$ACTIVE" -gt "$MAX_CONN" ]; then
    echo "High active connections: $ACTIVE" | mail -s "Nginx Alert" admin@example.com
fi
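The status page carries more than the active connection count used in the script above. As a minimal sketch, the helper below turns the page into key=value pairs and derives the number of dropped connections as accepts minus handled. The function name parse_stub_status is hypothetical, and the parsing assumes the standard four-line stub_status layout.

```shell
#!/bin/bash
# Hypothetical helper: parse stub_status text from stdin into key=value lines.
# Assumes the standard four-line stub_status layout.
parse_stub_status() {
    awk '
        /^Active connections:/ { print "active=" $3 }
        # The counters line holds exactly three integers: accepts handled requests.
        /^ *[0-9]+ +[0-9]+ +[0-9]+ *$/ {
            print "accepts=" $1
            print "handled=" $2
            print "requests=" $3
            print "dropped=" ($1 - $2)   # accepted but never handled
        }
        /^Reading:/ { print "reading=" $2; print "writing=" $4; print "waiting=" $6 }
    '
}
```

A possible invocation is curl -s http://localhost/nginx_status | parse_stub_status. A nonzero dropped value usually means a resource limit such as worker_connections was reached.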
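Add the script to cron (for example, to run every 5 minutes): */5 * * * * /path/to/script.sh
For distributed or large-scale environments, **Prometheus (metrics collection) + Grafana (visualization and alerting)** is the industry-standard approach.
Download and install nginx-prometheus-exporter: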
wget https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter-0.11.0.linux-amd64.tar.gz
tar -zxvf nginx-prometheus-exporter-*.tar.gz -C /usr/local/bin
chmod +x /usr/local/bin/nginx-prometheus-exporter
In the Nginx configuration, expose stub_status at the /metrics path (it must match the exporter's scrape URI):
location /metrics {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
}
Restart Nginx, then start the exporter in the background:
nohup /usr/local/bin/nginx-prometheus-exporter -nginx.scrape-uri=http://localhost/metrics > /dev/null 2>&1 &
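Before pointing Prometheus at the exporter, it can help to confirm that the exporter is actually serving metrics. The sketch below reads Prometheus text-format output and prints the value of one metric. The helper name metric_value is hypothetical, and the sample text is an illustrative capture rather than real exporter output.

```shell
#!/bin/bash
# Hypothetical helper: print the value of a label-less metric from
# Prometheus text-format output read on stdin. Prints nothing if absent.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Illustrative sample (not captured from a live exporter).
sample='# HELP nginx_up Status of the last metric scrape
# TYPE nginx_up gauge
nginx_up 1
nginx_connections_active 3'
printf '%s\n' "$sample" | metric_value nginx_up
```

Against a running exporter on the default port, the equivalent check would be curl -s http://localhost:9113/metrics | metric_value nginx_up, where a value of 1 indicates that the last scrape of the Nginx status page succeeded.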
Edit the Prometheus configuration file (/etc/prometheus/prometheus.yml) and add an Nginx job:
scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113']  # default exporter port
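Restart Prometheus: sudo systemctl restart prometheus
Then configure Grafana:
- Log in to Grafana (http://localhost:3000), go to Configuration > Data Sources, select Prometheus, and set the URL (http://localhost:9090);
- Go to + > Import and enter the official dashboard ID (such as 12708) to visualize the exported metrics. Note that the stub_status-based exporter provides only connection and request counters; response times and error rates have to come from another source, such as the access log;
- Go to Alerting > New alert rule, select a metric (such as nginx_http_requests_total), set a condition (such as rate(nginx_http_requests_total[5m]) > 1000), and configure a notification channel (such as Email or Slack).
Nginx logs (access.log/error.log) are essential for troubleshooting; the tools below support real-time monitoring and alerting. First, define a log format that records the request time: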
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$request_time"';
access_log /var/log/nginx/access.log main;
error_log /var/log/nginx/error.log;
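Once this format is active, the trailing $request_time field makes slow requests easy to spot. The following is a minimal sketch, assuming the log format above (request time as the last, quoted field and the request path as the seventh whitespace-separated field). The function name slow_requests is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "<seconds> <path>" for each access-log line on
# stdin whose request time exceeds the limit given as the first argument.
# Assumes the request time is the last field and is wrapped in double quotes.
slow_requests() {
    awk -v limit="$1" '{
        t = $NF
        gsub(/"/, "", t)                      # strip the surrounding quotes
        if (t + 0 > limit + 0) print t, $7    # $7 is the request path
    }'
}
```

For example, tail -n 1000 /var/log/nginx/access.log | slow_requests 1 would list requests in the most recent 1000 lines that took longer than one second.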
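Restart Nginx for the configuration to take effect. Some useful command-line checks:
- Watch 5xx responses in real time (status codes are written to the access log, not the error log): tail -f /var/log/nginx/access.log | grep "HTTP/1.1\" 5"
- List the 10 most requested URLs: awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -10
- Count requests per client IP over the most recent 1000 lines (sort cannot emit output while tail -f is still running, so a fixed window is used): tail -n 1000 /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -nr
GoAccess can turn the access log into an HTML report:
sudo apt install goaccess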
goaccess /var/log/nginx/access.log -o /var/www/html/report.html --log-format=COMBINED
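Open http://server_ip/report.html in a browser to view the report.
A script can also count 5xx responses and send an alert when they exceed a threshold:
#!/bin/bash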
# 5xx status codes are recorded in the access log, so count them there.
ERR_COUNT=$(grep -c "HTTP/1.1\" 5" /var/log/nginx/access.log)
MAX_ERR=5  # 5xx error threshold
if [ "${ERR_COUNT:-0}" -gt "$MAX_ERR" ]; then
    echo "High 5xx errors: $ERR_COUNT" | mail -s "Nginx Error Alert" admin@example.com
fi
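A raw count like the one in the script above cannot tell a busy hour from a failing one, so a ratio may be a more stable signal. The sketch below computes the percentage of 5xx responses among the lines it is given. It assumes access-log lines in the format shown earlier, where the status code is the ninth whitespace-separated field. The function name error_rate_5xx is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the percentage of 5xx responses among the
# access-log lines read from stdin, to one decimal place.
# Assumes the status code is the ninth whitespace-separated field.
error_rate_5xx() {
    awk '
        $9 ~ /^5[0-9][0-9]$/ { errors++ }
        END {
            if (NR > 0) printf "%.1f\n", 100 * errors / NR
            else print "0.0"
        }
    '
}
```

One way to use it is tail -n 1000 /var/log/nginx/access.log | error_rate_5xx, comparing the result against a percentage threshold instead of an absolute count.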
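Add it to cron (for example, to run hourly): 0 * * * * /path/to/script.sh
Alerting is the final link in monitoring, and thresholds should be set according to business requirements. Common alerting methods: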
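- Email: send with the mail command (requires Postfix or Sendmail to be configured), as in the scripts above: mail -s "Nginx Alert" admin@example.com
- Grafana: configure notifications under Alerting > Notification channels
A watchdog script can also restart Nginx automatically when the process is not running:
if ! pgrep nginx > /dev/null; then
    systemctl restart nginx
    echo "Nginx restarted at $(date)" >> /var/log/nginx_monitor.log
fi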
Use logrotate to compress and delete old logs periodically so that disk space is not exhausted (configuration file: /etc/logrotate.d/nginx).