Guide to Nginx Monitoring and Alerting on Debian
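To monitor Nginx, first enable its built-in stub_status module, which exposes basic connection metrics.
Edit the Nginx configuration (/etc/nginx/nginx.conf, or add the following to /etc/nginx/sites-available/default):
server {
    listen 80;
    server_name localhost;
    location /nginx_status {
        stub_status;          # Nginx older than 1.7.5 requires "stub_status on;"
        access_log off;
        allow 127.0.0.1;      # local access only, for security
        deny all;
    }
}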
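Restart Nginx to apply the change:
sudo systemctl restart nginx
Then request http://localhost/nginx_status (for example with curl http://localhost/nginx_status); the output looks similar to this:
Active connections: 3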
server accepts handled requests
100 100 200
Reading: 0 Writing: 1 Waiting: 2
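Since the status page is plain text, it can be turned into numbers with a few regular expressions. The following Python sketch parses the sample output shown above. The function name parse_stub_status and the dictionary keys are illustrative choices, not part of Nginx.

```python
import re

# Sample text copied from the status page output shown above
SAMPLE = """Active connections: 3
server accepts handled requests
 100 100 200
Reading: 0 Writing: 1 Waiting: 2
"""


def parse_stub_status(text):
    """Parse the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    # The three cumulative counters sit alone on one line, separated by spaces
    totals = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text)
    if not (active and totals and states):
        raise ValueError("unexpected stub_status format")
    accepts, handled, requests = map(int, totals.groups())
    reading, writing, waiting = map(int, states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


stats = parse_stub_status(SAMPLE)
# accepts normally equals handled; a positive difference means connections
# were dropped, for example because a resource limit was reached
dropped = stats["accepts"] - stats["handled"]
print(stats, "dropped:", dropped)
```

In this sample, Reading + Writing + Waiting adds up to the Active connections value, and no connections were dropped.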
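Key metrics:
Active connections: current number of active client connections (including those in the Reading, Writing, and Waiting states)
accepts: total number of accepted connections
handled: total number of handled connections (normally equal to accepts unless a resource limit was hit)
requests: total number of client requests
Reading / Writing / Waiting: connections currently reading the request header, writing the response, or idle in keep-alive

Prometheus is an open-source monitoring system that collects metrics using a pull model.
Download and unpack Prometheus: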
wget https://github.com/prometheus/prometheus/releases/download/v2.30.3/prometheus-2.30.3.linux-amd64.tar.gz
tar xvfz prometheus-2.30.3.linux-amd64.tar.gz
cd prometheus-2.30.3.linux-amd64
Edit prometheus.yml and add a scrape job for Nginx Exporter:
scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113']  # address of Nginx Exporter
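Start Prometheus:
./prometheus --config.file=prometheus.yml
The Prometheus web UI is then available at http://localhost:9090.

Nginx Exporter converts the stub_status metrics into a format that Prometheus can scrape.
Download, unpack, and start the exporter: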
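# Release archives use underscores in the file name; check the project's releases page if the download fails
wget https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
tar xvfz nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
# The archive unpacks the nginx-prometheus-exporter binary into the current directory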
./nginx-prometheus-exporter -nginx.scrape-uri=http://localhost/nginx_status
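The exporter listens on port 9113 by default. Because stub_status reports only aggregate counters, the metrics carry no per-status or per-method labels. Sample output:
# HELP nginx_http_requests_total Total http requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 200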
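The exporter serves the Prometheus text format: comment lines starting with # and one sample per line. As a rough illustration, the Python sketch below parses such text into a dictionary. The sample text is illustrative (the metric names are ones a stub_status-based exporter exposes, the values are made up), parse_metrics is a hypothetical helper, and the parser assumes that samples carry no trailing timestamp.

```python
# Illustrative exporter output; values are made up for the example
SAMPLE_METRICS = """\
# HELP nginx_connections_active Active client connections
# TYPE nginx_connections_active gauge
nginx_connections_active 3
# HELP nginx_http_requests_total Total http requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 200
# HELP nginx_up Status of the last metric scrape
# TYPE nginx_up gauge
nginx_up 1
"""


def parse_metrics(text):
    """Return {sample_name: value} for every sample line in the text.

    Lines starting with '#' (HELP/TYPE comments) and blank lines are skipped.
    Assumes no timestamp follows the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces stay intact
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


metrics = parse_metrics(SAMPLE_METRICS)
print(metrics)
```

Against a running exporter, the same text could be fetched from http://localhost:9113/metrics (assuming the default metrics path was not changed) and passed to the function.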
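Grafana visualizes the metrics stored in Prometheus and can also manage alert notifications.
Install and start Grafana (the grafana package is not in Debian's default repositories, so add Grafana's APT repository first, following the Grafana installation documentation):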
sudo apt update && sudo apt install -y grafana
sudo systemctl enable --now grafana-server
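Open http://localhost:3000 and log in with the default account admin/admin.
Add the data source: go to Configuration → Data Sources → Add data source and select Prometheus; set the URL to http://localhost:9090 and click Save & Test (it should report "Data source is working").
Import a dashboard: go to + → Dashboard → Import, enter the dashboard ID (12708, the official Nginx basic dashboard), and click Import.
The dashboard then shows real-time trends for key metrics such as Active connections, Requests per second, and 5xx error rate.

To define alert rules, create a rules.yml file for Prometheus and add the following rules: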
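groups:
  - name: nginx_alerts
    rules:
      - alert: High5xxErrorRate
        # 5xx error ratio above 1%. This needs a metric with a "status" label; the
        # stub_status-based exporter does not provide one, so a metrics source that
        # labels requests by status code (for example a log-based exporter) is required.
        expr: sum by (instance) (rate(nginx_http_requests_total{status=~"5.."}[5m])) / sum by (instance) (rate(nginx_http_requests_total[5m])) > 0.01
        for: 5m  # must hold for 5 minutes before firing
        labels:
          severity: critical
        annotations:
          summary: "Nginx 5xx error rate too high (instance {{ $labels.instance }})"
          description: "5xx share over the last 5 minutes is {{ $value | humanizePercentage }}, above the 1% threshold"
      - alert: HighRequestRate
        # more than 1000 requests per second
        expr: sum by (instance) (rate(nginx_http_requests_total[1m])) > 1000
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Nginx request rate too high (instance {{ $labels.instance }})"
          description: "Current request rate is {{ $value }} req/s, above the 1000 req/s threshold"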
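To make the two rule expressions more concrete, the Python sketch below approximates what rate() computes from raw counter samples and then applies the same thresholds. It is a simplification under stated assumptions: Prometheus also extrapolates to the edges of the time window, which is ignored here, and the sample values and the per_second_rate helper are made up for illustration.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (unix_timestamp, counter_value), sorted by time.
    A drop in value is treated as a counter reset (e.g. after a restart).
    Returns None when fewer than two samples are available.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so count 'cur' itself
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical samples scraped every 15 seconds over one minute
total_requests = [(0, 1000), (15, 1300), (30, 1600), (45, 1900), (60, 2200)]
error_requests = [(0, 10), (15, 18), (30, 25), (45, 33), (60, 40)]

request_rate = per_second_rate(total_requests)   # 1200 requests / 60 s
error_rate = per_second_rate(error_requests)     # 30 errors / 60 s
error_ratio = error_rate / request_rate

high_5xx = error_ratio > 0.01      # same threshold as High5xxErrorRate
high_rate = request_rate > 1000    # same threshold as HighRequestRate
print(f"req/s={request_rate:.1f} 5xx ratio={error_ratio:.3f} "
      f"High5xxErrorRate={high_5xx} HighRequestRate={high_rate}")
```

With these numbers the error ratio condition is met while the request rate condition is not. In Prometheus the condition must additionally stay true for the whole "for" duration before the alert fires.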
Then reference the rule file in prometheus.yml:
rule_files:
  - "rules.yml"
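Restart Prometheus so that the rules take effect.

To set up notifications in Grafana, go to Alerting → Notification channels and click New channel, then configure the notification method (e.g. Email or Slack):
Name: Email Alerts
Type: Email
To attach the channel to a rule, go to Alerting → Alert rules, find the rule created earlier (e.g. High5xxErrorRate), click Edit → Notifications, and select the matching notification channel (e.g. Email Alerts).
Note that notification channels apply to alert rules managed by Grafana; alerts that fire from Prometheus rule files are routed through Alertmanager.

Edit /etc/nginx/nginx.conf and add a structured log format (e.g. JSON):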
http {
    # Add these directives to the existing http block; each request is logged as one JSON object
    log_format json_analytics escape=json '{"time":"$time_iso8601","host":"$host","status":"$status","request_time":"$request_time","remote_addr":"$remote_addr","request":"$request"}';
    access_log /var/log/nginx/access.log json_analytics;
    error_log /var/log/nginx/error.log;
}
Restart Nginx to apply the configuration:
sudo systemctl restart nginx
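With the JSON format active, every access log line is a standalone JSON object, so it can be processed with a standard JSON parser. The Python sketch below summarizes a few made-up log lines that follow the json_analytics format. The summarize helper, the sample data, and the 1-second threshold are illustrative assumptions.

```python
import json
from collections import Counter

# Made-up lines in the json_analytics format, plus one malformed line
SAMPLE_LOG = '''
{"time":"2024-01-01T12:00:00+00:00","host":"example.com","status":"200","request_time":"0.012","remote_addr":"203.0.113.5","request":"GET / HTTP/1.1"}
{"time":"2024-01-01T12:00:01+00:00","host":"example.com","status":"500","request_time":"1.230","remote_addr":"203.0.113.5","request":"POST /api/payment HTTP/1.1"}
{"time":"2024-01-01T12:00:02+00:00","host":"example.com","status":"502","request_time":"0.450","remote_addr":"198.51.100.7","request":"POST /api/user/create HTTP/1.1"}
{"time":"2024-01-01T12:00:03+00:00","host":"example.com","status":"200","request_time":"3.450","remote_addr":"198.51.100.7","request":"POST /api/upload HTTP/1.1"}
not json
'''


def summarize(lines, slow_threshold=1.0):
    """Count requests per status class and collect slow requests.

    Returns (Counter like {"2xx": n, "5xx": m}, [(request, seconds), ...]).
    Lines that are empty or not valid JSON are skipped.
    """
    status_classes = Counter()
    slow = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        # The log format writes status and request_time as strings
        status_classes[entry["status"][0] + "xx"] += 1
        seconds = float(entry["request_time"])
        if seconds > slow_threshold:
            slow.append((entry["request"], seconds))
    return status_classes, slow


classes, slow = summarize(SAMPLE_LOG.splitlines())
print(dict(classes))
print(slow)
```

For a real log, the lines could be read from /var/log/nginx/access.log instead of the sample string.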
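Log fields:
status: HTTP status code (e.g. 200, 500)
request_time: request processing time in seconds
remote_addr: client IP address
request: the full request line, i.e. method, path, and protocol (e.g. GET /api/payment HTTP/1.1)

ngxtop is a real-time log analysis tool that helps locate abnormal requests quickly.
Install it with pip: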
sudo apt install -y python3-pip
pip3 install ngxtop
Show requests that returned a 5xx status:
ngxtop -i 'status >= 500' print request_path status request_time
Sample output:
Running for 10 seconds, 123 records processed: 12.3 req/sec
request_path status request_time
/api/payment 500 1.23
/api/user/create 502 0.45
Show requests that took longer than 1 second:
ngxtop -i 'request_time > 1' top request_path request_time
Sample output:
Running for 10 seconds, 456 records processed: 45.6 req/sec
request_path request_time
/api/upload 3.45
/api/report 2.12
Fail2Ban can watch the Nginx logs and automatically ban IP addresses that repeatedly send malicious requests.
Install Fail2Ban:
sudo apt update && sudo apt install -y fail2ban
Add the following to /etc/fail2ban/jail.local:
[nginx-http-auth]
enabled = true
filter = nginx-http-auth
action = iptables[name=HTTP, port=80, protocol=tcp]
logpath = /var/log/nginx/error.log
# ban after 3 failed attempts
maxretry = 3
# ban for 1 hour (value in seconds)
bantime = 3600
Restart Fail2Ban to load the jail:
sudo systemctl restart fail2ban
Check the jail status and the banned addresses:
sudo fail2ban-client status nginx-http-auth
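To illustrate the idea behind maxretry, the Python sketch below counts HTTP authentication failures per client in a few made-up error log lines and lists the addresses that reach the threshold. This is a simplified stand-in, not Fail2Ban's own filter: the regular expression, the offenders helper, and the sample lines are assumptions, and the time window that Fail2Ban also applies is ignored.

```python
import re
from collections import Counter

# Made-up error.log lines: three auth failures from one client, one from
# another, and one unrelated error that must not be counted
SAMPLE_ERRORS = '''
2024/01/01 12:00:01 [error] 1234#1234: *1 user "admin": password mismatch, client: 203.0.113.5, server: localhost, request: "GET /admin HTTP/1.1", host: "localhost"
2024/01/01 12:00:02 [error] 1234#1234: *2 user "admin": password mismatch, client: 203.0.113.5, server: localhost, request: "GET /admin HTTP/1.1", host: "localhost"
2024/01/01 12:00:03 [error] 1234#1234: *3 user "root" was not found in "/etc/nginx/.htpasswd", client: 203.0.113.5, server: localhost, request: "GET /admin HTTP/1.1", host: "localhost"
2024/01/01 12:00:04 [error] 1234#1234: *4 user "guest" was not found in "/etc/nginx/.htpasswd", client: 198.51.100.7, server: localhost, request: "GET /admin HTTP/1.1", host: "localhost"
2024/01/01 12:00:05 [error] 1234#1234: *5 open() "/var/www/html/missing" failed (2: No such file or directory), client: 203.0.113.5, server: localhost, request: "GET /missing HTTP/1.1", host: "localhost"
'''

# Simplified pattern for basic-auth failures; captures the client address
AUTH_FAILURE = re.compile(
    r'user "[^"]*"(?:: password mismatch| was not found in "[^"]*"), '
    r'client: ([0-9a-fA-F.:]+)'
)


def offenders(lines, maxretry=3):
    """Return the sorted client addresses with at least maxretry auth failures."""
    failures = Counter()
    for line in lines:
        match = AUTH_FAILURE.search(line)
        if match:
            failures[match.group(1)] += 1
    return sorted(ip for ip, count in failures.items() if count >= maxretry)


banned = offenders(SAMPLE_ERRORS.splitlines(), maxretry=3)
print(banned)
```

Only the client with three authentication failures reaches the threshold; the unrelated "open() failed" error from the same address is not counted.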
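Grade alerts by severity (e.g. critical, warning, info) so that important alerts are not drowned out.
A small watchdog script can restart Nginx automatically when its process is not running:
#!/bin/bash
# Restart Nginx if no nginx process is found, and log the event (run as root)
if ! pgrep -x nginx > /dev/null; then
    systemctl restart nginx
    echo "Nginx restarted at $(date)" >> /var/log/nginx_monitor.log
fi
Add the script to cron (e.g. to run every minute):
* * * * * /path/to/script.sh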
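With these steps in place, a Debian system has a complete Nginx monitoring and alerting setup that covers service status, performance metrics, and log anomalies in real time, so problems can be detected and resolved early and the service stays highly available.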