A Practical Guide to Troubleshooting RabbitMQ on CentOS
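Troubleshooting RabbitMQ works best when you follow one rule of thumb: read the logs first, then check the metrics, and finally verify with the command-line tools. Working in this order keeps diagnosis accurate and efficient.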
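Show the overall status of the node:
sudo rabbitmqctl status
List client connections with their peer address, port, and state:
sudo rabbitmqctl list_connections peer_host peer_port state
List queues with their message count, consumer count, and state:
sudo rabbitmqctl list_queues name messages consumers state
List user permissions on the default virtual host:
sudo rabbitmqctl list_permissions -p /
Show the ports and interfaces the node listens on:
sudo rabbitmq-diagnostics listeners
Run a basic health check (node_health_check is deprecated in recent releases, where rabbitmq-diagnostics check_running and rabbitmq-diagnostics check_alarms are the focused replacements):
sudo rabbitmq-diagnostics node_health_check
Break down memory use by category, in megabytes:
sudo rabbitmq-diagnostics memory_breakdown --unit MB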
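The queue listing can also be filtered automatically instead of being read by eye. The following is a minimal sketch, not a definitive tool: flag_queues is a hypothetical helper name, the default limit of 1000 messages is an arbitrary value chosen for illustration, and the input is assumed to be tab-separated lines of name, messages, and consumers.

```shell
# flag_queues [LIMIT]
# Reads tab-separated lines "name<TAB>messages<TAB>consumers" on stdin.
# Prints NO_CONSUMERS for queues that hold messages but have no consumer,
# and BACKLOG for queues whose message count exceeds LIMIT.
# LIMIT defaults to 1000, an assumed value for illustration only.
flag_queues() {
  limit="${1:-1000}"
  awk -F'\t' -v limit="$limit" '
    # Skip headers and any line whose counts are not plain integers.
    NF >= 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
      if ($3 + 0 == 0 && $2 + 0 > 0)
        print "NO_CONSUMERS\t" $1 "\t" $2
      else if ($2 + 0 > limit + 0)
        print "BACKLOG\t" $1 "\t" $2
    }'
}
```

One possible way to feed it, assuming the installed rabbitmqctl supports the --quiet and --no-table-headers options: sudo rabbitmqctl list_queues --quiet --no-table-headers name messages consumers | flag_queues 1000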
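After enabling the management plugin (rabbitmq-plugins enable rabbitmq_management), open http://<server IP>:15672 in a browser. The default username/password is guest/guest, which works only for connections from localhost. Core features include queue monitoring, such as spotting a ready message count that is too high, and consumer counts (a count of 0 means nothing is consuming the queue).
If the service will not start, check its status first:
sudo systemctl status rabbitmq-server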
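If the service is not running, try starting it with sudo systemctl start rabbitmq-server.
Next, read the log at /var/log/rabbitmq/rabbit@<hostname>.log. Common causes include:
Configuration errors: check the syntax of /etc/rabbitmq/rabbitmq.conf (or rabbitmq-env.conf) for problems such as port conflicts or wrong paths.
Erlang version mismatch: verify the installed version with erl -version.
Port conflicts: run sudo netstat -tulnp | grep 5672 (AMQP port) or sudo ss -tulnp | grep 15672 (management UI port) to see which process holds the port, then stop the conflicting process (for example, sudo systemctl stop <conflicting service>).
SELinux restrictions: switch to permissive mode temporarily (sudo setenforce 0) or persistently (sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config).
Disk space: check free space with
df -h /var/lib/rabbitmq/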
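This path is the data directory. When it is nearly full, clean up old logs or data (for example, sudo rm -rf /var/log/rabbitmq/*.old).
For connection problems (such as connection_closed_abruptly in the log), work through the following checks.
Network: run ping <client IP> and telnet <server IP> 5672 to confirm that the network path is reachable.
Resources: inspect memory use with
sudo rabbitmq-diagnostics memory_breakdown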
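A mem_used / mem_limit ratio above 0.8 means the node needs more capacity; the limit is the memory high watermark reported by rabbitmq-diagnostics status. Use df -h to check the disk: disk_free < disk_free_limit means space must be freed.
TLS certificates: check the validity period with
openssl x509 -in /path/to/cert.pem -noout -dates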
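Also make sure that the client trusts the CA certificate.
Flow control: a log entry containing flow control initiated means that flow control has been triggered. Confirm the node's state with sudo rabbitmq-diagnostics node_health_check (deprecated in recent releases, where rabbitmq-diagnostics check_alarms reports active resource alarms). High memory use (mem_used / mem_limit > 0.8) or low disk space (disk_free < disk_free_limit) triggers flow control; add capacity or free up space.
Consumers: list ready messages and consumer counts per queue with
sudo rabbitmqctl list_queues name messages_ready consumers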
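If messages_ready grows quickly while consumers is 0, no consumer is running. If consumers is nonzero, the consumers are too slow (optimize the consumer code or add instances).
For message backlog (a ready message count that is too high), first list the ready count of every queue:
sudo rabbitmqctl list_queues name messages_ready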
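The queues with the largest messages_ready values are the ones backing up. Then check consumer counts with sudo rabbitmqctl list_queues name consumers; a queue showing consumers=0 needs a consumer added or its consumer service repaired. Also compare messages_ready with messages_unacknowledged: if messages_ready keeps growing, consumption is too slow (optimize the consumer logic or increase parallelism).
For permission problems, list the permissions on the default virtual host with
sudo rabbitmqctl list_permissions -p /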
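Confirm that the user has configure, write, and read permissions on the virtual host it connects to. List the existing virtual hosts with sudo rabbitmqctl list_vhosts; if the virtual host has not been created, create it with sudo rabbitmqctl add_vhost <vhost_name>. Grant permissions with
sudo rabbitmqctl set_permissions -p <vhost_name> <username> ".*" ".*" ".*"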
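This grants all permissions; in production, grant only what each user needs.
For routine maintenance:
Configure log rotation in /etc/rabbitmq/rabbitmq.conf (for example, with the log.file.rotation.size and log.file.rotation.count settings) so that logs do not fill the disk.
Monitor metrics such as rabbitmq_node_mem_used and rabbitmq_queue_messages_ready, and set threshold alerts (for example, alert when memory usage exceeds 80%).
Back up the /var/lib/rabbitmq/mnesia directory regularly to avoid data loss. It contains the Mnesia database, which holds metadata such as queues and exchanges.
Keep RabbitMQ up to date (sudo yum update rabbitmq-server) to pick up fixes for known vulnerabilities and bugs.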
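The 80% memory threshold mentioned above can be expressed as a small shell check. This is a minimal sketch under stated assumptions: ratio_exceeds is a hypothetical helper name, and the used and limit values are assumed to come from your own monitoring or from the node's status output, in the same unit.

```shell
# ratio_exceeds USED LIMIT [THRESHOLD]
# Exits 0 when USED / LIMIT is strictly greater than THRESHOLD, else exits 1.
# THRESHOLD defaults to 0.8, matching the 80% alert level in this guide.
# A LIMIT of 0 or less never triggers, which avoids a division by zero.
ratio_exceeds() {
  awk -v used="$1" -v limit="$2" -v thr="${3:-0.8}" \
    'BEGIN { exit !(limit > 0 && used / limit > thr) }'
}
```

For example, with hypothetical values of 900 MB used against a 1000 MB limit, the following prints a warning: ratio_exceeds 900 1000 && echo "memory above 80% of limit"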