Prerequisites
Before starting, ensure you meet these requirements:
ntp
or chrony
).Step 1: Install SQL Server on All Nodes
Run these commands on every node to install SQL Server:
# Import Microsoft GPG key
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
# Add SQL Server repository (replace "22.04" with your Ubuntu version, e.g., "20.04")
sudo add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list)"
# Update package list and install SQL Server
sudo apt-get update
sudo apt-get install -y mssql-server
After installation, set the SA password and verify the service status:
# Set SA password (replace 'YourStrongPassword' with a secure password)
sudo /opt/mssql/bin/mssql-conf set-sa-password
# Start and enable SQL Server service
sudo systemctl start mssql-server
sudo systemctl enable mssql-server
# Check service status
sudo systemctl status mssql-server
Ensure the service is active (running) before proceeding.
Step 2: Configure SQL Server for Always On
Enable Always On features and create a database endpoint:
# Connect to SQL Server using sqlcmd
sqlcmd -S localhost -U SA -P 'YourStrongPassword'
# Enable Always On (run in sqlcmd)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'hadr', 1;
RECONFIGURE;
# Create an availability group endpoint (replace 'Node1', 'Node2' with node names/IPs)
CREATE ENDPOINT [Hadr_endpoint]
STATE=STARTED
AS TCP (
LISTENER_PORT=5022,
LISTENER_IP=(0.0.0.0) -- Allow connections from all nodes
)
FOR DATABASE_MIRRORING (
ROLE=ALL,
AUTHENTICATION=CERTIFICATE server_cert,
ENCRYPTION=REQUIRED ALGORITHM AES
);
-- Exit sqlcmd
GO
QUIT
This endpoint enables secure communication between replicas.
Step 3: Install Pacemaker and Corosync
Pacemaker manages cluster resources; Corosync handles node communication.
Install Packages:
sudo apt-get install -y pacemaker pcs fence-agents resource-agents mssql-server-ha
Set hacluster Password:
All nodes must have the same hacluster
password:
sudo passwd hacluster
Destroy Existing Clusters (if any):
Clean up residual configurations to avoid conflicts:
sudo pcs cluster destroy --all
Create a New Cluster:
Run on the primary node (replace ClusterName
and Node1 Node2
with your cluster name and node names/IPs):
sudo pcs cluster setup --name ClusterName Node1 Node2 --start --all --enable
Verify Cluster Status:
Ensure all nodes are online:
sudo pcs status
You should see all nodes in a running state.
Step 4: Configure Pacemaker for SQL Server AG
Integrate SQL Server with Pacemaker to manage failover.
Create SQL Server Login for Pacemaker:
This login allows Pacemaker to manage the AG:
USE [master];
GO
CREATE LOGIN [pacemaker_login] WITH PASSWORD = N'YourPacemakerPassword';
ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemaker_login];
GO
Save the credentials securely on all nodes:
echo 'pacemaker_login' >> ~/pacemaker-passwd
echo 'YourPacemakerPassword' >> ~/pacemaker-passwd
sudo mv ~/pacemaker-passwd /var/opt/mssql/secrets/passwd
sudo chown root:root /var/opt/mssql/secrets/passwd
sudo chmod 400 /var/opt/mssql/secrets/passwd
Create AG Resource:
Use the ocf:mssql:ag
agent to manage the AG:
sudo pcs resource create ag_cluster ocf:mssql:ag ag_name=MyAG meta notify=true
Create Virtual IP (VIP) Resource:
The VIP allows clients to connect to the AG via a single IP:
sudo pcs resource create virtualip ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24
Replace 192.168.1.100
with your desired VIP.
Configure Resource Dependencies:
Ensure the VIP starts only after the AG is running, and the AG promotes to primary on failure:
sudo pcs constraint colocation add virtualip ag_cluster-master INFINITY with-rsc-role=Master
sudo pcs constraint order promote ag_cluster-master then start virtualip
Verify Configuration:
Check the cluster status to confirm all resources are running:
sudo pcs status
Look for the ag_cluster
in master state and the VIP assigned.
Step 5: Test Failover
Validate that the AG fails over automatically or manually:
Simulate Primary Failure:
Stop SQL Server on the primary node:
sudo systemctl stop mssql-server
Check Cluster Status:
Run sudo pcs status
on any node. The AG should promote a secondary replica to master.
Recover Primary Node:
Restart SQL Server and verify it rejoins the AG as a secondary:
sudo systemctl start mssql-server
The cluster should automatically rebalance roles.
Step 6: Connect to the AG
Clients connect to the virtual IP (e.g., 192.168.1.100
) instead of individual nodes. For example, in SQL Server Management Studio (SSMS), use the VIP as the server name.
Optionally, create a DNS record (e.g., sqlag.example.com
) pointing to the VIP for easier access.
Troubleshooting Tips
ntp
or chrony
to keep nodes synchronized./var/log/syslog
) and SQL Server logs (/var/opt/mssql/log/errorlog
) for errors.pacemaker_login
has sufficient permissions (sysadmin role is recommended for simplicity).