A couple weeks ago, I went over automating your SQL Server builds. That’s the kind of process that will benefit many SQL Server administrators. But why stop there? I say that the more we can automate, the better. So now let’s talk about automating another aspect of building SQL Servers: availability groups.
Availability groups were the killer feature in SQL 2012 and saw more enhancements with SQL 2014. In my opinion, this is a key feature that allows SQL Server to step into the horizontally scalable world. The problem is that it is not an easy feature to set up or configure. Availability groups have a lot of pitfalls and gotchas to navigate. This is where automation comes in as our ally to build a consistent, repeatable process to deal with these hurdles. Remember, automation is primarily about consistency and we can achieve that by scripting out as much of our build process as possible.
As before, let’s talk about the process we’re going to automate:
- Build a Windows failover cluster consisting of two nodes.
- The two nodes will already have SQL Server installed.
- We will add a file share witness into the cluster (because an even number of quorum votes is bad).
- The cluster will be simple, so we will skip things like heartbeat networks and other more advanced configurations.
- Build an availability group across both nodes of the cluster.
- Availability group will use standard endpoints
- Both nodes will be configured for synchronous commit and automatic failover
- Create a listener with a static IP address.
Building the Cluster
To ease the scripting process, Microsoft provides several cmdlets in the Failover Cluster module. Using these, we can create our cluster.
Import-Module FailoverClusters New-Cluster -Name 'NC1701' -StaticAddress '192.168.10.100' -NoStorage -Node @('PICARD','RIKER')| Set-ClusterQuorum -FileShareWitness '\\hikarudc\qwitness'
As you can see, the cluster creation is pretty easy. Keep in mind this is an extremely simple cluster with none of the usual items we include in production. While you can use this process for your production environment, make sure you work with a clustering expert to define all the additional configurations you will need to build in to make your cluster stable. I also want to call out that I created the fileshare for the witness outside of this script as well.
Before we proceed, I want to validate the cluster. As with everything else we will use Powershell:
Test-Cluster -Name ‘NC1701’
Firewall Ports
An easy thing to overlook is the firewall ports for availability group communication. I know, I forget it all the time! We need to open up 1433 for SQL Server and listener communication and port 5022 for the availability group endpoints. Some quick Powershell resolves this for us:
Invoke-Command -ComputerName RIKER -ScriptBlock {New-NetFirewallRule -DisplayName 'SQL Server' -LocalPort 1433 -Direction Inbound -Protocol TCP} Invoke-Command -ComputerName PICARD -ScriptBlock {New-NetFirewallRule -DisplayName 'SQL Server' -LocalPort 1433 -Direction Inbound -Protocol TCP} Invoke-Command -ComputerName RIKER -ScriptBlock {New-NetFirewallRule -DisplayName 'AG Endpoint' -LocalPort 5022 -Direction Inbound -Protocol TCP} Invoke-Command -ComputerName PICARD -ScriptBlock {New-NetFirewallRule -DisplayName 'AG Endpoint' -LocalPort 5022 -Direction Inbound -Protocol TCP}
Building the Availability Group
Microsoft provides some cmdlets for building availability groups in the SQLPS module. The process is fairly straightforward, since we’re only declaring two synchronous nodes with automatic failover. If we were to have additional nodes, we would need to put in additional logic for determining which nodes would perform which roles.
Import-Module SQLPS -DisableNameCheckin $AGName = 'ENTERPRISE' $PrimaryNode = 'PICARD' $IP = '192.168.10.101/255.255.255.0' $replicas = @() $cname = (Get-Cluster -name $PrimaryNode).name $nodes = (get-clusternode -Cluster $cname).name $sqlperms = @" use [master]; GRANT ALTER ANY AVAILABILITY GROUP TO [NT AUTHORITY\SYSTEM]; GRANT CONNECT SQL TO [NT AUTHORITY\SYSTEM]; GRANT VIEW SERVER STATE TO [NT AUTHORITY\SYSTEM]; "@ foreach($node in $nodes){ Enable-SqlAlwaysOn -Path "SQLSERVER:\SQL\$node\DEFAULT" -Force Invoke-Sqlcmd -ServerInstance $node -Database master -Query $sqlperms $replicas += New-SqlAvailabilityReplica -Name $node -EndpointUrl "TCP://$($node):5022" -AvailabilityMode 'SynchronousCommit' -FailoverMode 'Automatic' -AsTemplate -Version 12 } New-SqlAvailabilityGroup -Name $AGName -Path "SQLSERVER:\SQL\$PrimaryNode\DEFAULT" -AvailabilityReplica $replicas $nodes | Where-Object {$_ -ne $PrimaryNode} | ForEach-Object {Join-SqlAvailabilityGroup -path "SQLSERVER:\SQL\$_\DEFAULT" -Name $AGName} New-SqlAvailabilityGroupListener -Name $AGName -staticIP $IP -Port 1433 -Path "SQLSERVER:\Sql\$PrimaryNode\DEFAULT\AvailabilityGroups\$AGName"
There are a couple cmdlets we’re using to make this all work:
- Enable-SQLAlwaysOn enables Always On with the SQL Server service. This must be done before we can create our AGs.
- New-SqlAvailabilityReplica creates a SMO object that represents our node for the availability group. We signify that we are creating SMO objects by using the -AsTemplate switch. Without that switch, the cmdlet would try creating the actual replica, which we can’t do since the availability group doesn’t actually exist at this point.
- New-SqlAvailabilityGroup requires two items: the primary node the availability group will be created on (declared as a SQL provider path) and the collection of replica nodes. Running the cmdlet will create the availability group and join the replicas to it.
- Join-SqlAvailabilityGroup is used to join the replicas to the availability group. Weirdly enough, when you create the availability group, it won’t join the other replicas, so we have to take an additional step to join the non-primary nodes.
You’ll note we wrap the replica process in a foreach loop built on the Get-ClusterNode output. I use this approach so that, no matter how many nodes my cluster contains, I can add them all without explicit calls to the node names. The loop will add the SMO objects to a collection that will eventually represent all the nodes for the availability group (in this case, only two).
A gotcha to call out is the SQL Statement I execute on each node. When working on this, I kept encountering error 41131. It surprises me, because I believe these permissions should be granted when you enable Always On for the service. However, most of the time this does not happen, so I’ve included the SQL Statement to guarantee that my permissions are correct for creating the availability group
Once we’ve created the availability group, we need to create the listener. To do this, we need the listener name, the port we will use, the static IP that will be assigned to the listener name, and the availability group (declared as the provider path). To ease the implementation, I use the availability group name as the listener name and port 1433. This actually takes the longest of the steps, because the cmdlet has to validate the IP isn’t already in use, but as long as the address is available your listener will be created and your availability group built.
At this point, you should have a functional availability group and can add databases to it.
Another approach
The interesting thing about these cmdlets is they all work using the SMO and related objects. That means that if you want a more controlled approach where you get down and dirty with the code, you can go that route. Thomas Stringer(@sqllife) has a great post on this method and I actually was going down that route originally.
The question I had to ask myself was if it was worth re-inventing the wheel for my availability group builds. I’m not a big fan of how the SQLPS cmdlets rely on the SQL Server provider. Additionally, if you build your script using Thomas’ method, your script will work anywhere you have .Net and doesn’t rely on you installing the SQL Server client tools to get SQLPS. However, using the SMO/.Net method means I have to do a lot more work to get my script running, where the SQLPS cmdlets have taken care of most of that work for me.
Which approach should you use? That’s for you to decide, but I encourage you to look at both methods and evaluate which approach works better for you. When all was said and done, I went with the SQLPS cmdlets because it meant I didn’t have to spend a lot of time writing additional code.
Just as with our SQL Server build process, we’re building consistency and repeatability into our environment. The added benefit is, using this script, I can create my availability group in about a minute. While I may not be creating availability groups often, my script ensures that when I do create them I can maintain my build standards and keep my systems aligned.
[…] Building Availability Groups with #Powershell – Mike Fal (Blog|Twitter) […]
[…] Curah) SSISDB: Why bigger isn’t always better (Added to SQL Server Integration Services Curah) Building Availability Groups with #Powershell Does rebuild index update statistics? Use PowerShell to Create ZIP Archive of Folder SQL Server DML […]
[…] that’s pretty clear what’s going on…quick internet search to find a fix yielded this article from Mike Fal which points to this Microsoft article detailing the issue and […]
Brilliant article Mike…
Just wondering if you guidelines (or article) around automation of failover/failback of Availability group nodes.
Thanks !!!