Sql-server – How to update SQL Server 2005 clustered instances

clustering, failover, service-pack, sql-server

We have a 3 node cluster with an instance running on each.
Windows 2003 R2 32bit (16GB) / SQL Server 2005

I want to bring them all up to build 9.00.5266 (SP4 + CU3). One node is already updated.
The links and articles I have found haven't been very helpful in terms of installation order and gotchas.

Our test cluster nodes were already at 9.00.5266. (My predecessor did that before I arrived).

I updated the test cluster instances to 9.00.5292 (that cluster has only 2 nodes).
Although I got there, this did not go as expected...

The plan was :

1. Move all instances to node2
2. Update the instances on node1
3. Reboot node1
4. Move all instances to node1
5. Update the instances on node2
6. Reboot node2
7. Move an instance to node1

I got to step 5. Then the installer said I had to use the active node. That means that the instance must go offline.

Is this a pitfall of SQL Server 2005 or am I doing this wrong?

Why did the first update work?

There are only a few hotfixes available after build 9.00.5292 (all via links from MS upon request), so I can't experiment any further without reinstalling the test cluster.

I'm sure there are a couple of DBAs out there who have experience with updating SQL Server 2005 clustered instances.

Most helpful links found:

SQL Server failover cluster rolling patch and service pack process (2008+ only; no link for 2005)
How to install SQL Server 2005 SP3

Best Answer

I think this is a pitfall. As I recall, SQL Server 2005 and older versions require the update to be run on the active node. SQL Server 2008 and later versions allow passive-node updates, as in the walkthrough you described. A posting from Linchi Shea explains it well.

Related Solutions

Sql-server – Windows cluster failed over, but SQL instances didn’t move

Clustering is complex, and there are lots of moving parts (no pun intended). Let me try to break this down into more manageable chunks:

From a terminology perspective, there's your Windows Server Failover Cluster (WSFC), and your SQL Server Failover Cluster Instances (FCI). I try to avoid saying "Cluster" and use these acronyms to avoid ambiguity.

Quorum: The quorum is the number of votes necessary to transact business on your WSFC. Depending on your WSFC configuration, voters can be nodes (servers), a drive, or a file share. You need more than 50% of your votes in order for the WSFC to be online. If you lose 50% or more of your voters, then the WSFC and all clustered services (including your FCI) will go offline and not come back until you have (or force) quorum.

In your configuration, you have two nodes, and one file share for a total of three votes. Any one of those voters can go offline. When you lost the file share, you still had two nodes online, so your WSFC and all clustered services stayed online.
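The majority rule above can be checked with a few lines of arithmetic. The following is only a sketch in Python: the function names `has_quorum` and `tolerated_failures` are made up for illustration and are not part of any cluster tooling, and it assumes every voter carries exactly one vote and that the cluster does not adjust votes on its own.

```python
def has_quorum(online_votes: int, total_votes: int) -> bool:
    """Return True when strictly more than half of all votes are online."""
    if not 0 <= online_votes <= total_votes:
        raise ValueError("online_votes must be between 0 and total_votes")
    # Integer form of online / total > 0.5; avoids floating-point comparison.
    return 2 * online_votes > total_votes


def tolerated_failures(total_votes: int) -> int:
    """Largest number of voters that can be lost while keeping quorum."""
    if total_votes < 1:
        raise ValueError("total_votes must be at least 1")
    return (total_votes - 1) // 2


# Two nodes plus one file share: three votes in total (hypothetical setup
# mirroring the configuration described above).
TOTAL = 3
print(has_quorum(online_votes=2, total_votes=TOTAL))  # file share lost -> True
print(has_quorum(online_votes=1, total_votes=TOTAL))  # one node left -> False
print(tolerated_failures(TOTAL))                      # -> 1
```

With an even number of voters, for example two nodes and no file share, losing one voter leaves exactly 50% online, which is not a majority, so `has_quorum(1, 2)` is `False`.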

Cluster Owner/Host Server: When you say that "Node2 was now specified as the active node by Windows", I suspect you are referring to the "Current Host Server" for the cluster. So what is that?

Your WSFC has a network name and an IP address. That name & IP has to be tied to a machine that is part of your cluster. More specifically, it can be tied to any one machine in your cluster. This is part of your WSFC, but not your FCI.

In your scenario, you have three FCIs on a two-node WSFC. It would be perfectly valid to have one FCI on Node1, and two FCIs on Node2. And the "Current Host Server" for the WSFC could be either node. SQL Server won't care.

So what happened: As you said, there were no adverse effects on the databases. I'd expect that, because SQL Server isn't tied to that WSFC host server. I don't think I would have expected the host server to move when the file share failed, but I'd let your Windows guys dig into that more. From a SQL perspective, everything worked as expected.

Sql-server – SQL Server 2012 failover cluster node won’t start cluster services

I have had not exactly the same but similar issues described here, here and here.

What can I do to start cluster service on node1 back on?

When a node finds itself alone in a Windows Server failover cluster, it loses quorum.

Then, to start the Windows failover cluster service without quorum, you need to force it to start.

This is very important, because generally you would have something like this:

Windows cluster name: the name that all the applications try to connect to. Let's say it is called the_server. In reality, the_server does not exist; what exists is node1 and node2 (plus maybe a quorum disk or shared storage for quorum purposes). So even if you have only one node left, you need the Windows failover cluster service running, so that all applications can find the_server.

The way I would do that is using PowerShell:

Import-Module FailoverClusters  

$node = "AlwaysOnSrv02"  
Stop-ClusterNode -Name $node  
Start-ClusterNode -Name $node -FixQuorum  

(Get-ClusterNode $node).NodeWeight = 1  

$nodes = Get-ClusterNode -Cluster $node  
$nodes | Format-Table -property NodeName, State, NodeWeight

This is described in detail here:

Force a Windows Server failover cluster to start without a quorum

To force a cluster to start without a quorum:

Start an elevated Windows PowerShell via Run as Administrator.
Import the FailoverClusters module to enable cluster cmdlets.
Use Stop-ClusterNode to make sure that the cluster service is stopped.
Use Start-ClusterNode with -FixQuorum to force the cluster service to start.
Use Get-ClusterNode to set the node's NodeWeight property to 1, the value that guarantees that the node is a voting member of the quorum.
Output the cluster node properties in a readable format.

First, open the Failover Cluster Manager application and have a look at what we have got:

This is what it looks like:

Check the servers inside the nodes:

This is the normal way to manage the clustering services.

However, when there is no quorum we need to force the service to start. For that you need to follow this link.

Just don't forget how to run the PowerShell:

Run this script command by command. First this:

Import-Module FailoverClusters

Then

$node = "SQLPROD2"  
#Stop-ClusterNode -Name $node  
Start-ClusterNode -Name $node -FixQuorum  

(Get-ClusterNode $node).NodeWeight = 1  

$nodes = Get-ClusterNode -Cluster $node  
$nodes | Format-Table -property NodeName, State, NodeWeight

And all is good, with only one node up and running, until we add a new node (with no downtime, and no application upset or failing to connect to the_server).
