Sql-server – Remove Local RAID Drive from MS SQL Windows Cluster

sql-server, storage, tempdb

We recently stood up a production SQL Server 2014 Enterprise on a two-node Windows Failover cluster (Active/Passive). We had some SSDs lying around and decided to use them for the tempdb of one of the two instances hosted on this cluster. The thought was local storage plus SSD = win.

We ran DiskSpd, SQLIO, etc. to test this setup before going into production, and everything checked out OK. However, once we went live, we started to see significant I/O wait issues during (and only during) our nightly DB integrity check.

After trying a few things in an attempt to resolve the issue, we decided to move the tempdb off to our SAN storage, which cleared up the I/O wait issues.

We now want to remove the SSD from each node. Before we do so, I want to make sure I understand the process correctly.

These were local disks so they were not assigned as a cluster resource.

From what I can tell, now that the tempdb has been moved, we simply need to remove the logical drive from the OS (i.e. take it offline and delete it), then delete the virtual disk from the RAID controller.

Is it this simple, or is there something more that needs to be done in SQL Server or in the cluster configuration?

I have seen articles about removing shared storage, but nothing touching on local storage, and since this is our first cluster, I want to make sure there isn't anything hidden that will cause downtime or otherwise ruin our day.

Thanks

Best Answer

SSDs are great in general, but they are much better suited to read-heavy applications, while tempdb is a mix of reads and writes, and its nature generates a consistent churn, which is not good for an SSD. But it sounds like you've already gotten your tempdb off the SSD, so that step is already done.
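Before pulling the disks, it may still be worth confirming that nothing in the instance references the SSD's drive letter. Below is a minimal sketch of that check, not a definitive procedure. It assumes you have already exported the file locations (for example with the query shown in the comment, which reads the standard `sys.master_files` catalog view) into a list of `(database, path)` pairs. The helper name `files_on_drive`, the drive letters, and the sample paths are all hypothetical.

```python
from pathlib import PureWindowsPath

# A query one could run on each instance to list database file locations.
# sys.master_files and its physical_name column are standard in SQL Server:
#   SELECT DB_NAME(database_id) AS db, physical_name FROM sys.master_files;

def files_on_drive(rows, drive_letter):
    """Return the (db, path) pairs whose path is on the given drive letter.

    rows         -- iterable of (database_name, physical_path) tuples
    drive_letter -- e.g. "T", "T:" or "T:\\" (case-insensitive)
    """
    # Normalise the input to the "T:" form that PureWindowsPath.drive returns.
    target = drive_letter.rstrip(":\\").upper() + ":"
    return [
        (db, path)
        for db, path in rows
        if PureWindowsPath(path).drive.upper() == target
    ]

# Hypothetical sample: T: is the local SSD volume, S: is the SAN volume.
sample_rows = [
    ("tempdb", r"S:\TempDB\tempdb.mdf"),
    ("tempdb", r"S:\TempDB\templog.ldf"),
    ("StagingDB", r"T:\Data\staging.mdf"),
]

leftovers = files_on_drive(sample_rows, "T:")
for db, path in leftovers:
    print(f"{db} still has a file on the SSD: {path}")
```

If the list comes back empty for both instances, nothing recorded in SQL Server's file metadata points at the drive any more.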

I've not tried swapping out a local SSD for a SAN drive, but I've been involved in a migration that did the process in reverse (swapped a SAN drive for a local SSD). We failed over to a virtualized server during the process (the main server was a physical server), but it was very straightforward. Had we known, we might have been able to swap in the SSD, do the entire process in a 30-minute window, and skip the failover altogether. But the policy at my current work is that any downtime longer than 30 minutes needs to be a failover.

So, TL;DR: it was very easy, but I wasn't the person responsible for doing the physical drive swap, only for failing over SQL Server and then doing the work to utilize the new SSD, which was reassigned the drive letter that we had originally used for data.