SQL Server – Log and Data Drive Configuration in a VM Environment

sql-server transaction-log

Does anyone have a good feel for how to arrange the underlying disk configuration for a VM (Hyper-V) implementation of SQL Server versus installing straight onto the hardware?

For a native install, best practices say to put the logs on separate spindles from the data spindles, but what about in a virtual environment? Since the VM has virtual disks, we could create separate log and data .vhd files and put those on separate spindles as before, but why not just stripe all the available disks and put the log and data virtual drives on the same set? In theory, the more spindles in the stripe, the faster the response and the more IOPS, so we could put both files on the same set. The host machine would be acting as a sort of SAN.

The server we have is moderately busy, with no problems with IO performance (logs and data are currently on separate spindles), CPU, or memory pressure. The goal is to have more flexibility in maintenance: being able to clone or move the VM elsewhere, and possibly put another VM on the box to better utilize the hardware. It seems that creating one large striped drive in the host machine would give us full use of all the available drive space without much impact on performance. The alternative is losing out on about 1/3 of the drive space just to hold smallish log files.

Server config is Windows Server 2012 R2 and SQL Server 2016. Either 6 drives in RAID 10 for logs, tempdb, and data, or 2 drives in RAID 1 for the logs plus 4 drives in RAID 10 for tempdb and data.
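For illustration, here is a minimal sketch of what the file placement could look like inside the guest under the split-array option, using hypothetical drive letters and a hypothetical database name (E: for data, L: for logs, T: for tempdb):

```sql
-- Hypothetical layout: E: = data (RAID 10), L: = logs (RAID 1), T: = tempdb.
-- Put a user database's data and log files on separate virtual drives.
CREATE DATABASE SalesDb
ON PRIMARY
    (NAME = SalesDb_data, FILENAME = N'E:\SQLData\SalesDb.mdf', SIZE = 10GB, FILEGROWTH = 1GB)
LOG ON
    (NAME = SalesDb_log,  FILENAME = N'L:\SQLLogs\SalesDb.ldf', SIZE = 2GB, FILEGROWTH = 512MB);
GO

-- Point tempdb at its own drive; the move takes effect after the instance restarts.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = N'T:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = N'T:\TempDB\templog.ldf');
GO
```

Either RAID layout supports this; the question is only whether those drive letters map to separate physical spindle sets or to slices of one big stripe.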

Many thanks!

Best Answer

It really depends on what's best for your situation, as long as you follow some guidelines. We keep the transaction logs on separate disks because our RAID 6 array often doesn't give us the IOPS we want during peak usage, though many times it does. It depends on your peak IOPS requirements, future growth, and file management. Note that these best practices were created when 7,200 and 10,000 RPM magnetic disks were still expensive.

Isolation, Throughput, Management, and Tail-of-Log Backups

One issue with putting it all on one big array is that different workloads will now affect your log files where before they wouldn't. If someone does a SELECT * on a huge table and it has to be loaded into memory, you're now fighting for disk resources. All of a sudden you have a different workload in the mix when you're trying to troubleshoot what happened.
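One way to see that contention in practice (a sketch against the standard file-stats DMV, not something from the original answer) is to compare cumulative IO stalls per file; when log writes and big scans share an array, the log file's write stalls tend to climb along with the data files' read stalls:

```sql
-- Cumulative reads, writes, and IO stall time per database file since the
-- instance started; high io_stall_write_ms on a log file that shares a disk
-- with busy data files points at exactly this kind of contention.
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.num_of_reads,
    vfs.num_of_writes,
    vfs.io_stall_read_ms,
    vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
ORDER BY vfs.io_stall_write_ms DESC;
```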

However, perhaps you have really good SSDs and your peak writes won't really matter. Another consideration is that you can configure write cache per array, so you could dedicate an entire cache to the log files. Again, if your workload is on SSDs, or you don't have those caching options, that might not matter. Another reason to have your own array was to short-stroke the disks, which was very helpful with magnetic disks. Remember that data fragmentation can affect the log file too. Sure, VLFs (logical fragmentation) are easy to take care of, but what happens when the underlying disk decides to move data around, or you get fragmentation from how the files are growing? In many cases that won't cause a problem, and in some it will.
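As a quick check on the VLF side, a sketch like the following counts VLFs for a placeholder database; sys.dm_db_log_info is available from SQL Server 2016 SP2 onward, and older builds can get the same row-per-VLF output from DBCC LOGINFO:

```sql
-- sys.dm_db_log_info returns one row per VLF, so counting rows gives the
-- VLF count; 'SalesDb' is a placeholder database name.
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID(N'SalesDb'));
```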

Also, note that if anything happens to that array, physical OR logical, you can kiss tail-of-the-log backups goodbye. If your data-loss requirements are OK with that, then make sure to get it in writing.
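To make that concrete, a tail-of-log backup only works if the log file's drive is still readable after the data files are gone; a sketch with placeholder names and paths:

```sql
-- Capture the tail of the log after the data files are lost, assuming the
-- log drive (L: here, a placeholder) survived and is still readable.
BACKUP LOG SalesDb
TO DISK = N'L:\Backups\SalesDb_tail.trn'
WITH NO_TRUNCATE, NORECOVERY, INIT;
-- If data and log shared one failed array, this backup can't happen and
-- everything since the last log backup is lost.
```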