Running SQL Server on VMWare with Multiple VMs

sql serversql server 2014vmware

Excuse the long question, this is my first on the board.

I have just joined an organisation that is running a mission critical SQL server 2012 on a VMWare host along with another 9 (at least) VM's (web servers and batch processing servers mainly) on the same physical box.

The Physical box has 6 CPU's with 64GB of Memory at the moment. The disks are tiered so that tier 1 is pure SSD, tier 2 is SAS with SSD cache, and tier 3 something slower. I believe failover is provided at the VM level. It is not set up to reserve any CPU's or memory for the SQL server.

Our main bespoke application system suite runs the company and is a Tier 1 application which has 250GB data spread across multiple databases.

We do not have a massive IT team, and no one who is experienced at the VM and SQL level. Therefore we have hit a bit of an impasse.

Myself (with 25 years development experience) and our Experienced Contract DBA believe that a much better solution which would be simpler and easier to manage would be to avoid the VM route for the database boxes, or to use HyperV and allocate specific resources to the SQL server.

The SQL server (and it's clustered counterpart when we implement it) should live on separate physical boxes, or properly configured HyperV instances. One of the reasons is that the DBA wants to horizontally stripe some of the larger tables, and our infrastructure team have said separate drives per NDB file are not possible with the current setup. We could put the current years data on tier 1 and older data on tier 2.

From reading VMWare best practices, the host should really only contain instances of SQL server anyway, however we have all sorts of VM's installed there. You should also assign specific CPU's to HyperV CPU's for sole SQL use.

As you can imagine, when a slowdown occurs, at the moment, it's very difficult to identify the exact problem as VMware is spreading usage where it wants to.

Our Infrastructure team maintains that this is a valid configuration, but to be honest, myself and our DBA in our 40 odd years of experience have never seen anything like it. The solution we recommend needs to last for at least 3 years before we start looking at the configuration again.

My questions are (and treat this as if cost is not a major issue)

Is the current configuration good design/best practice?
Are there any benefits to the current configuration?
Would we not be better having two physical clustered SQL boxes at different sites and perhaps a mirrored VM host with web servers and possibly some service machines on them?
Any help or suggestions would be seriously appreciated.

Best Answer

There is nothing inherently wrong with the current configuration however you have not included info that would be relavent, such as whether the host is overcommitted on RAM/CPUs. What problem are you trying to solve - performance and/or HA/DR? Assuming you are not overcommitted:

  • Make sure the SQL Server instance(s) have sensible max memory settings. Ensure resource is allocated to accomodate that. If money is no object, use resource pools.
  • VMware (in my opinion) is a better choice than HyperV, but it isn't a big deal which you choose.
  • If you have enough memory to hold all the databases on the instance(s) which are virtualized, your main concern after that is CPU pressure, which is easy to measure. If you don't (which seems to be the case from your description), you're relying on storage for IO much more, so using the SSDs is obviously a good approach.
  • I'd suggest the contractor DBA you are working with is coming up with ideas designed to keep him/her in a job as opposed to solving a problem. Either that or they have no experience of VMware. Or both. You can still allocate resources in VMware in much the same way as with HyperV. Striping tables is not necessarily a performance improvement, depending on the nature of your storage array and the IO requirements of the other VMs.
  • If your SQL instance is Enterprise edition, you ought to use a physical box.
  • WHen you say "6 CPUs" are you talking about an 8 socket box with unpopulated sockets or a single socket populated with 6 cores? If it is the latter, your performance issues will almost certainly be caused by processor availability, as VMware will only use all cores allocated to a VM once they all become available.

Bottom line - it seems that the host is under resourced for memory and CPU cores, which won't go away with HyperV. If you have need for more than 128GB of memory (based on amount of data in all databases) a physical box for the SQL server would be a good choice.

I wouldn't bother with anything fancy like table partitioning until you have sorted your resource issues.