SQL Server: the impact on the Cost-Based Optimizer (CBO) when implementing iSCSI SAN solid-state drives

Tags: configuration, filegroups, optimization, partitioning, sql-server

A Microsoft TechNet article recommends creating a secondary filegroup and making it the default (see the reference below). The secondary filegroup should contain several files, say four, each placed on a separate disk. A colleague adds the rule of thumb that the number of files should equal the number of CPU cores.
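For concreteness, here is a minimal T-SQL sketch of that recommendation. The database name (LogDb), logical file names, sizes, and drive paths are all hypothetical placeholders:

```sql
-- Create a SECONDARY filegroup with four files on separate disks and make
-- it the default, so new user objects land there instead of PRIMARY.
-- All names, sizes, and paths below are placeholders.
ALTER DATABASE LogDb ADD FILEGROUP [SECONDARY];

ALTER DATABASE LogDb ADD FILE
    (NAME = 'LogDb_Data1', FILENAME = 'D:\Data\LogDb_Data1.ndf', SIZE = 64GB),
    (NAME = 'LogDb_Data2', FILENAME = 'E:\Data\LogDb_Data2.ndf', SIZE = 64GB),
    (NAME = 'LogDb_Data3', FILENAME = 'F:\Data\LogDb_Data3.ndf', SIZE = 64GB),
    (NAME = 'LogDb_Data4', FILENAME = 'G:\Data\LogDb_Data4.ndf', SIZE = 64GB)
TO FILEGROUP [SECONDARY];

ALTER DATABASE LogDb MODIFY FILEGROUP [SECONDARY] DEFAULT;
```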

My understanding is that this setup is ideal for mechanical spinning disks, where streaming data from multiple drive heads in parallel offsets how much slower a spinning disk is than a solid-state drive. Is this understanding correct?

If yes, then my question is whether the cost-based optimizer takes the newer solid-state drives into account. It seems the performance bottlenecks of spinning hard drives go away when switching to solid-state drives. Our IT Operations group has informed me that, while I am currently allocated one virtual drive, the data is really stored on an iSCSI SAN with multiple solid-state drives.

This question is aimed at finding the optimal setup for a large database of this magnitude:

  1. Should I have a default secondary filegroup with just one file?
  2. Should I have a default secondary filegroup with a number of files equal to the number of CPU cores? (A quick way to check the current counts is sketched after this list.)
  3. Is there a performance boost from partitioning a database table when using solid-state drives?
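Regarding question 2, a quick way to compare the server's logical core count with the number of data files per filegroup (run in the database in question) is:

```sql
-- Logical cores visible to SQL Server.
SELECT cpu_count AS logical_cores
FROM sys.dm_os_sys_info;

-- Data files per filegroup in the current database (log files are excluded
-- because they do not belong to any filegroup).
SELECT fg.name AS filegroup_name,
       COUNT(*) AS data_files
FROM sys.filegroups AS fg
JOIN sys.database_files AS df
    ON df.data_space_id = fg.data_space_id
GROUP BY fg.name;
```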

The current project I am working on requires a database that will scale to a few terabytes and store massive amounts of log data. A one-week sample has approximately 150 million records, and we need to store a rolling 3 years of logs. So I am now looking at long-running queries used to find data. I have tuned the indexes to the point where nearly all of the work is attributed to non-clustered index seeks, and the optimizer does not recommend adding a missing index.
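One way to double-check that last point is to query the missing-index DMVs directly; if the optimizer has recorded no missing-index suggestions for the current database, this returns no rows:

```sql
-- Missing-index suggestions accumulated since the last restart for the
-- current database; an empty result backs up the "no missing index" claim.
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID();
```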

Note

Licensing for Microsoft SQL Server is currently per CPU core, so there is sensitivity to throwing more cores at the problem, especially if that will not boost performance.

Additionally, I am currently developing on SQL Server 2014, but will be migrating to SQL Server 2017 for both development and production.

Update 1

The project will load logs nightly, and I expect few (perhaps no) updates or deletes because the logs are not expected to change at all, so they will not be re-loaded. Everything else will be reads for analytic purposes.

The PRIMARY filegroup holds the system tables, while the SECONDARY default filegroup holds everything else. The reason for doing this is explained in the link referenced at the bottom of this question.

Separate filegroups will be created for the table partitions. Other tables within the database are small enough to reside within the SECONDARY filegroup. I am only partitioning two tables: one exceeds 100 million records (partitioned by IDENTITY row number) and the other will grow into the billions of records (partitioned by time, monthly).

I plan on partitioning by month over 3 years, so there will be 36 partitions. I will create a filegroup for each year and then place 12 files into each yearly filegroup. The partition strategy is meant to reduce read times, as there will be a lot of data scanning for analytic purposes. The yearly filegroup strategy is strictly for ease of maintenance, so the DBAs can remove a year's worth of data by removing a single filegroup; a T-SQL sketch of this layout follows.
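Here is a minimal, runnable sketch of that layout. The names (LogDb, dbo.EventLog, pfLogMonth, psLogMonth) and paths are hypothetical, and it is shortened to four monthly boundaries; the production version would list 36 boundaries and place 12 files in each yearly filegroup:

```sql
-- Yearly filegroups, each of which will hold 12 monthly partitions.
ALTER DATABASE LogDb ADD FILEGROUP FG_2018;
ALTER DATABASE LogDb ADD FILEGROUP FG_2019;

-- Each filegroup needs at least one file before tables can allocate in it.
ALTER DATABASE LogDb ADD FILE
    (NAME = 'LogDb_2018_01', FILENAME = 'D:\Data\LogDb_2018_01.ndf')
TO FILEGROUP FG_2018;

ALTER DATABASE LogDb ADD FILE
    (NAME = 'LogDb_2019_01', FILENAME = 'D:\Data\LogDb_2019_01.ndf')
TO FILEGROUP FG_2019;

-- RANGE RIGHT: each boundary value opens the partition to its right,
-- so '2018-02-01' starts the February 2018 partition.
CREATE PARTITION FUNCTION pfLogMonth (datetime2(0))
AS RANGE RIGHT
FOR VALUES ('2018-01-01', '2018-02-01', '2018-03-01', '2019-01-01');

-- Four boundaries yield five partitions; anything before 2018 lands on
-- PRIMARY here, and each year's partitions map to that year's filegroup.
CREATE PARTITION SCHEME psLogMonth
AS PARTITION pfLogMonth
TO ([PRIMARY], FG_2018, FG_2018, FG_2018, FG_2019);

-- The billions-of-rows log table is then created on the scheme.
CREATE TABLE dbo.EventLog
(
    LogTime datetime2(0) NOT NULL,
    Message nvarchar(4000) NOT NULL
) ON psLogMonth (LogTime);
```

One caveat on the yearly purge: a filegroup can only be removed after its partitions are empty, so the DBAs would first switch or truncate the 12 partitions for that year and merge the boundaries, then drop the files and the filegroup.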

Reference:

SQL Server Best Practices Setting a Default File Group

Best Answer

The cost-based optimizer does not (currently) change its estimated IO costs based on your hardware; its cost model uses fixed internal constants for sequential and random I/O, so it treats an SSD-backed SAN the same as spinning disks.

You say this question is geared towards answering what filegroup and partitioning strategy is best for your environment. In that case, who cares what the optimizer does? The question is what works well in your environment, and the real answer is to test it.
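A simple way to run such a test is to execute a representative workload query against each candidate layout and compare the I/O and timing statistics. The table and predicate below are placeholders for a real query:

```sql
-- Report logical reads and elapsed time for whatever runs next.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder workload query; substitute a real analytic query.
SELECT COUNT(*)
FROM dbo.EventLog
WHERE LogTime >= '2018-01-01'
  AND LogTime <  '2018-02-01';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```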