Sql-server – Workload group “Importance” vs. Resource pool IOPS throttling

resource-governorsql serversql server 2014

Our SQL 2014 server has a blazing-fast tempdb drive (2800 IOPS) but a much slower data drive (500 IOPS). Our application runs a few long-running reporting queries that are I/O-intensive and we'd like to avoid them starving our server for I/O capacity when they run.

Ideally we'd be able to limit these queries to 50% of available I/O capacity. Unfortunately SQL Server Resource Pool's IOPS throttling is not percentage-baed nor volume-specific. If I limit to 250 IOPS, then it will unnecessarily slow down performance of queries that make heavy demands on tempdb. Slowing down these long-running queries if the server is busy is OK, but slowing them down by 10x+ if they need lots of tempdb access is not OK.

So we're looking for workarounds that will defend other queries from these lower-priority, long-running queries, but without unnecessarily hurting performance of these long-running queries if they happen to use lots of tempdb. It's not practical to change the queries themselves to reduce tempDB usage– these queries are generated by a custom reporting feature that may sometimes generate really complex query plans that spill results to tempdb.

So far the best idea I have is to remove IOPS throttling and instead use the "Importance" of a workload group to defend the rest of the server's I/O capacity from these queries. Is this a good solution to the problem I'm trying to solve? What are the pros and cons of using Importance?

Or is there a better way to achieve our goals?

Best Answer

Ideally we'd be able to limit these queries to 50% of available I/O capacity. Unfortunately SQL Server Resource Pool's IOPS throttling is not percentage-baed nor volume-specific.

Correct. Not to delve too deep into this since it isn't configurable - but let's take a quick second and think about that sentence and what you're really talking about here.

How do you know what your IO capacity is? Let me take your information:

Our SQL 2014 server has a blazing-fast tempdb drive (2800 IOPS) but a much slower data drive (500 IOPS).

Is that 2800 IOPs @ 4k? 8k? 256k? Is that at outstanding IO of 1? 4? 8? 24? 128? Should SQL Server now monitor the entire server for everything IO related? What if it's in a hyper-visor and can't "see" the base infrastructure... now what? Is that 2800 IOPs with 99% IOs within 10 ms? 20 ms? 100 ms?

This isn't as easy as saying "50%" as there are too many variables of what 50% means... Not nit picking, just not as easy of a subject as one may think.

So far the best idea I have is to remove IOPS throttling and instead use the "Importance" of a workload group to defend the rest of the server's I/O capacity from these queries.

Importance doesn't touch IO, it deal with scheduling. High runs more often than normal which runs more often than low. It doesn't mean it won't eat up your CPU and IO, it just tries to be more fair with scheduling based on your interpretation of the workload (i.e. those marked high are more important). This most likely will not help what you're attempting to solve.

Or is there a better way to achieve our goals?

Yes :) Scale out the reporting workload to something such as a readable secondary in an availability group, transactional replication subscriber, read-only copy of the database (depending on data freshness required), CDC/CT to another server with SSIS, etc. It'll also isolate your storage subsystems (unless it's on the same SAN/NAS/etc.) from each other.