What can cause higher CPU time and duration for a given set of queries in trace(s) run in two separate environments?

cpu performance time

I'm troubleshooting a performance issue in a SQL Server DR environment for a customer. They are running queries that consistently take longer in their environment than in our QA environment. We analyzed traces captured in both environments with the same parameters/filters, the same version of SQL Server (2016 SP2), and the exact same database. Both environments picked the same execution plan(s) for the queries in question, and the number of reads/writes was close in both. However, the total duration of the process in question and the CPU time logged in the trace were significantly higher in the customer environment. Total duration of all processes was around 18 seconds in our QA environment and over 80 seconds in the customer's; CPU time was close to 10 seconds in ours and also over 80 seconds in theirs. Also worth mentioning: both environments are currently configured with MAXDOP 1.
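To make those figures easier to compare, here is a rough sketch of the arithmetic in Python. It assumes that for a serial (MAXDOP 1) workload, elapsed time is approximately CPU time plus time spent waiting, and it treats the "over 80 seconds" values as exactly 80, so the results are approximations only. The variable and function names are purely illustrative.

```python
# Figures quoted above. The customer values are reported as "over 80 s",
# so 80.0 is an ASSUMED lower bound, not an exact measurement.
qa = {"duration_s": 18.0, "cpu_s": 10.0}
customer = {"duration_s": 80.0, "cpu_s": 80.0}


def summarize(run):
    """Split elapsed time into CPU share and the non-CPU remainder.

    Assumes a serial plan, where elapsed time is roughly CPU time plus waits.
    """
    non_cpu_s = max(run["duration_s"] - run["cpu_s"], 0.0)
    cpu_share = run["cpu_s"] / run["duration_s"]
    return {"non_cpu_s": non_cpu_s, "cpu_share": cpu_share}


qa_summary = summarize(qa)
customer_summary = summarize(customer)

# How much worse the customer environment is on each metric.
cpu_ratio = customer["cpu_s"] / qa["cpu_s"]
duration_ratio = customer["duration_s"] / qa["duration_s"]

print(f"QA:       {qa_summary['cpu_share']:.0%} CPU, "
      f"{qa_summary['non_cpu_s']:.0f} s not on CPU")
print(f"Customer: {customer_summary['cpu_share']:.0%} CPU, "
      f"{customer_summary['non_cpu_s']:.0f} s not on CPU")
print(f"CPU time ratio: {cpu_ratio:.1f}x, duration ratio: {duration_ratio:.1f}x")
```

Under these assumptions, CPU time differs by roughly 8x while duration differs by roughly 4.4x, and nearly all of the customer's elapsed time is CPU time rather than waiting.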

The customer has less memory (~100GB vs 120GB) and slower disks (10k HDD vs SSD) than our QA environment, but more CPUs. Both environments are dedicated to this activity and should have little/no external load that wouldn't match. I don't have all the details on the CPU architecture they are using; I'm waiting for some of that information now. The customer has confirmed they have excluded SQL Server and the data/log files from their virus scanning. Obviously there could be a ton of issues in the hardware configuration.

I'm currently waiting to see a recent snapshot of their wait stats and system DMVs. The data we originally received didn't appear to show any major CPU, memory, or disk latency pressure. I recently asked them to check whether the Windows power setting is in performance or balanced mode; however, I'm not certain that CPU throttling would have the impact we're seeing.

My question is: what factors can affect CPU time and ultimately total duration? Is CPU time, as shown in a SQL trace, based primarily on the speed of the processors, or are there other factors I should be taking into consideration? The fact that both environments generate the same query plans, with all other things being as close as possible to equal, makes me think it's related to the hardware SQL Server is installed on.

Best Answer

The fact that production has more CPUs does not matter when you use MAXDOP 1, because each query runs serially, on one core at a time. Then only the per-core clock speed matters. Servers with many cores often run at lower clock speeds, in the 1.x GHz range. A higher clock speed on your QA server could explain some of the difference.
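As a back-of-the-envelope check on how much clock speed could account for, here is a small Python sketch. It assumes a first-order model in which CPU time scales inversely with clock frequency, which ignores differences in CPU generation, cache, turbo boost, and power-plan throttling. The two clock speeds are hypothetical placeholders, since the actual hardware details are not known yet; only the CPU times come from the question.

```python
def expected_cpu_seconds(baseline_cpu_s, baseline_ghz, target_ghz):
    """First-order estimate: CPU time scales inversely with clock frequency."""
    if baseline_ghz <= 0 or target_ghz <= 0:
        raise ValueError("clock frequencies must be positive")
    return baseline_cpu_s * baseline_ghz / target_ghz


QA_CPU_S = 10.0                 # from the question ("close to 10 seconds")
OBSERVED_CUSTOMER_CPU_S = 80.0  # from the question ("over 80 seconds")
QA_GHZ = 3.0                    # HYPOTHETICAL: replace with the real QA clock speed
CUSTOMER_GHZ = 1.5              # HYPOTHETICAL: replace with the real customer clock speed

estimate = expected_cpu_seconds(QA_CPU_S, QA_GHZ, CUSTOMER_GHZ)
unexplained_factor = OBSERVED_CUSTOMER_CPU_S / estimate

print(f"Estimated customer CPU time from clock speed alone: {estimate:.0f} s")
print(f"Remaining gap versus observed: {unexplained_factor:.1f}x")
```

With these placeholder values, a 2x clock difference would account for about 20 seconds of CPU time, leaving a further 4x gap to explain. Once the real clock speeds are known, plugging them in shows how much of the difference is left for other causes, such as the power plan.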

If the database cannot fit into memory, then the difference in disk speed alone could explain the difference in duration.

May I recommend that you install the First Responder Kit from Brent Ozar?
https://www.brentozar.com/first-aid/
This will alert you to the most obvious problems.