Sql-server – Subquery Applying Incorrect Predicate

execution-planperformancequery-performancesql serversql-server-2016

I have a query which contains a subquery. When executed on a local copy of my database it runs as normal and executes within < 1 second.

I've noticed however that if I run this same query on our server the inner query behaves like a correlated subquery, taking around 90 seconds, and the predicate applied to a clustered index scan for data which relates to the subquery appears to be incorrect – it contains the alias used in the outer query, even though all aliases used in the inner query are different, and instead of 1 execution for the index scan it shows hundreds.

I've noticed that this behaviour only applies when the query executes serially – if I lower the cost threshold for parallelism on the server and it goes parallel then this behaviour does not appear and the query runs correctly. Notably, the query runs correctly in my local environment whether serial or parallel, and the predicate applied there looks fine.

Both the local environment and the server are running SQL Server 2016 SP1 CU7 (Enterprise on the server, Developer locally), and I've looked at the different SET options and verified that they are using the same compatibility level (SQL 2012).

Is there anything that would explain the difference of behaviour here?

The problem is not that the plan is different, it's that it's not a correlated subquery and is being executed as one – the inner query is incorrectly referencing the outer query when the query executes serially, but not when it goes parallel on the same server.

The server is set to 2012 compatibility mode because when we upgraded, the new cardinality estimator caused some problems with a few queries and we haven't gotten around to fixing those yet so have remained on 2012 compatibility mode to use the legacy cardinality estimator.

All 4 plans can be viewed here.

Looking at the query and then at the clustered index scan on the deposit payment table, this shouldn't be using the predicate it is using as it references the outer query, which unless I'm misunderstanding, is incorrect. It also shouldn't be executing that operator the 200+ times that it is.

On the same server, if I lower the cost threshold for parallelism so that it goes parallel, it applies a different predicate to that operator where it does not reference the outer table and does not behave like a correlated subquery. This is the same behaviour I see if I run the query locally (either running serially or parallel).

Best Answer

The big difference in the serial plan that I can see is this expensive TOP operator that has been introduced by the optimizer:

enter image description here

The TOP is probably there due to row goals, there is a 4-part series on this at SQLPerformance.com:

(Part 1 and maybe part 2 are relevant for this query.)

I can't confirm for sure in your case because you're using 2016 and the showplan properties that expose row goal information weren't added until 2017 CU3 (and were added in 2016 SP2, just released this week). You can see them if you show the output from running the original, serial query on the server adding the following (taken from part 1 of the series above):

OPTION (QUERYTRACEON 3604, QUERYTRACEON 8607, QUERYTRACEON 8612);

Estimates are different all over the place, so that tells me that data and/or stats are different, but probably mostly stats since local and server both give the same actual rows.

I suggest a rewrite, changing IN to EXISTS, simplifying the subquery (a lot of redundant work there), and removing the GROUP BY (which is unnecessary busy work, and may be the culprit here):

SELECT p.id_prov,
       p.name,
       ps.val
FROM dbo.provider_payment pp
INNER JOIN dbo.provider p 
  ON p.id_prov = pp.id_prov
INNER JOIN dbo.payment_settings ps 
  ON ps.id_prov_payment = pp.id_prov_payment
WHERE pp.payment_method = 'PROTX'
  AND ps.setting = 'VendorName'
  AND p.is_deleted = 0
  AND p.status = 'Y'
  AND EXISTS
  (
    SELECT 1
      FROM dbo.deposit_payment dp
      WHERE dp.id_prov = p.id_prov
        AND dp.payment_method = pp.payment_method
        AND dp.create_date > '2018-01-01'
  );

Updating stats can't hurt, but let us know if the rewrite helps. You could also try hinting the newer cardinality estimator using OPTION (QUERYTRACEON 2312); and in case you are hit by anything that has since been fixed in the optimizer you can also try 4199.

Leaving the local variance out, the two plans on the server have the same predicate on the scan of deposit_payment, except that the parallel plan uses a probe for id_prov. So I really don't think the predicate has anything to do with any performance differences:

enter image description here

The only difference between server and local is that the local serial plan handles the id_prov filter later, in a hash match. Exercise to the reader to determine if that's better or worse. :-)

I would focus on writing simpler queries without unnecessary and/or redundant logic (and perhaps introducing a supporting index on deposit_payment to help this type of query without a scan/predicate) rather than trying to discover bugs. :-)