Sql-server – Same query way faster with distinct than without distinct

performancequery-performancesql serversql server 2014t-sql

I have a query that runs very fast with distinct than without distinct. when running both queries, the one with distinct takes 13% execution time, the one without distinct takes 87% execution time.

These are TSQL query in SQL Server 2014 on a database with 110 compatibility mode.

Strange thing is that the query with distinct is using seeks while the query without distinct is using 1 scan. Both query are identical and have the same where clause.

Can you help me understand why the query with the distinct is faster and why the query without distinct is not using a seek ?

Query plans : www.brentozar.com/pastetheplan/?id=ryr18jJsb

Queries:

select 
    SMS_R_SYSTEM.ItemKey
   ,SMS_R_SYSTEM.Name0,SMS_R_SYSTEM.SMS_Unique_Identifier0
   ,SMS_R_SYSTEM.Resource_Domain_OR_Workgr0
   ,SMS_R_SYSTEM.Client0
from vSMS_R_System as SMS_R_SYSTEM
   inner join Add_Remove_Programs_DATA 
      on Add_Remove_Programs_DATA.MachineID = SMS_R_System.ItemKey
   inner join Add_Remove_Programs_64_DATA 
      on Add_Remove_Programs_64_DATA.MachineID = SMS_R_System.ItemKey
where Add_Remove_Programs_DATA.DisplayName00 = 'aze'
   or Add_Remove_Programs_64_DATA.DisplayName00 = 'aze'
;

select distinct 
    SMS_R_SYSTEM.ItemKey
   ,SMS_R_SYSTEM.Name0,SMS_R_SYSTEM.SMS_Unique_Identifier0
   ,SMS_R_SYSTEM.Resource_Domain_OR_Workgr0
   ,SMS_R_SYSTEM.Client0
from vSMS_R_System as SMS_R_SYSTEM
   inner join Add_Remove_Programs_DATA 
      on Add_Remove_Programs_DATA.MachineID = SMS_R_System.ItemKey
   inner join Add_Remove_Programs_64_DATA 
      on Add_Remove_Programs_64_DATA.MachineID = SMS_R_System.ItemKey
where Add_Remove_Programs_DATA.DisplayName00 = 'aze'
   or Add_Remove_Programs_64_DATA.DisplayName00 = 'aze' 
;

Best Answer

From the way I read your execution plans, your DISTINCT query is selecting columns only from vSMS_R_System, so the optimizer chooses to read that data first (retrieving the DISTINCT values used in your SELECT) and then uses a NESTED LOOP to join up against the other tables.

The non-DISTINCT query forces the optimizer to assume that many more rows might need to be read from vSMS_R_System, so it chooses a MERGE JOIN to scan the other tables and then a NESTED LOOP to match up against vSMS_R_System.