Sql-server – comparing left join and outer apply doing the same thing

join;sql serversql-server-2012t-sqltrigger

I have an update trigger on a table that that insert rows in another database/table

the query look like

insert into otherdb.table
select manyfield
from inserted
  left join (select id, 
                    timestamp, 
                    row_number() over (partition by id order by timestamp desc) as rownum
             from sometable) as t1
      on inserted.id = t1.id and
         rownum = 1

sometable have over 7 millions rows and have proper indexing

while doing execution plan i can see that if there is 1 updated row, that windowing fuction (proper name used?) retrieve the whole table do to the rows numbering and get executed once, it is using index scan

same(it will retrieve once the 7 millions rows) if inserted have 20 rows or inserted have 50,000 rows

now the usage of that trigger is mostly 1 row update (we do have some odd case where it will be 20-100 rows and even thousands)

I refactored this query to look like

insert into otherdb.table
select manyfield
from inserted
  outer apply join (select top 1 id, 
                    timestamp
                    from sometable
                    where inserted.id = sometable.id
                    order by timestamp desc) as t1

same result at the end but the execution plan change.

now it is using index seek and it will return 1 row per inserted row

when i do a huge update of 50,000 rows, the execution plan will tell me number of execution 50,000 instead of one when doing the left join version.

to resume everything;

left join: 1 big operation being reused by all inserted row

outer apply: 1 small operation being executed once per inserted row

my question is, at this point i don't know enough about execution plan to decide which kind of join i should keep, left join or outer apply?

this is being used by 500-1000 users at the same time and we have timeout error maybe related to this trigger, we currently use the left join query

EDIT

result of some actual execution plan, select on 55017 rows

left join:

estimated subtree cost: ~78
memory grant: ~156k
estimated number of rows: ~57k

outer apply:

estimated subtree cost: ~99
memory grant: ~163k
estimated number of rows: ~55k

hybrid(solution of Paparazzi but using innner join, for left join see above):

estimated subtree cost: ~87
memory grant: ~170k
estimated number of rows: ~56k


result of some actual execution plan, update on 55017 rows (trigger)

left join:

estimated subtree cost: ~401
memory grant: ~455k
estimated number of rows: ~225k

outer apply:

estimated subtree cost: ~136
memory grant: ~153k
estimated number of rows: ~52k

hybrid(solution of Paparazzi but using innner join, for left join see above):

estimated subtree cost: ~126
memory grant: ~156k
estimated number of rows: ~53k

Best Answer

try this - you might get the best of both

select manyfield 
from  ( select manyfield
             , row_number() over (partition by t1.id order by t1.timestamp desc) as rownum 
          from inserted 
          left join sometable t1 
                 on inserted.id = t1.id
      ) tt
where rownum = 1 or t1.id is null