I'm trying to join a few rows from a remote view to a local table. The view has about 300 million rows, so I want to use the REMOTE query hint so that all 3 million rows don't have to be transferred to my computer.
SELECT R.Something, L.ID, L.Something
FROM [dbo].[LocalTable] L
INNER JOIN (
SELECT TOP 100 Something, L_ID FROM [RemoteServer].[RemoteDB].[dbo].[RemoteTable]
) R
ON L.ID = R.L_ID
This returns 100 rows, as I expected, and takes basically no time, as I expected.
However,
SELECT R.Something, L.ID, L.Something
FROM [dbo].[LocalTable] L
INNER REMOTE JOIN (
SELECT TOP 100 Something, L_ID FROM [RemoteServer].[RemoteDB].[dbo].[RemoteTable]
) R
ON L.ID = R.L_ID
starts to return thousands of rows. I quit it after a few seconds, but it was in the tens – hundreds of thousands.
How could a query hint change my result set?
Best Answer
TOP 100
with noORDER BY
means it is undeterministic which 100 rows from the remote table end up participating in the join. This is execution plan dependant and can vary.If it is a one to many relationship it may be the case that one batch of 100 rows has more matches on the other side of the join than another different batch of 100 rows.
You should specify an
ORDER BY
(inside the derived table) on some unique column or combination of columns to ensure deterministic results.