Sql-server – Effect of where statement in subquery (to get rid of unnecessary join)

sql serversubquerywhere

Please consider the following query:

select * 
from parent p
where p.id in (
  select p.id
  from parent p
  inner join child c on c.parentId = p.id
  where p.age > 50
)

Is it safe (in general!) to rewrite the query to this, in order to save a join? The subquery contains the same alias as the outer query (not a typo).

select * 
from parent p
where p.id in (
  select p.id
  from child c 
  where c.parentId = p.id 
  and p.age > 50
)

My concerns are that the subquery's where-statement is going to be applied to the outer query's result. Is that reasoning correct?

Please note that the query above is just a simple example and the real query is big, where the sub-select is somewhere in some or-condition and contains several joins.

Best Answer

Perhaps it is easier to see with the aid of some rewrites. Your first query can be rewritten to (using the same alias at different levels makes it a bit difficult to understand):

select * 
from parent p1
where exists (
  select 1  
  from parent p2
  join child c on c.parentId = p2.id
  where p2.age > 50
    and p1.id = p2.id 
)

Assuming id is unique in parent this can be rewritten as:

select * 
from parent p1
where exists (
  select 1  
  from child c 
  where c.parentId = p1.id
    and p1.age > 50
)

which in turn can be rewritten to your query.

So you can eliminate the join against parent inside the sub-select. However, your second query is a bit difficult to understand as it is written. I would suggest (I'm guessing your intention)

select * 
from parent p1
where exists (
  select 1  
  from child c 
  where c.parentId = p1.id
)
and p1.age > 50

If parent.id is not unique the above wont hold