Postgresql – Performance: where before or after join

join;performancepostgresqlpostgresql-performance

I am doing a query like this:

select 
   * 
from 
   table1 join table2 on table1.fieldjoin = table2.fieldjoin
where 
   table1.field = 'y'

I would ask is this query equivalent in terms of performance to this:

select 
    * 
from 
  (select * from table1 where field = 'y') as t1 join table2 
  on table1.fieldjoin = table2.fieldjoin

That is by explicting doing a where inside the joined table

Best Answer

A SQL query is a definition of what data to return. There is a component of the database called the optimizer which determines how to get that data - which indexes to use, if any, in which sequence, which join algorithm to use etc. The output of that component is a query plan, the equivalent of a procedural program in the DBMS's internal execution language. The optimizer is free to choose any plan it likes as long as it is guaranteed to produce the output required by the SQL.

Sometimes queries which are logically equivalent but use different syntax will produce quite different query plans. Sometimes the optimizer recognizes the equivalence and produces the same plan for both statements. It can be difficult to predict which without good understanding of the working of the optimizer.

However, the optimizer does make available to us these plans. In PostgreSQL they can be retrieved using EXPLAIN. If the plans for the two queries are the same there will be no performance difference between them. Different plans will have different performance, though it may be slight.

There are a great many factors which go into constructing a plan other than the SQL. So a change in table cardinality (e.g. as time passes, or between production and development environments), or a different version of Postgres may produce a different plans for the same SQL statement.