PostgreSQL Creating Inefficient Plans in Conditional Joins

execution-plan, join, performance, postgresql, postgresql-9.3, postgresql-performance

Consider these two queries:

SELECT
    t1.id, *
FROM
    t1
INNER JOIN
    t2 ON t1.id = t2.id
    WHERE t1.id > -9223372036513411363;

And:

SELECT
    t1.id, *
FROM
    t1
INNER JOIN
    t2 ON t1.id = t2.id
    WHERE t1.id > -9223372036513411363 AND t2.id > -9223372036513411363;

Note: -9223372036513411363 is not the minimum value in the tables; the condition reduces the result from the total of 350 million rows to about 17 million.

Personally, I would expect PostgreSQL to come up with the same plan for both queries, because t1.id = t2.id automatically implies the second condition. Unfortunately, PostgreSQL creates two different plans, and the plan for the second query is much better.

I would much prefer the first query, since I want to create a view over the join and put the WHERE condition on queries against the view, where only a single id column is visible (I join with USING, so the view exposes a single id column). Also, I will join more than two tables, and I would prefer not to add such a condition for each join.
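To illustrate the intended setup (the view name and the column list below are assumptions; they are not part of the original post):

CREATE VIEW t1_t2 AS
SELECT id, t1.total, t1.price, t2.category
FROM t1
INNER JOIN t2 USING (id);    -- USING merges the two id columns into one

-- The filter would then be written once, against the view's single id column:
SELECT * FROM t1_t2 WHERE id > -9223372036513411363;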

Is there any reason for this behavior? Or is it a bug? Are there any workarounds?

  • Replacing ON t1.id = t2.id with USING (id) makes no difference in either query.
  • This is PostgreSQL 9.3
  • The actual number of returned rows is 17,658,189
  • ANALYZE has been run on the tables. However, the statistics-related settings of PostgreSQL are at their default values.
  • Observation: EXPLAIN for the first query has a good estimate for the final result, but uses a poor plan for scanning t2. For the second query, the estimates of the number of rows from t1 and t2 are good, but the estimate for the final merge is about half the actual number of rows (see the EXPLAIN example after this list).
  • The id column is the primary key in both tables. The tables have around 350,000,000 rows; t1 is around 20 GiB and t2 is around 14 GiB.
  • Replacing INNER JOIN with LEFT OUTER JOIN produces similar results
  • Selecting fewer rows (by increasing the minimum id value in the WHERE condition) doesn't make any difference, until the number of rows becomes too low, at which point a totally different plan is used.
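For reference, a sketch of how the estimated and actual row counts can be compared (EXPLAIN with the ANALYZE option actually runs the query and reports the actual rows; the statement shown is just the first query from above):

EXPLAIN (ANALYZE, BUFFERS)
SELECT
    t1.id, *
FROM
    t1
INNER JOIN
    t2 ON t1.id = t2.id
    WHERE t1.id > -9223372036513411363;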

What I'm trying to achieve

I have a DB with a lot of rows, and new data is being inserted into it continuously. We want to generate different reports from this data, which involves different kinds of queries: searching for specific data, sorting by each column, aggregation queries, and so on.

In the current design, we have no UPDATE operations. Currently, I'm experimenting with a highly normalized design (based on ideas promoted by Anchor modeling and/or 6NF). Such a design uses JOINs and VIEWs extensively to make working with the DB pleasant, and so it needs a database that can execute them efficiently.

As far as I can tell (based on problems like this one), PostgreSQL doesn't seem to be a good fit for this design (with around 11 tables and a number of views) and seems to almost always perform worse than a less normalized design with one or two tables and no views. I was hoping that this problem in planning JOIN queries was my fault, but so far that doesn't seem to be the case. With this problem, it seems that I either have to forget about using VIEWs and write verbose queries with lots of repeated conditions, or forget about using either PostgreSQL or this design.

Tables

The actual tables have a few more columns, but those columns are not related to any other tables and so should be irrelevant to this discussion:

CREATE TABLE t1
(
  id bigint NOT NULL DEFAULT nextval('ids_seq'::regclass),
  total integer NOT NULL,
  price integer NOT NULL,
  CONSTRAINT pk_t1 PRIMARY KEY (id)
);

CREATE TABLE t2
(
  id bigint NOT NULL,
  category smallint NOT NULL,
  CONSTRAINT pk_t2 PRIMARY KEY (id),
  CONSTRAINT fk_id FOREIGN KEY (id)
      REFERENCES t1 (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);
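Note: the DEFAULT on t1.id references a sequence whose definition is not shown above; for the DDL to run as-is, it has to exist already, e.g. (assumed, since the original post does not include it):

CREATE SEQUENCE ids_seq;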

Best Answer

Then it looks like an optimizer's blind spot and you should use the second query.

When there is a condition joining two tables a and b, a.id = b.id, plus an additional condition a.id > @some_constant, it seems that the optimizer uses the "index condition" to decide where to start the index scan on the a (id) index, but it doesn't use it for the second index, b (id).

So, adding the (redundant) b.id > @some_constant allows it to produce a slightly more efficient plan, skipping a part of the b (id) index as well.

This could be posted as a suggestion for improvement (if it hasn't been already) to the Postgres hackers group.
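If repeating the extra condition for every joined table in every query becomes too verbose, one possible workaround is to wrap the join in a set-returning SQL function, so that the redundant predicates are written only once. This is only a sketch; the function name and the column list are assumptions, not something taken from the question:

CREATE FUNCTION t1_t2_after(min_id bigint)
RETURNS TABLE (id bigint, total integer, price integer, category smallint)
LANGUAGE sql STABLE AS $$
    SELECT t1.id, t1.total, t1.price, t2.category
    FROM t1
    INNER JOIN t2 USING (id)
    WHERE t1.id > min_id
      AND t2.id > min_id    -- the redundant condition the planner needs
$$;

-- Usage:
SELECT * FROM t1_t2_after(-9223372036513411363);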


After the edit, we know there is a FOREIGN KEY constraint from t2 that REFERENCES t1. So the "natural" (equivalent) way to write the query would be:

SELECT
    -- whatever
FROM
    t2
  LEFT JOIN
    t1 ON t1.id = t2.id
WHERE t2.id > -9223372036513411363 ;

Can you try this and tell us the execution plan it produces? There are some transformations that apply only to LEFT (outer) joins and not to inner joins.
(The OP reported that, unfortunately, this doesn't produce a different plan either.)


The OP has posted a question on the Postgres performance list; the whole thread can be seen here: PostgreSQL seems to create inefficient plans in simple conditional joins. The reply by David Rowley confirms that this is a feature which, although it has been considered, hasn't yet been implemented in the optimizer:

Yes, unfortunately you've done about the only thing that you can do, and that's just include both conditions in the query. Is there some special reason why you can't just write the t2.id > ... condition in the query too? or is the query generated dynamically by some software that you have no control over?

I'd personally quite like to see improvements in this area, and even wrote a patch 1 which fixes this problem too. The problem I had when proposing the fix for this was that I was unable to report details about how many people are hit by this planner limitation. The patch I proposed caused a very small impact on planning time for many queries, and was thought by many not to apply in enough cases for it to be worth slowing down queries which cannot possibly benefit. Of course I agree with this, I've no interest in slowing down planning on queries, but at the same time understand the annoying poor optimisation in this area.

Although please remember the patch I proposed was merely a first draft proposal. Not for production use.