PostgreSQL – Why does PostgreSQL not use an index when OR is used in the conditions?

index performance postgresql query-performance

DB structure summarized:

the main table is cases (about 136k rows)
each case can have 0 – n referencing rows in the table case_contacts
each case contact references a main contact in the table contacts
a case contact may also reference a secondary subcontact, also in the table contacts
contacts have their names in contacts.v_fullname, which is indexed with a trigram index

The goal is to find cases where the name of a contact or subcontact contains the string "test":

SELECT  c.id,
        c.number
  FROM  cases c
        JOIN case_contacts caco ON caco.case_id = c.id
        JOIN contacts con_main ON con_main.id = caco.contact_id
        LEFT JOIN contacts con_sub ON con_sub.id = caco.subcontact_id
 WHERE  con_main.v_fullname ILIKE '%test%'
        OR con_sub.v_fullname ILIKE '%test%'

This query returns the correct result, but does not use the trigram index. It takes around 330ms.

Removing either of the match conditions, or having both of them point at the same table, removes the performance problem. Both of these variants use the trigram index and are executed in under 1ms, but they do not solve the given task.

How can I get PostgreSQL to use my index?

I have simplified this example to the minimum necessary to demonstrate the effect. The actual query is much more complex (and partially auto-generated), so using a UNION of two queries with only one text match each would be very hard, if it's even possible.

I'm using PostgreSQL 9.5.5.
The schema is still open for modifications (to some degree).

As requested, more information about the indexes:

dbname=# \di+ *contacts*
                                         List of relations
 Schema |               Name               | Type  | Owner |     Table     |  Size
--------+----------------------------------+-------+-------+---------------+---------
 public | case_contacts_case_id_idx        | index | x     | case_contacts | 4544 kB
 public | case_contacts_contact_id_idx     | index | x     | case_contacts | 4544 kB
 public | case_contacts_id_case_id_idx     | index | x     | case_contacts | 4544 kB
 public | case_contacts_idx                | index | x     | case_contacts | 9608 kB
 public | case_contacts_pkey               | index | x     | case_contacts | 4544 kB
 public | case_contacts_reference_trgm_idx | index | x     | case_contacts | 4960 kB
 public | case_contacts_subcontact_id_idx  | index | x     | case_contacts | 4544 kB
 public | case_contacts_type_idx           | index | x     | case_contacts | 6208 kB
 public | case_contacts_unique_types_idx   | index | x     | case_contacts | 5464 kB
 public | contacts_parent_id_id_idx        | index | x     | contacts      | 456 kB
 public | contacts_parent_id_idx           | index | x     | contacts      | 360 kB
 public | contacts_pkey                    | index | x     | contacts      | 360 kB
 public | contacts_v_fullname_trgm_idx     | index | x     | contacts      | 1560 kB
(13 rows)

This is how the index on contacts.v_fullname is created:

CREATE INDEX contacts_v_fullname_trgm_idx ON contacts USING GIN (v_fullname gin_trgm_ops);

Best Answer

I can't say why PostgreSQL behaves this way, but I have found a way to make it do more or less what I think you want. I tested your situation with a simplified simulated scenario on PostgreSQL 9.6.1 (the latest version as of this writing) and got the same results.
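For anyone who wants to reproduce this, a simulated scenario along the following lines should be enough. This is only a sketch, not the exact script used for the tests described here: the column types, the row count for contacts, the md5-based random names and the roughly 1% share of names containing 'test' are all assumptions. Only the table names, the column names used by the query and the index definition come from the question.

```sql
-- Assumed, simplified schema: only the columns the query touches are included.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE contacts (
    id         integer PRIMARY KEY,
    v_fullname text NOT NULL
);

CREATE TABLE cases (
    id     integer PRIMARY KEY,
    number text NOT NULL
);

CREATE TABLE case_contacts (
    id            serial PRIMARY KEY,
    case_id       integer NOT NULL REFERENCES cases (id),
    contact_id    integer NOT NULL REFERENCES contacts (id),
    subcontact_id integer REFERENCES contacts (id)
);

-- 10,000 contacts with random hexadecimal names; every 100th name gets
-- ' test' appended, so about 1% of the rows match '%test%' (assumed ratio).
INSERT INTO contacts (id, v_fullname)
SELECT i,
       md5(random()::text) || CASE WHEN i % 100 = 0 THEN ' test' ELSE '' END
FROM generate_series(1, 10000) AS s(i);

-- 136,000 cases, matching the row count given in the question.
INSERT INTO cases (id, number)
SELECT i, 'C-' || i
FROM generate_series(1, 136000) AS s(i);

-- One case contact per case; about half of them also get a subcontact.
INSERT INTO case_contacts (case_id, contact_id, subcontact_id)
SELECT i,
       1 + (random() * 9999)::integer,
       CASE WHEN random() < 0.5 THEN 1 + (random() * 9999)::integer END
FROM generate_series(1, 136000) AS s(i);

CREATE INDEX case_contacts_case_id_idx ON case_contacts (case_id);
CREATE INDEX case_contacts_contact_id_idx ON case_contacts (contact_id);
CREATE INDEX case_contacts_subcontact_id_idx ON case_contacts (subcontact_id);
CREATE INDEX contacts_v_fullname_trgm_idx ON contacts USING GIN (v_fullname gin_trgm_ops);

-- Refresh planner statistics after the bulk load.
ANALYZE contacts;
ANALYZE cases;
ANALYZE case_contacts;
```

Changing the modulus in the first INSERT is an easy way to try other match percentages.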

The good news is that if you can change the way you write your query, there are a couple of options that use the trigram index.

The first option consists of moving the condition on the subcontacts into a subquery. In this case, the trigram index is used for one of the two conditions (but not the other):

SELECT
    c.id, c.number
FROM  
    cases c
    JOIN case_contacts caco ON caco.case_id = c.id
    JOIN contacts con_main ON con_main.id = caco.contact_id
    LEFT JOIN 
    (
        SELECT
            * 
        FROM
            contacts  
        WHERE
            v_fullname ilike '%test%' 
    ) AS con_sub ON con_sub.id = caco.subcontact_id
WHERE  
    con_main.v_fullname ILIKE '%test%'
    or con_sub.id is not null /* if the left join gave an answer, it's got '%test%' */ ;

A few trials with simulated data (where approximately 0.1%, 2.5%, 5% or 25% of the v_fullname values contain 'test') show that the difference in execution times is minuscule. (My disk is an SSD; a spinning hard disk might behave very differently.) This should really be checked on a real system with real data, but it seems that whether or not the trigram index is used doesn't make a big difference.

PostgreSQL is not exceptionally good at estimating how many rows a search such as ILIKE '%test%' will return, but that does not seem to affect which plan it decides to use.

There is another option which, in my limited experimentation, works a little faster in most cases, and a lot faster when the percentage of names containing 'test' is low. This option uses a CTE to "prefilter" the contacts (and it uses the trigram index only once, because it doesn't need to use it twice):
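One way to check both points (whether the trigram index is used at all, and how far off the row estimate is) is to look at the executed plan. A minimal sketch, assuming the table and index names from the question:

```sql
-- Compare the planner's estimated row count with the actual one.
-- In the output, look for "Bitmap Index Scan on contacts_v_fullname_trgm_idx"
-- versus "Seq Scan on contacts", and compare the estimated "rows=" value
-- with the "rows=" value inside "(actual ...)".
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM contacts
WHERE v_fullname ILIKE '%test%';

-- For experimentation only: discourage sequential scans in this session
-- to see whether the planner is able to use the index for the full query.
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.id, c.number
FROM cases c
    JOIN case_contacts caco ON caco.case_id = c.id
    JOIN contacts con_main ON con_main.id = caco.contact_id
    LEFT JOIN contacts con_sub ON con_sub.id = caco.subcontact_id
WHERE con_main.v_fullname ILIKE '%test%'
    OR con_sub.v_fullname ILIKE '%test%';

RESET enable_seqscan;
```

Note that enable_seqscan = off only makes sequential scans look very expensive to the planner; it is a diagnostic aid, not something to leave switched off in production.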

WITH filtered_contacts AS
(
SELECT
    *
FROM
    contacts
WHERE
    v_fullname ilike '%test%'
)
SELECT
    c.id, c.number
FROM  
    cases c
    JOIN case_contacts caco ON caco.case_id = c.id
    LEFT JOIN filtered_contacts con_main ON con_main.id = caco.contact_id
    LEFT JOIN filtered_contacts con_sub ON con_sub.id = caco.subcontact_id
WHERE
    /* both joins must be LEFT JOINs, or cases where only the subcontact matches are lost */
    con_main.id IS NOT NULL
    OR con_sub.id IS NOT NULL /* a non-null id means the contact passed the '%test%' filter */ ;

Related Solutions

SQL Server – Index not making execution faster, and in some cases is slowing down the query. Why is it so?

Even though the index is suggested by the SQL Server, why does it slow things down by a significant difference?

Index suggestions are made by the query optimizer. If it comes across a logical selection from a table which is not well served by an existing index, it may add a "missing index" suggestion to its output. These suggestions are opportunistic; they are not based on a full analysis of the query, and do not take account of wider considerations. At best, they are an indication that more helpful indexing may be possible, and a skilled DBA should take a look.

The other thing to say about missing index suggestions is that they are based on the optimizer's costing model, and the optimizer estimates by how much the suggested index might reduce the estimated cost of the query. The key words here are "model" and "estimates". The query optimizer knows little about your hardware configuration or other system configuration options - its model is largely based on fixed numbers that happen to produce reasonable plan outcomes for most people on most systems most of the time. Aside from issues with the exact cost numbers used, the results are always estimates - and estimates can be wrong.

What is the Nested Loop join which is taking most of the time and how to improve its execution time?

There is little to be done to improve the performance of the cross join operation itself; nested loops is the only physical implementation possible for a cross join. The table spool on the inner side of the join is an optimization to avoid rescanning the inner side for each outer row. Whether this is a useful performance optimization depends on various factors, but in my tests the query is better off without it. Again, this is a consequence of using a cost model - my CPU and memory system likely has different performance characteristics than yours. There is no specific query hint to avoid the table spool, but there is an undocumented trace flag (8690) that you can use to test execution performance with and without the spool. If this were a real production system problem, the plan without the spool could be forced using a plan guide based on the plan produced with TF 8690 enabled. Using undocumented trace flags in production is not advised because the installation becomes technically unsupported and trace flags can have undesirable side-effects.

Is there something that I am doing wrong or have missed?

The main thing you are missing is that although the plan using the nonclustered index has a lower estimated cost according to the optimizer's model, it has a significant execution-time problem. If you look at the distribution of rows across threads in the plan using the Clustered Index, you will likely see a reasonably good distribution:

[Image: scan plan]

In the plan using the Nonclustered Index Seek, the work ends up being performed entirely by one thread:

[Image: seek plan]

This is a consequence of the way work is distributed among threads by parallel scan/seek operations. It is not always the case that a parallel scan will distribute work better than an index seek - but it does in this case. More complex plans might include repartitioning exchanges to redistribute work across threads. This plan has no such exchanges, so once rows are assigned to a thread, all related work is performed on that same thread. If you look at the work distribution for the other operators in the execution plan, you will see that all work is performed by the same thread as shown for the index seek.

There are no query hints to affect row distribution among threads; the important thing is to be aware of the possibility and to be able to read enough detail in the execution plan to determine when it is causing a problem.

With the default index (on primary key only) why does it take less time, and with the non clustered index present, for each row in the joining table, the joined table row should be found quicker, because join is on Name column on which the index has been created. This is reflected in the query execution plan and Index Seek cost is less when IndexA is active, but why still slower? Also what is in the Nested Loop left outer join that is causing the slowdown?

It should now be clear that the nonclustered index plan is potentially more efficient, as you would expect; it is just poor distribution of work across threads at execution time that accounts for the performance issue.

For the sake of completing the example and illustrating some of the things I have mentioned, one way to get a better work distribution is to use a temporary table to drive parallel execution:

SELECT
    val1,
    val2
INTO #Temp
FROM dbo.IndexTestTable AS ITT
WHERE Name = N'Name1';

SELECT 
    N'Name1',
    SUM(T.val1),
    SUM(T.val2),
    MIN(I2.Name),
    SUM(I2.val1),
    SUM(I2.val2)
FROM #Temp AS T
CROSS JOIN dbo.IndexTestTable AS I2
WHERE
    I2.Name = N'Name1'
OPTION (FORCE ORDER, QUERYTRACEON 8690);

DROP TABLE #Temp;

This results in a plan that uses the more efficient index seeks, does not feature a table spool, and distributes work across threads well:

[Image: optimal plan]

On my system, this plan executes significantly faster than the Clustered Index Scan version.

If you're interested in learning more about the internals of parallel query execution, you might like to watch my PASS Summit 2013 session recording.

PostgreSQL – How to speed up a Postgres query containing lots of joins with an ILIKE condition

I see a couple of issues.

The biggest one is that PostgreSQL is using a sequential scan on A when filtering A. I think you need a composite index on A.flag and A.strvalue. If such an index already exists, PostgreSQL is choosing not to use it for some reason. This scan seems to account for 92% of your cost estimate and is likely what's making the query run for so long.
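A composite index of that kind could be created roughly as follows. This is a sketch only: the table name a and the columns flag and strvalue are taken from the plan discussed here, while the index name is made up and the column order is an assumption that should be checked against the actual filter conditions.

```sql
-- Hypothetical index name; columns ordered with the equality-tested
-- column first (assumed to be flag).
CREATE INDEX a_flag_strvalue_idx ON a (flag, strvalue);

-- Refresh statistics so the planner can take the new index into account.
ANALYZE a;
```

Re-running EXPLAIN on the original query afterwards shows whether the sequential scan on a has been replaced by an index scan.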

As for the ILIKE, PostgreSQL cannot natively use an index when the pattern starts with a wildcard (but see below for a module that can). That is a restriction of ordinary B-tree indexes, which in addition can only serve ILIKE when the pattern starts with non-alphabetic characters, because the match is case-insensitive. For that reason you are getting a sequential scan, which means every single row is loaded and the C.name column is scanned for the characters. One thing that's odd, though, is that the ILIKE sequential scan doesn't seem to account for much of the cost estimate in this query plan. Anyway, if it is the ILIKE operator causing the slowdown, I would consider rewriting your query so that the pattern is anchored at the start, for example lower(name) LIKE 'value%' backed by a suitable expression index on lower(name), or else consider using PostgreSQL's full text search.

UPDATED

The ILIKE operator can use a trigram index. Superb!
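The module in question is pg_trgm, which has supported index searches for LIKE and ILIKE since PostgreSQL 9.1. A minimal sketch, assuming a table c with a text column name as in the plan discussed above; the index name is made up:

```sql
-- pg_trgm ships with PostgreSQL's contrib modules.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; a GiST index with gist_trgm_ops is the alternative.
CREATE INDEX c_name_trgm_idx ON c USING gin (name gin_trgm_ops);

-- With the index in place, a pattern with a leading wildcard can be
-- answered through a bitmap index scan instead of a sequential scan.
EXPLAIN
SELECT *
FROM c
WHERE name ILIKE '%value%';
```

Patterns from which no trigram can be extracted (for example a search for only one or two characters) degenerate to a scan of the whole index, so the index helps most with longer search strings.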
