Postgresql – Postgres: How to optimize time constrained wildcard search

full-text-searchoptimizationperformancepostgresqlpostgresql-performancequery-performance

I'd like to optimize Postgres response times for a query which looks for

select
    orders.id,
    orders.created
from
    orders left join order_items on     order_items.order_id = orders.id
where
    orders.created >= '2011-10-02 23:00:00' 
    and orders.created <= '2017-12-09 22:59:59.999' 
    and orders.total_price >= '2000000'
    and orders.total_price <= '2147483647'
    and order_items.article_name ilike  '%test%' 
order by
    orders.created desc limit '40' offset '0'

In my environment it's typical that the date range is about 2-5 days and the stored orders are up to 2 years ago. 99% of the query time is used for the gist trigram index scan on article_name. The time range currently has negligible performance impact: it doesn't matter if I search for orders within 2 days or 2 years.

The simplified DDL contains a compound index on order (created, id):

CREATE TABLE public.orders (
    id int4 NOT NULL DEFAULT nextval('orders_id_seq'::regclass),
    created timestamp NOT NULL,
    total_price int8 NOT NULL,
    CONSTRAINT orders_pkey PRIMARY KEY (id)
};
create
    unique index order_created_id_idx on
    orders
        using btree(
        created desc,
        id
    ) ;
          (
            total_price is not null
        )
        or(
            total_price <> 0
        )
    ) ;
create
    unique index order_total_price_id_idx_tx_id on
    orders
        using btree(
        total_price desc,
        transaction_id
    )
where
    (
        total_price is not null
    ) ;

CREATE TABLE public.order_items (
    id int4 NOT NULL DEFAULT nextval('order_items_id_seq'::regclass),
    order_id int8 NOT NULL,
    article_name text NULL,
    CONSTRAINT order_items_pkey PRIMARY KEY (id),
    CONSTRAINT order_items_order_fk FOREIGN KEY (order_id) REFERENCES public.orders(id) ON DELETE CASCADE
)

create
    index order_items_article_name_with_like_support_idx on
    order_items
        using btree(
        lower( article_name ) varchar_pattern_ops;
create
    index order_item_article_name_trgm_idx on
    order_items
        using gist(
        article_name gist_trgm_ops
    ) ;

and produces this query plan:

Limit  (cost=12575.53..12575.63 rows=40 width=12) (actual time=5334.697..5334.705 rows=40 loops=1)
  Output: orders.id, orders.created
  Buffers: shared hit=762938 read=7003
  I/O Timings: read=206.069
  ->  Sort  (cost=12575.53..12576.39 rows=342 width=12) (actual time=5334.696..5334.701 rows=40 loops=1)
        Output: orders.id, orders.created
        Sort Key: orders.created DESC
        Sort Method: top-N heapsort  Memory: 26kB
        Buffers: shared hit=762938 read=7003
        I/O Timings: read=206.069
        ->  Nested Loop  (cost=0.85..12564.72 rows=342 width=12) (actual time=54.817..5333.618 rows=1739 loops=1)
              Output: orders.id, orders.created
              Buffers: shared hit=762935 read=7003
              I/O Timings: read=206.069
              ->  Index Scan using order_item_article_name_trgm_idx on public.order_items  (cost=0.42..3907.02 rows=2520 width=8) (actual time=10.983..4908.059 rows=35000 loops=1)
                    Output: order_items.id, order_items.order_id, order_items.article_number, order_items.article_name, order_items.number_of_items, order_items.article_type, order_items.price_per_article, order_items.full_price, order_items.size, order_items.order_position
                    Index Cond: (order_items.article_name ~~* '%test%'::text)
                    Rows Removed by Index Recheck: 42821
                    Buffers: shared hit=622968 read=6719
                    I/O Timings: read=115.587
              ->  Index Scan using orders_pkey on public.orders  (cost=0.43..3.43 rows=1 width=12) (actual time=0.012..0.012 rows=0 loops=35000)
                    Output: orders.id, orders.transaction_id, orders.order_number, orders.created, orders.total_price, orders.order_state, orders.shipment_type, orders.payment_type, orders.payment_status, orders.number_of_installments, orders.customer_id, orders.shipment_address_id, orders.invoiced_date, orders.shipped_from, orders.order_origin, orders.number_of_articles, orders.checkout_time, orders.referrer
                    Index Cond: (orders.id = order_items.order_id)
                    Filter: ((orders.created >= '2011-10-02 23:00:00'::timestamp without time zone) AND (orders.created <= '2017-12-09 22:59:59.999'::timestamp without time zone) AND (orders.total_price >= '2000000'::bigint) AND (orders.total_price <= '2147483647'::bigint))
                    Rows Removed by Filter: 1
                    Buffers: shared hit=139967 read=284
                    I/O Timings: read=90.482
Planning time: 2.112 ms
Execution time: 5334.814 ms

(How) Can I optimize this use case?

Best Answer

Index on o.created - about the only thing. Indices on article_name will not work due to the leading and ending %.