Postgresql – adding an index on different fields

indexindex-tuningpostgresql

I have a query like this:

SELECT "btr_orders".*
FROM "btr_orders" 
WHERE "btr_orders"."user_id" = 40000000
  AND (
    premium_until > now()
    OR intermediate_premium_until > now()
  )
  AND (
    active_since IS NULL
    OR active_since <= now()
  )
  AND "btr_orders"."temporary" = false  
ORDER BY "btr_orders"."id" DESC
LIMIT 1;

this query takes around one two seconds to run, and we have at least 10 to 20 clients running this query over and over

the output of the EXPLAIN ANALYZE is this:

Limit  (cost=6192.97..6192.98 rows=1 width=244) (actual time=1086.744..1086.744 rows=0 loops=1)
  ->  Sort  (cost=6192.97..6192.98 rows=1 width=244) (actual time=1086.743..1086.743 rows=0 loops=1)
        Sort Key: id
        Sort Method: quicksort  Memory: 25kB
        ->  Seq Scan on btr_orders  (cost=0.00..6192.96 rows=1 width=244) (actual time=1086.736..1086.736 rows=0 loops=1)
              Filter: ((NOT temporary) AND ((active_since IS NULL) OR (active_since <= '2017-06-13 14:59:29.528395'::timestamp without time zone)) AND (user_id = 40000000) AND ((premium_until > '2017-06-13 14:59:29.527964'::timestamp without time zone) OR (intermediate_premium_until > '2017-06-13 14:59:29.527964'::timestamp without time zone)))
              Rows Removed by Filter: 154734
Total runtime: 1086.789 ms

although the rows filtered is actually 1, this query is really slow, probably because we are doing seq. scan. I tried to create an index for this query:

 create index CONCURRENTLY btr_orders_search_index 
   on btr_orders (temporary, user_id, active_since, premium_until, intermediate_premium_until) 
   where temporary = false;

but of course didn't worked. I have a very limited knowledge on postgres, actually in database administration (I'm a backend developer) and we do not have a DBA.
My question is: how can I create an index that fits this query?

Edit

\d btr_orders

zenguard_api=# \d btr_orders
                                             Table "public.btr_orders"
           Column           |            Type             |                        Modifiers
----------------------------+-----------------------------+---------------------------------------------------------
 id                         | integer                     | not null default nextval('btr_orders_id_seq'::regclass)
 active_since               | timestamp without time zone |
 affiliate_cookies          | hstore                      |
 affiliate_net_amount       | integer                     |
 affiliate_partner_id       | integer                     |
 closed_since               | timestamp without time zone |
 country_code               | character varying(255)      |
 created_at                 | timestamp without time zone |
 discount_id                | integer                     |
 first_charge_at_activation | boolean                     | default false
 intermediate_premium_until | timestamp without time zone |
 ip_country_code            | character varying(255)      |
 merchant_account           | character varying(255)      | default 'ZenGuard'::character varying
 order_reference            | character varying(255)      |
 paid_premium_until         | timestamp without time zone |
 premium_since              | timestamp without time zone |
 premium_until              | timestamp without time zone |
 price_id                   | integer                     |
 product_id                 | integer                     |
 purchase_platform          | character varying(255)      |
 temporary                  | boolean                     | default false
 updated_at                 | timestamp without time zone |
 user_id                    | integer                     |
 migrated_from_old          | boolean                     | default false
Indexes:
    "btr_orders_pkey" PRIMARY KEY, btree (id)

Best Answer

As already mentioned in the comments by @JacobH, consider putting btr_orders.id in an index.

I have assumed you have a scenario similar to this one:

 CREATE TABLE users
 (
     user_id integer PRIMARY KEY
 ) ;

 CREATE TABLE btr_orders
 (
     id SERIAL PRIMARY KEY,
     active_since timestamp without time zone,
     premium_until timestamp without time zone,
     intermediate_premium_until timestamp without time zone,
     user_id integer NOT NULL REFERENCES users(user_id),
     temporary boolean NOT NULL DEFAULT true,
     some_cargo text
 ) ;

My assumption is that the column named id is actually your PRIMARY KEY, and declared it as such. By doing so, I automatically get an index on that column.

I simulate users and orders:

 INSERT INTO users (user_id)
 SELECT generate_series(1, 5000) AS user_id ;     
 INSERT INTO users (user_id) VALUES (40000000) ;

5000 rows affected
1 rows affected
 

 INSERT INTO btr_orders
     (active_since, premium_until, intermediate_premium_until, user_id, temporary)
 SELECT
     active_since, 
     active_since + interval '1 day' AS premium_until,
     active_since + interval '12 hours' AS intermediate_premium_until,
     case when random() < 0.5 then 40000000 else random()*4000+1 end AS user_id,
     random() < 0.5 AS temporary 
 FROM
     generate_series(date '2017-01-01', date '2017-12-31', interval '3 minutes') AS x(active_since);

174721 rows affected
 

The execution plan for your query is:

 | QUERY PLAN                                                                                                                                                                               |
 | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | Limit  (cost=0.42..201.70 rows=1 width=65) (actual time=25.518..25.519 rows=1 loops=1)                                                                                                   |
 |   ->  Index Scan Backward using btr_orders_pkey on btr_orders  (cost=0.42..11876.08 rows=59 width=65) (actual time=25.517..25.517 rows=1 loops=1)                                        |
 |         Filter: ((NOT temporary) AND (user_id = 40000000) AND ((active_since IS NULL) OR (active_since <= now())) AND ((premium_until > now()) OR (intermediate_premium_until > now()))) |
 |         Rows Removed by Filter: 96067                                                                                                                                                    |
 | Planning time: 0.143 ms                                                                                                                                                                  |
 | Execution time: 25.536 ms                                                                                                                                                                |
 

It takes miliseconds.

You can improve the performance by using a relatively simple partial index:

 CREATE INDEX quite_simple_idx ON btr_orders 
     (user_id, id)
 WHERE NOT temporary ;

This index will take care of:

  1. Filtering out all rows where temporary is true. The index will only be used if you filter by temporary = false as one of the factors in your WHERE conditions.

  2. Letting the database quickly seek for user_id = xxxxx.

  3. Once this user_id is found, results will retrievable sorted by id.

This changes the execution plan to this one:

 | QUERY PLAN                                                                                                                                      |
 | :---------------------------------------------------------------------------------------------------------------------------------------------- |
 | Limit  (cost=0.29..19.15 rows=1 width=65) (actual time=9.681..9.681 rows=1 loops=1)                                                             |
 |   ->  Index Scan Backward using quite_simple_idx on btr_orders  (cost=0.29..1546.49 rows=82 width=65) (actual time=9.679..9.679 rows=1 loops=1) |
 |         Index Cond: (user_id = 40000000)                                                                                                        |
 |         Filter: (((active_since IS NULL) OR (active_since <= now())) AND ((premium_until > now()) OR (intermediate_premium_until > now())))     |
 |         Rows Removed by Filter: 23963                                                                                                           |
 | Planning time: 0.179 ms                                                                                                                         |
 | Execution time: 9.700 ms                                                                                                                        |
 

If (and that's a big if) the simulated data I put in for test is similar enough to your actual scenario, you can improve by a factor of (about) 2. In any case, by just having an index on id, you already move from "seconds" to "miliseconds".

If you might have queries with both temporary = true and temporary = false (or temporary and NOT temporary), you can have two partial indexes, one with WHERE temporary and another with WHERE not temporary. You can also have one single index where you would add also the temporary column. Some experimentation is needed to find which is the best performing combination for every case (it depends a lot on the ratio of false/true and how often you change this value).

You can check all these simulations at dbfiddle here