Postgresql – Optimizing Postgres query

indexperformancepostgresqlpostgresql-performance

I've got a one to one relation from users to addresses table.
Where one user can have one search address and one verified address.

I've got two indexes on addresses table :

  • Index on state field
  • Index on user_id

I'm trying to fetch addresses for certain users only, and those addresses whose state is other than manual_verification.

Here is my query:

SELECT users.id 
FROM "users" INNER JOIN addresses 
     ON  addresses.user_id = users.id 
     and addresses.type = 'VerifiedAddress' 
WHERE ("users".deleted_at IS NULL) 
  AND (users.id in (11144,10569,21519,783,15671,21726,17787,11665,
                    19579,12226,1324,9413,5461,20981,12906) 
  and addresses.state != 'manual_verification')

Explain for the query above :
http://explain.depesz.com/s/rTj

It takes 37 ms. Sometimes more depending on the number of users.

I think this is a good query, however our team needs to investigate around this, and I'm looking for some optimization tips. I mean I did the one field select, there is a index on user_id (addresses) and state (addresses).

Is there anything else I can do/try?

Update

I found out that this query works much faster:

 SELECT "addresses"."user_id" 
    FROM "addresses" 
    WHERE "addresses"."type" IN ('VerifiedAddress') 
    AND (user_id in (9681,23824,23760,20098,962,14730,12294,9552,534,
                     553,5837,6768,6583,956,24179) and state != 'manual_verification')

Explain for this query:
http://explain.depesz.com/s/nHrr

Best Answer

I'm trying to fetch addresses for certain users only ...

IN does not scale well with big lists. Compare performance to unnest() / JOIN

SELECT a.*
FROM   unnest(ARRAY[9681,23824,23760,20098,962,14730,12294,9552,534,
                    553,5837,6768,6583,956,24179]) AS t(user_id)
JOIN   addresses a USING (user_id)
WHERE  a.type  =  'VerifiedAddress'
AND    a.state <> 'manual_verification'

I hope state and type are at least enum types. Otherwise you should normalize your database design. Have lookup-tables for state and type and only use a small IDs referencing entries there. Makes the addresses table smaller and everything faster.

Assuming all columns NOT NULL for lack of information.

Partial index

If you need it faster still, and if there are more than a few rows with type <> 'VerifiedAddress' or state = 'manual_verification', a partial index would help some more. The gain grows with the percentage of rows you can exclude from the index.

CREATE INDEX addresses_selection_idx ON addresses (user_id)
WHERE  type  =  'VerifiedAddress'
AND    state <> 'manual_verification';

Be sure to use matching WHERE conditions in your query to allow Postgres to use this index. And weigh the benefit of tailored indexes against their cost (maintenance, extra disk space, small penalty for write operations).