Postgresql – Undesirable Nest Loop vs. Hash Join in PostgreSQL 9.6

join;performancepostgresqlpostgresql-9.6query-performancerow-level-security

I've a trouble with PostgreSQL 9.6 query planning. My query looks like this:

SET role plain_user;

SELECT properties.*
FROM properties
JOIN entries_properties
  ON properties.id = entries_properties.property_id
JOIN structures
  ON structures.id = entries_properties.entry_id 
WHERE structures."STRUKTURBERICHT" != ''
  AND properties."COMPOSITION" LIKE 'Mo%'
  AND (
    properties."NAME" LIKE '%VASP-ase-preopt%'
    OR properties."CALCULATOR_ID" IN (7,22,25)
  )
AND properties."TYPE_ID" IN (6)

I have Row-Level Security enabled for the above-used tables.

with set enable_nestloop = True, the query planner runs
Nested Loop joining with a total running time of about 37 seconds: https://explain.depesz.com/s/59BR
with set enable_nestloop = False, the Hash Join method is used and the query time is about 0.3 sec: https://explain.depesz.com/s/PG8E

I did VACUUM ANALYZE before running the queries, but it didn't help.

I know that it is not a good practice to set enable_nestloop = False, and any other similar options for the planner. But how could I "convince" the planner to use hash joins without disabling nested loops?

Rewriting the query is an option.

If I run the same query under a role that bypasses RLS, then it is executed very fast. The row-level security policy looks like this:

CREATE POLICY properties_select
ON properties
FOR SELECT
USING (
  (
    properties.ouid = get_current_user_id()
    AND properties.ur
  )
  OR (
    properties.ogid in (select get_current_groups_id())
    AND properties.gr
  )
  OR properties.ar
);

Any ideas or suggestions would be greatly appreciated.

Best Answer

What's happening here is the Nested Loop is way off on one side. Nested Loops work really well when one side is very small, such as returning one row. In your query, the planner fumbles here and estimates that a Hash Join will return just one row. Instead, that Hash Join (property_id = id) returns 1,338 rows. This forces 1,338 loops to run on the other side of the Nested Loop which already has 3,444 rows. That's a hella-lot when you're only expecting one (which isn't even much of a "loop"). Anywayy..

Further examination as we move down shows that the Hash Join is really borked by the estimations arising from this,

Filter: (((properties."COMPOSITION")::text ~~ 'Mo%'::text) AND (((properties."NAME")::text ~~ '%VASP-ase-preopt%'::text) OR (properties."CALCULATOR_ID" = ANY ('{7,22,25}'::integer[]))))

PostgreSQL expects that to return one row. But it doesn't. And, that's really your problem. So some options here, that don't involve taking out a sledge hammer and disabling nested_loop

You can add an index or two to properties to help it potentially skip the seq scan entirely, or better estimate the return.

CREATE INDEX ON properties USING ( "TYPE_ID", "CALCULATOR_ID" );
-- the gist_trgm_ops may or may not be needed depending on selectivity of above.
CREATE INDEX ON properties USING GIST (
  "COMPOSITION" gist_trgm_ops,
  "NAME"        gist_trgm_ops
);
ANALYZE properties;

Alternatively, you can move the properties stuff to a CTE or subselect with OFFSET 0 which creates a fence.

WITH t AS (
  SELECT *
  FROM properties.
  WHERE "COMPOSITION" LIKE 'Mo%'
  AND (
    "NAME" LIKE '%VASP-ase-preopt%'
    OR "CALCULATOR_ID" IN (7,22,25)
  )
  AND "TYPE_ID" IN (6)
)
SELECT * FROM structures
JOIN t ON (
  structures.id = entries_properties.entry_id
)

Related Solutions

Forcing Oracle to use hash join for a subquery

I'd try using a WITH clause in combination with the MATERIALIZE hint to force materialization of the subquery first. Something like this:

WITH x as (select /*+ MATERIALIZE */
           from [your subquery join])
SELECT *
FROM table0, x
WHERE table0.id =x.id

Postgresql – Have Postgresql query planner use nested loop w/ indices over hash join

This closely related answer on SO should provide answers to your primary question:
Setting enable_seqscan = off in a single SELECT query

You could use in similar fashion, to disable hash joins for the current transaction:

SET LOCAL enable_hashjoin=off;

But that's not my advice. Read the answer over there.
And this one about statistics and cost settings, too.

More importantly, untangle your query first:

SELECT creation_epoch, user_screen_name, chunk
FROM  (
   SELECT id AS owner_user_id
   FROM   users
   WHERE  reputation > 100000
   ORDER  BY reputation 
   LIMIT  500
   ) u
JOIN   posts p USING (owner_user_id)
JOIN   post_tokenized t USING (id)
WHERE  type = 'tag'
AND    user_screen_name IS NOT NULL;

Should be considerably faster and also make it easier for the query planner to choose the best plan (given sane cost settings and table statistics).

Best Answer

Related Solutions

Forcing Oracle to use hash join for a subquery

Postgresql – Have Postgresql query planner use nested loop w/ indices over hash join

Related Question