Postgresql – SELECT operation with four joins results in bad performance

performancepostgresqlpostgresql-10query-performance

I'm having problems with a slow query. The query purpose is to get doers ids for X job who didn't exceed specified limits on realizations and also are in range of possible job places.

Resources

  • Query:

      SELECT DISTINCT doers.id 
      FROM doers
      JOIN doer_locations dl 
          ON dl.doer_id = doers.id
      JOIN job_places jp 
          ON (jp.lat - 0.3147625620715557) < dl.lat 
         AND (jp.lat + 0.3147625620715557) > dl.lat 
         AND (jp.lng - 0.5001626620527362) < dl.lng 
         AND (jp.lng + 0.5001626620527362) > dl.lng
      LEFT JOIN job_realizations jr 
          ON jr.job_place_id = jp.id 
         AND jr.status IN (1, 2, 3, 4)
      LEFT JOIN job_realizations jrpd 
          ON jrpd.job_place_id = jp.id 
         AND jrpd.doer_id = doers.id 
         AND jrpd.status IN (1, 2, 3, 4)
      WHERE (jp.job_id = 1 AND doers.id IS NOT NULL)
      GROUP BY doers.id, jp.id
      HAVING COUNT(DISTINCT jr.id) < jp.realizations_per_place 
         AND COUNT(DISTINCT jrpd.id) < jp.realizations_per_place_per_doer
    
  • Depesz explain

  • Raw explain analyze

  • Simplified Schema

Consideration

I'm not sure if I read the explain correctly but it seems it loses on performance especially when it calculates stuff on the run also HAVING COUNT(DISTINCT) seems pretty expensive.

Additional information

The type of both the lat and long columns is float.

Best Answer

    --I think this is mandatory for your query performance:
    --Because you do joins using this columns from parent to child 

    create index on doer_locations(doer_id);
    create index on job_realizations(job_place_id);
    create index on job_realizations(doer_id);

    --Maybe very big and slowdown other operations... 
    --create index on job_realizations(lat);
    --create index on doer_locations(lat); 
    --create index on job_realizations(lng);
    --create index on doer_locations(lng); 

    --Maybe not mandatory: 
    create index on job_realization(realizations_per_place);
    create index on job_realization(realizations_per_place_per_doer); 

SELECT DISTINCT doers.id FROM doers
    JOIN doer_locations dl ON dl.doer_id = doers.id
    JOIN job_places jp ON (jp.lat - 0.3147625620715557) < dl.lat AND (jp.lat + 0.3147625620715557) > dl.lat AND (jp.lng - 0.5001626620527362) < dl.lng AND (jp.lng + 0.5001626620527362) > dl.lng
    LEFT JOIN job_realizations jr ON jr.job_place_id = jp.id
    LEFT JOIN job_realizations jrpd ON jrpd.job_place_id = jp.id AND jrpd.doer_id = doers.id
    WHERE (jp.job_id = 1 AND doers.id IS NOT NULL)
    GROUP BY doers.id, jp.id
    HAVING COUNT(DISTINCT jr.id) < jp.realizations_per_place AND COUNT(DISTINCT jrpd.id) < jp.realizations_per_place_per_doer

Please try and if it solve your problem select it as the right answer.