PostgreSQL – WITH Clause Not Executing First

ctepostgresqlpostgresql-9.5

I am working in PostgreSQL 9.5 on a SQL query that pulls from two views. Individually, the views take 93 msec and 530 msec to perform, resulting in ~1500 and ~6200 rows, respectively. This seems reasonable given the processing going into the view, which includes some spatial intersections with PostGIS. However, when I perform the query below, it takes ~5 seconds to perform.

From what I understand, using WITH should force the subqueries to be executed once each prior to evaluation in the query below, which should result in a total time under one second rather than five seconds.

If I link to the two views through ODBC/MS Access and use MS Access to perform the query without WITH, it executes in less than a second. This makes me think that the subqueries are not being performed once each prior to joining when I execute them in pgAdmin.

No other CTEs in the views, just subqueries.

Is there a better way to do this, or a hack to force the order of execution?

WITH 
  eu AS (
    SELECT
      e.sample_pk,
      .....
    FROM
      public.samples_exp_unit AS e
    ),
  sr AS (
    SELECT
      r.sample_pk,
      .....
    FROM
      public.sample_results AS r
    )

SELECT
  sr.sample_pk,
  eu.inventory_pk,
  .....
FROM sr INNER JOIN eu
ON sr.sample_pk = eu.sample_pk

Execution plan of full query:

"Hash Join  (cost=963.49..1007.02 rows=9 width=120) (actual time=40.508..54.209 rows=45392 loops=1)"
"  Hash Cond: (sr.sample_pk = eu.sample_pk)"
"  CTE eu"
"    ->  Nested Loop  (cost=41.81..50.67 rows=1 width=24) (actual time=1.296..8.307 rows=1538 loops=1)"
"          Join Filter: (lil.locations_pk = l.locations_pk)"
"          ->  Nested Loop  (cost=41.67..48.22 rows=1 width=28) (actual time=1.289..6.582 rows=1538 loops=1)"
"                ->  Nested Loop  (cost=41.53..47.92 rows=1 width=32) (actual time=1.283..4.852 rows=1538 loops=1)"
"                      ->  Hash Join  (cost=41.38..47.66 rows=1 width=28) (actual time=1.275..3.079 rows=1480 loops=1)"
"                            Hash Cond: (lil.locations_pk = eu_1.locations_pk)"
"                            Join Filter: ((((sd.sample_depth_bottom)::double precision - eu_1.poc_top) > '-0.0001'::double precision) AND ((eu_1.poc_bottom - (sd.sample_depth_top)::double precision) > '-0.0001'::double precision))"
"                            Rows Removed by Join Filter: 371"
"                            ->  Hash Join  (cost=40.23..46.36 rows=5 width=29) (actual time=1.252..1.402 rows=610 loops=1)"
"                                  Hash Cond: (sd.inventory_pk = i.inventory_pk)"
"                                  ->  Seq Scan on sample_description sd  (cost=0.00..5.25 rows=225 width=17) (actual time=0.004..0.049 rows=225 loops=1)"
"                                  ->  Hash  (cost=40.19..40.19 rows=3 width=16) (actual time=1.245..1.245 rows=418 loops=1)"
"                                        Buckets: 1024  Batches: 1  Memory Usage: 28kB"
"                                        ->  Hash Join  (cost=35.92..40.19 rows=3 width=16) (actual time=1.109..1.194 rows=418 loops=1)"
"                                              Hash Cond: (i.inv_loc_pk = lil.inv_loc_pk)"
"                                              ->  Seq Scan on inv i  (cost=0.00..3.63 rows=163 width=8) (actual time=0.004..0.025 rows=163 loops=1)"
"                                              ->  Hash  (cost=35.88..35.88 rows=3 width=8) (actual time=1.102..1.102 rows=390 loops=1)"
"                                                    Buckets: 1024  Batches: 1  Memory Usage: 24kB"
"                                                    ->  Subquery Scan on lil  (cost=35.82..35.88 rows=3 width=8) (actual time=1.002..1.056 rows=390 loops=1)"
"                                                          ->  HashAggregate  (cost=35.82..35.85 rows=3 width=8) (actual time=1.001..1.037 rows=390 loops=1)"
"                                                                Group Key: "*SELECT* 1".locations_pk, "*SELECT* 1".inv_loc_pk"
"                                                                ->  Append  (cost=0.14..35.81 rows=3 width=8) (actual time=0.876..0.929 rows=390 loops=1)"
"                                                                      ->  Result  (cost=0.14..34.78 rows=2 width=8) (actual time=0.876..0.918 rows=389 loops=1)"
"                                                                            ->  HashSetOp Except  (cost=0.14..34.78 rows=2 width=8) (actual time=0.876..0.892 rows=389 loops=1)"
"                                                                                  ->  Append  (cost=0.14..34.77 rows=3 width=8) (actual time=0.028..0.800 rows=389 loops=1)"
"                                                                                        ->  Subquery Scan on "*SELECT* 1"  (cost=0.14..33.75 rows=2 width=8) (actual time=0.028..0.785 rows=389 loops=1)"
"                                                                                              ->  Nested Loop  (cost=0.14..33.73 rows=2 width=8) (actual time=0.026..0.759 rows=389 loops=1)"
"                                                                                                    ->  Seq Scan on locations l_1  (cost=0.00..2.37 rows=37 width=36) (actual time=0.003..0.012 rows=38 loops=1)"
"                                                                                                    ->  Index Scan using sidx_inv_loc_geom on inv_loc i_1  (cost=0.14..0.84 rows=1 width=36) (actual time=0.008..0.018 rows=10 loops=38)"
"                                                                                                          Index Cond: (geom && l_1.geom)"
"                                                                                                          Filter: _st_intersects(geom, l_1.geom)"
"                                                                                                          Rows Removed by Filter: 0"
"                                                                                        ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..1.02 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)"
"                                                                                              ->  Seq Scan on loc_inv_adjust a  (cost=0.00..1.01 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)"
"                                                                                                    Filter: (include = 0)"
"                                                                                                    Rows Removed by Filter: 1"
"                                                                      ->  Seq Scan on loc_inv_adjust b  (cost=0.00..1.01 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)"
"                                                                            Filter: (include = 1)"
"                            ->  Hash  (cost=1.07..1.07 rows=7 width=24) (actual time=0.009..0.009 rows=11 loops=1)"
"                                  Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"                                  ->  Seq Scan on exp_unit eu_1  (cost=0.00..1.07 rows=7 width=24) (actual time=0.004..0.006 rows=11 loops=1)"
"                      ->  Index Scan using bysampledescription on sample s  (cost=0.14..0.25 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=1480)"
"                            Index Cond: (sample_description_pk = sd.sample_description_pk)"
"                ->  Index Only Scan using inv_loc_pkey on inv_loc il  (cost=0.14..0.29 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1538)"
"                      Index Cond: (inv_loc_pk = i.inv_loc_pk)"
"                      Heap Fetches: 1538"
"          ->  Index Only Scan using pk_locations on locations l  (cost=0.14..2.44 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1538)"
"                Index Cond: (locations_pk = eu_1.locations_pk)"
"                Heap Fetches: 1538"
"  CTE sr"
"    ->  WindowAgg  (cost=848.77..894.49 rows=1829 width=92) (actual time=31.316..35.704 rows=6234 loops=1)"
"          ->  Sort  (cost=848.77..853.34 rows=1829 width=92) (actual time=31.312..31.621 rows=6234 loops=1)"
"                Sort Key: sub_query_union.sample_pk, sub_query_union.analyte_pk, sub_query_union.type_priority, sub_query_union.num_priority"
"                Sort Method: quicksort  Memory: 1069kB"
"                ->  Subquery Scan on sub_query_union  (cost=713.08..749.66 rows=1829 width=92) (actual time=26.511..28.164 rows=6234 loops=1)"
"                      ->  HashAggregate  (cost=713.08..731.37 rows=1829 width=106) (actual time=26.510..27.750 rows=6234 loops=1)"
"                            Group Key: "*SELECT* 1_1".sample_pk, "*SELECT* 1_1".analyte_pk, "*SELECT* 1_1".results_pk, (1), "*SELECT* 1_1".result, "*SELECT* 1_1".units_pk, (NULL::double precision), (NULL::double precision), ('Rep_Conc'::character varying), (...)"
"                            ->  Append  (cost=46.85..658.21 rows=1829 width=106) (actual time=3.748..22.017 rows=6234 loops=1)"
"                                  ->  Subquery Scan on "*SELECT* 1_1"  (cost=46.85..62.83 rows=1 width=24) (actual time=3.748..5.539 rows=134 loops=1)"
"                                        ->  Nested Loop  (cost=46.85..62.82 rows=1 width=24) (actual time=3.746..5.485 rows=134 loops=1)"
"                                              Join Filter: (lil_1.locations_pk = l_2.locations_pk)"
"                                              ->  Nested Loop  (cost=46.71..62.60 rows=1 width=32) (actual time=3.736..4.981 rows=134 loops=1)"
"                                                    ->  Nested Loop  (cost=46.57..62.31 rows=1 width=40) (actual time=3.727..4.584 rows=134 loops=1)"
"                                                          ->  Hash Join  (cost=46.42..62.05 rows=1 width=40) (actual time=3.718..4.177 rows=128 loops=1)"
"                                                                Hash Cond: (rc.locations_pk = lil_1.locations_pk)"
"                                                                Join Filter: ((((sd_1.sample_depth_bottom)::double precision - rc.depth_top) > '-0.0001'::double precision) AND ((rc.depth_bottom - (sd_1.sample_depth_top)::double precision) > '-0.0001'::doub (...)"
"                                                                Rows Removed by Join Filter: 94"
"                                                                ->  Seq Scan on rep_conc rc  (cost=0.00..13.60 rows=360 width=40) (actual time=0.005..0.006 rows=1 loops=1)"
"                                                                ->  Hash  (cost=46.36..46.36 rows=5 width=25) (actual time=3.683..3.683 rows=610 loops=1)"
"                                                                      Buckets: 1024  Batches: 1  Memory Usage: 43kB"
"                                                                      ->  Hash Join  (cost=40.23..46.36 rows=5 width=25) (actual time=3.124..3.444 rows=610 loops=1)"
"                                                                            Hash Cond: (sd_1.inventory_pk = i_2.inventory_pk)"
"                                                                            ->  Seq Scan on sample_description sd_1  (cost=0.00..5.25 rows=225 width=17) (actual time=0.006..0.093 rows=225 loops=1)"
"                                                                            ->  Hash  (cost=40.19..40.19 rows=3 width=16) (actual time=3.111..3.111 rows=418 loops=1)"
"                                                                                  Buckets: 1024  Batches: 1  Memory Usage: 28kB"
"                                                                                  ->  Hash Join  (cost=35.92..40.19 rows=3 width=16) (actual time=2.780..2.976 rows=418 loops=1)"
"                                                                                        Hash Cond: (i_2.inv_loc_pk = lil_1.inv_loc_pk)"
"                                                                                        ->  Seq Scan on inv i_2  (cost=0.00..3.63 rows=163 width=8) (actual time=0.007..0.059 rows=163 loops=1)"
"                                                                                        ->  Hash  (cost=35.88..35.88 rows=3 width=8) (actual time=2.761..2.761 rows=390 loops=1)"
"                                                                                              Buckets: 1024  Batches: 1  Memory Usage: 24kB"
"                                                                                              ->  Subquery Scan on lil_1  (cost=35.82..35.88 rows=3 width=8) (actual time=2.522..2.651 rows=390 loops=1)"
"                                                                                                    ->  HashAggregate  (cost=35.82..35.85 rows=3 width=8) (actual time=2.522..2.600 rows=390 loops=1)"
"                                                                                                          Group Key: "*SELECT* 1_2".locations_pk, "*SELECT* 1_2".inv_loc_pk"
"                                                                                                          ->  Append  (cost=0.14..35.81 rows=3 width=8) (actual time=2.207..2.334 rows=390 loops=1)"
"                                                                                                                ->  Result  (cost=0.14..34.78 rows=2 width=8) (actual time=2.207..2.304 rows=389 loops=1)"
"                                                                                                                      ->  HashSetOp Except  (cost=0.14..34.78 rows=2 width=8) (actual time=2.206..2.244 rows=389 loops=1)"
"                                                                                                                            ->  Append  (cost=0.14..34.77 rows=3 width=8) (actual time=0.048..1.986 rows=389 loops=1)"
"                                                                                                                                  ->  Subquery Scan on "*SELECT* 1_2"  (cost=0.14..33.75 rows=2 width=8) (actual time=0.048..1.954 rows=389 loops=1)"
"                                                                                                                                        ->  Nested Loop  (cost=0.14..33.73 rows=2 width=8) (actual time=0.047..1.889 rows=389 loops=1)"
"                                                                                                                                              ->  Seq Scan on locations l_3  (cost=0.00..2.37 rows=37 width=36) (actual time=0.005..0.020 rows=38 loops=1)"
"                                                                                                                                              ->  Index Scan using sidx_inv_loc_geom on inv_loc i_3  (cost=0.14..0.84 rows=1 width=36) (actual time=0.019..0.046 (...)"
"                                                                                                                                                    Index Cond: (geom && l_3.geom)"
"                                                                                                                                                    Filter: _st_intersects(geom, l_3.geom)"
"                                                                                                                                                    Rows Removed by Filter: 0"
"                                                                                                                                  ->  Subquery Scan on "*SELECT* 2_2"  (cost=0.00..1.02 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"                                                                                                                                        ->  Seq Scan on loc_inv_adjust a_1  (cost=0.00..1.01 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"                                                                                                                                              Filter: (include = 0)"
"                                                                                                                                              Rows Removed by Filter: 1"
"                                                                                                                ->  Seq Scan on loc_inv_adjust b_1  (cost=0.00..1.01 rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=1)"
"                                                                                                                      Filter: (include = 1)"
"                                                          ->  Index Scan using bysampledescription on sample s_1  (cost=0.14..0.25 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=128)"
"                                                                Index Cond: (sample_description_pk = sd_1.sample_description_pk)"
"                                                    ->  Index Only Scan using inv_loc_pkey on inv_loc il_1  (cost=0.14..0.29 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=134)"
"                                                          Index Cond: (inv_loc_pk = i_2.inv_loc_pk)"
"                                                          Heap Fetches: 134"
"                                              ->  Index Only Scan using pk_locations on locations l_2  (cost=0.14..0.20 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=134)"
"                                                    Index Cond: (locations_pk = rc.locations_pk)"
"                                                    Heap Fetches: 134"
"                                  ->  Subquery Scan on "*SELECT* 2_1"  (cost=266.84..296.55 rows=914 width=106) (actual time=3.855..7.460 rows=1862 loops=1)"
"                                        ->  WindowAgg  (cost=266.84..287.41 rows=914 width=106) (actual time=3.853..7.089 rows=1862 loops=1)"
"                                              ->  Sort  (cost=266.84..269.13 rows=914 width=106) (actual time=3.844..3.959 rows=1862 loops=1)"
"                                                    Sort Key: ar.sample_pk, ar.analyte_pk, ar.result DESC"
"                                                    Sort Method: quicksort  Memory: 196kB"
"                                                    ->  Nested Loop  (cost=44.81..221.89 rows=914 width=106) (actual time=0.309..2.041 rows=1862 loops=1)"
"                                                          ->  Seq Scan on detected d  (cost=0.00..1.09 rows=1 width=8) (actual time=0.004..0.007 rows=3 loops=1)"
"                                                                Filter: (detected_yn = 1)"
"                                                                Rows Removed by Filter: 4"
"                                                          ->  Bitmap Heap Scan on analytical_results ar  (cost=44.81..199.48 rows=2133 width=106) (actual time=0.103..0.573 rows=621 loops=3)"
"                                                                Recheck Cond: (detected_pk = d.detected_pk)"
"                                                                Heap Blocks: exact=127"
"                                                                ->  Bitmap Index Scan on by_detected  (cost=0.00..44.28 rows=2133 width=0) (actual time=0.088..0.088 rows=665 loops=3)"
"                                                                      Index Cond: (detected_pk = d.detected_pk)"
"                                  ->  Subquery Scan on "*SELECT* 3"  (cost=266.84..298.83 rows=914 width=106) (actual time=5.384..8.761 rows=4238 loops=1)"
"                                        ->  WindowAgg  (cost=266.84..289.69 rows=914 width=106) (actual time=5.384..8.425 rows=4238 loops=1)"
"                                              ->  Sort  (cost=266.84..269.13 rows=914 width=106) (actual time=5.379..5.477 rows=4238 loops=1)"
"                                                    Sort Key: ar_1.sample_pk, ar_1.analyte_pk, ar_1.reporting_limit, ar_1.method_detection_limit"
"                                                    Sort Method: quicksort  Memory: 524kB"
"                                                    ->  Nested Loop  (cost=44.81..221.89 rows=914 width=106) (actual time=0.562..2.467 rows=4238 loops=1)"
"                                                          ->  Seq Scan on detected d_1  (cost=0.00..1.09 rows=1 width=8) (actual time=0.004..0.005 rows=4 loops=1)"
"                                                                Filter: (detected_yn = 0)"
"                                                                Rows Removed by Filter: 3"
"                                                          ->  Bitmap Heap Scan on analytical_results ar_1  (cost=44.81..199.48 rows=2133 width=106) (actual time=0.139..0.524 rows=1060 loops=4)"
"                                                                Recheck Cond: (detected_pk = d_1.detected_pk)"
"                                                                Heap Blocks: exact=128"
"                                                                ->  Bitmap Index Scan on by_detected  (cost=0.00..44.28 rows=2133 width=0) (actual time=0.129..0.129 rows=1093 loops=4)"
"                                                                      Index Cond: (detected_pk = d_1.detected_pk)"
"  ->  CTE Scan on sr  (cost=0.00..36.58 rows=1829 width=100) (actual time=31.319..38.032 rows=6234 loops=1)"
"  ->  Hash  (cost=0.02..0.02 rows=1 width=24) (actual time=9.179..9.179 rows=1538 loops=1)"
"        Buckets: 2048 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 101kB"
"        ->  CTE Scan on eu  (cost=0.00..0.02 rows=1 width=24) (actual time=1.298..8.909 rows=1538 loops=1)"
"Planning time: 10.514 ms"
"Execution time: 56.328 ms"

Execution plan of the view samples_exp_unit: (including all 88 rows takes the post over the character limit)

"Planning time: 5.594 ms"
"Execution time: 58.011 ms"

Execution times of 4.9 seconds from executing in pgAdmin (version III and IV both take this amount of time). When I post the SQL into MS Access as a pass-through query, it executes in hardly any time at all. I am starting to think this is a pgAdmin issue rather than Postgres. Postgres is running on the local machine for development, so it shouldn't be the network.

Best Answer

Since public.samples_exp_unit and public.sample_results are the views, you have added a level of indirection in your queries.

You can rewrite to be simpler:

WITH 
  eu AS (TABLE public.samples_exp_unit),
  sr AS (TABLE public.sample_results)
SELECT
  sr.sample_pk,
  eu.inventory_pk,
  .....
FROM sr INNER JOIN eu
  ON sr.sample_pk = eu.sample_pk ;

and test if this changes the plan and improves efficiency.

Another thing to test is to avoid CTEs. Especially because they are an optimization fence, they sometimes result in less efficient plans (and sometimes in more efficient plans!) Rewrite and test without a CTE:

SELECT
  sr.sample_pk,
  eu.inventory_pk,
  .....
FROM public.sample_results AS sr 
  INNER JOIN 
     public.samples_exp_unit AS eu
  ON sr.sample_pk = eu.sample_pk ;

Also try the queries through psql. It will show if PgAdmin is the issue.

It could be that pgAdmin is taking a long time to render the results of the query, or is talking over a slow network and so takes a long time to fetch the results. It is also possible pgAdmin is doing something that changes the execution plan, like silently setting some parameters to non-default values, or preparing the query into a cursor before execution.