PostgreSQL – Running a CTE Query in a Loop Using PL/pgSQL

plpgsqlpostgresql

I'm trying to execute query that is repeatedly called in a loop using plpgsql -the loop iterates over another table (named coordinates) that contains top left and bottom right latitude/longitude coordinates of grids, I pass the top left and bottom right latitude / longitude values into my CTE in order to display the amount of requests (hourly) made within those coordinates for given two timestamps-. However, I cannot display the results of my CTE and I get the following error message:

ERROR:  query has no destination for result data
HINT:  If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT:  PL/pgSQL function "inline_code_block" line 6 at SQL statement

What should I change here in order to make the entire query work as needed? My code is as below:

DO $$
<<outer_scope>> DECLARE
  coords RECORD;
BEGIN
    FOR coords IN SELECT topleftlat, topleftlon, bottomrightlat, bottomrightlon FROM coordinates LOOP
        WITH cal AS (
        SELECT generate_series('2011-02-02 00:00:00'::timestamp ,
                   '2012-04-01 05:00:00'::timestamp , 
                   '1 hour'::interval) AS stamp
    ),
    qqq AS (
      SELECT date_trunc('hour', calltime) AS stamp, count(*) AS zcount
      FROM mytable
      WHERE calltime >= '2011-02-13 11:55:11' 
        AND calltime <= '2012-02-13 01:02:21'
        AND (calltime::time >= '11:55:11' 
        OR calltime::time <= '01:02:21')
        AND lat BETWEEN coords.bottomrightlat AND coords.topleftlat
        AND lon BETWEEN coords.topleftlon AND coords.bottomrightlon
     GROUP BY date_trunc('hour', calltime)
    )
    SELECT cal.stamp, COALESCE (qqq.zcount, 0) AS zcount
    FROM cal
    LEFT JOIN qqq ON cal.stamp = qqq.stamp
    WHERE cal.stamp >= '2011-02-13 11:00:00' 
      AND cal.stamp <= '2012-02-13 01:02:21' 
      AND (
        extract ('hour' from cal.stamp) >= extract ('hour' from '2011-02-13 11:00:00'::timestamp) or
        extract ('hour' from cal.stamp) <= extract ('hour' from '2012-02-13 01:02:21'::timestamp) 
      )
    ORDER BY stamp ASC;

    END LOOP;
END;
$$;

Best Answer

The DO command has no facility to actually return data (except with RAISE, or you could write to a (temp) table .. ).

You need to create a PL/pgSQL function that can define a return type with RETURNS and call it.

You could return the result with RETURN QUERY EXECUTE. But I suspect the whole operation can be simplified ...

Rewrite as single SQL query

You probably don't need plpgsql or loops at all. Consider this plain SQL query instead:

WITH v AS (
   SELECT '2011-02-13 11:55:11'::timestamp AS _from -- provide times once
         ,'2012-02-13 01:02:21'::timestamp AS _to
   )
, q AS (
   SELECT c.coordinates_id
        , date_trunc('hour', t.calltime) AS stamp
        , count(*) AS zcount
   FROM   v
   JOIN   mytable t ON  t.calltime BETWEEN v._from AND v._to
                   AND (t.calltime::time >= v._from::time OR
                        t.calltime::time <= v._to::time)
   JOIN   coordinates c ON (t.lat, t.lon) 
                   BETWEEN (c.bottomrightlat, c.topleftlon)
                       AND (c.topleftlat, c.bottomrightlon)
   GROUP  BY c.coordinates_id, date_trunc('hour', t.calltime)
   )
, cal AS (
   SELECT generate_series(GREATEST('2011-02-02 00:00:00'::timestamp, v._from)
                        , LEAST('2012-04-01 05:00:00'::timestamp, v._to)
                        , '1 hour'::interval) AS stamp
   FROM v
   )
SELECT q.coordinates_id, cal.stamp, COALESCE (q.zcount, 0) AS zcount
FROM   v, cal
LEFT   JOIN q USING (stamp)
WHERE (cal.stamp::time >= v._from::time OR
       cal.stamp::time <= v._to::time)
ORDER  BY q.coordinates_id, stamp;
  • Instead of looping through rows in table coordinates, join to the CTE and produce the whole result in one go.

  • As you aggregate per row of coordinates we need the primary key of this table (or any other unique set of columns) I assume a pk named coordinates_id.

  • I added the CTE v (for "values") on top to provide _from and _to timestamps once only.

  • I use _from and _to to limit the time range of the calender right away, instead of adding WHERE clauses to trim the surplus in the final SELECT.

    GREATEST('2011-02-02 00:00:00'::timestamp, v._from)
    LEAST('2012-04-01 05:00:00'::timestamp, v._to)
    
  • I use "ad-hoc rows" like demonstrated in this related answer by @kgrittn for a much simpler JOIN condition:

         ON (t.lat, t.lon) 
    BETWEEN (c.bottomrightlat, c.topleftlon)
        AND (c.topleftlat, c.bottomrightlon)
    
  • I cast to time (::time) instead of using extract ('hour' ..), because it's simpler and faster.

I am not 100 % sure this is exactly what you are after, but it should be very close.

Related Question