PostgreSQL – Calculate value differences of rows before and after, bounded by multiple reference rows

postgresql, window functions

Suppose I have a user travel table like this:


 id | start_time | end_time | location
----+------------+----------+----------
  1 |          1 |        2 | Loc_A
  1 |          3 |        4 | Loc_B
  1 |          5 |        6 | Loc_A
  1 |          7 |        8 | Loc_C
  1 |          9 |       10 | Loc_A
  1 |         13 |       15 | Loc_B
  1 |         16 |       17 | Loc_C
  2 |          1 |        3 | Loc_D

Suppose we use "Loc_B" as the reference point, i.e. the point that the user must pass. For user 1 this gives 2 nodes at "Loc_B" (rows 2 and 6). For each node, the left tree contains the rows before the reference row, back to the previous reference row if there is one, and the right tree contains the rows after the reference row up to the next reference row (e.g. for Loc_B at row 2, everything up to Loc_B at row 6).

For the 1st node (row number 2):

left tree [ Loc_A ]
right tree [ Loc_A, Loc_C, Loc_A ].

For the 2nd node (row number 6):

left tree [ Loc_A, Loc_C, Loc_A ]
right tree [ Loc_C ].

The time diff is calculated as follows: if a row is in the left tree, the diff is the reference row's start time minus that row's end time, with a negative sign. If a row is in the right tree, the diff is that row's start time minus the reference row's end time.

Time diff value for 1st node will be:

left_tree [ -1],
right_tree [ +1, +3, +5].

Time diff value for 2nd node will be:

left_tree [ -7, -5, -3 ],
right_tree [ +1 ].
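
To make the rule concrete, here is a rough sketch that reproduces the per-node values above for user 1. It is only meant to spell out the arithmetic, not to be a general solution. The `travel` CTE restates the sample rows, and the names `ref`, `node`, `prev_ref_end` and `next_ref_start` are made up for this illustration.

```sql
WITH travel (id, start_time, end_time, location) AS (
   VALUES (1,  1,  2, 'Loc_A')
        , (1,  3,  4, 'Loc_B')
        , (1,  5,  6, 'Loc_A')
        , (1,  7,  8, 'Loc_C')
        , (1,  9, 10, 'Loc_A')
        , (1, 13, 15, 'Loc_B')
        , (1, 16, 17, 'Loc_C')
)
, ref AS (  -- one row per reference row (node), with its neighbouring nodes
   SELECT id
        , start_time AS ref_start
        , end_time   AS ref_end
        , row_number()     OVER w AS node
        , lag(end_time)    OVER w AS prev_ref_end
        , lead(start_time) OVER w AS next_ref_start
   FROM   travel
   WHERE  location = 'Loc_B'
   WINDOW w AS (PARTITION BY id ORDER BY start_time)
)
SELECT r.node, t.start_time, t.end_time, t.location
     , CASE WHEN t.end_time < r.ref_start
            THEN -(r.ref_start - t.end_time)  -- left tree: negative sign
            ELSE t.start_time - r.ref_end     -- right tree: positive sign
       END AS diff
FROM   ref r
JOIN   travel t ON t.id = r.id AND t.location <> 'Loc_B'
WHERE  (t.end_time < r.ref_start                    -- left tree of this node
        AND (r.prev_ref_end IS NULL OR t.start_time > r.prev_ref_end))
   OR  (t.start_time > r.ref_end                    -- right tree of this node
        AND (r.next_ref_start IS NULL OR t.end_time < r.next_ref_start))
ORDER  BY r.node, t.start_time;

--  node | start_time | end_time | location | diff
-- ------+------------+----------+----------+------
--     1 |          1 |        2 | Loc_A    |   -1
--     1 |          5 |        6 | Loc_A    |    1
--     1 |          7 |        8 | Loc_C    |    3
--     1 |          9 |       10 | Loc_A    |    5
--     2 |          5 |        6 | Loc_A    |   -7
--     2 |          7 |        8 | Loc_C    |   -5
--     2 |          9 |       10 | Loc_A    |   -3
--     2 |         16 |       17 | Loc_C    |    1
```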

For the final diff column, each pair of consecutive nodes is compared: node[0]'s right tree with node[1]'s left tree. Both contain the same rows, so each of those rows has two candidate values. Take the one with the smaller absolute value and keep its original sign (negative if it came from a left tree, positive if it came from a right tree).

For user id 2, diff is NIL because the user never passed through the reference point (Loc_B).


 id | start_time | end_time | location | diff
----+------------+----------+----------+------
  1 |          1 |        2 | Loc_A    | -1
  1 |          3 |        4 | Loc_B    | N/A
  1 |          5 |        6 | Loc_A    | +1
  1 |          7 |        8 | Loc_C    | +3
  1 |          9 |       10 | Loc_A    | -3
  1 |         13 |       15 | Loc_B    | N/A
  1 |         16 |       17 | Loc_C    | +1
  2 |          1 |        3 | Loc_D    | NIL
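
For the three rows between the two Loc_B rows, the comparison could be written roughly like this. The column names `right_diff` and `left_diff` are only illustrative, and falling back to the left value on a tie is an assumption, since the rule above does not say what should happen then.

```sql
-- candidate pairs for the rows starting at 5, 7 and 9:
-- (diff from node 1's right tree, diff from node 2's left tree)
SELECT right_diff
     , left_diff
     , CASE WHEN abs(right_diff) < abs(left_diff)
            THEN right_diff
            ELSE left_diff
       END AS diff
FROM  (VALUES (1, -7), (3, -5), (5, -3)) AS v(right_diff, left_diff);

--  right_diff | left_diff | diff
-- ------------+-----------+------
--           1 |        -7 |    1
--           3 |        -5 |    3
--           5 |        -3 |   -3
```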

Another note: left_tree and right_tree can be empty when Loc_B is the first or last record (partitioned by id).

May I know whether it is possible to do this?

Best Answer

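SELECT *
      -- reference rows and users without any Loc_B get NULL
      ,CASE WHEN location = 'Loc_B' OR (right_tree = 0 AND left_tree = 0) THEN NULL::int
            -- otherwise keep the candidate with the smaller absolute value (@ is abs)
            ELSE CASE WHEN @right_diff < @left_diff THEN right_diff ELSE left_diff END
       END AS min_diff
FROM  (
   SELECT *
         -- 1000 is a sentinel meaning "no Loc_B on that side"
         ,CASE WHEN right_tree > 0 THEN start_time - right_end ELSE 1000 END AS right_diff
         ,CASE WHEN left_tree  > 0 THEN end_time  - left_start ELSE 1000 END AS left_diff
   FROM  (
      SELECT *
            -- end of the preceding Loc_B row / start of the following Loc_B row;
            -- id is part of the partition so that rows of different users are not mixed
            ,first_value(end_time)   OVER (PARTITION BY id, right_tree ORDER BY start_time) AS right_end
            ,first_value(start_time) OVER (PARTITION BY id, left_tree  ORDER BY start_time DESC) AS left_start
      FROM  (
         SELECT *
               -- running count of Loc_B rows, counted from the start and from the end
               ,count(location = 'Loc_B' OR NULL) OVER (PARTITION BY id ORDER BY start_time) AS right_tree
               ,count(location = 'Loc_B' OR NULL) OVER (PARTITION BY id ORDER BY start_time DESC) AS left_tree
         FROM   travel
         ) a
      ) b
   ) c
ORDER  BY id, start_time;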

Produces your result exactly.

1000 is just a stand-in for "a higher value than any other". Since your actual problem seems to operate on times and intervals, infinity would be the perfect choice.
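
As a rough illustration of the infinity idea, assume the diffs are expressed as seconds in `double precision` (an assumption made for this sketch, since the sample data uses plain integers). `'infinity'::float8` can then take the place of 1000 and never wins the comparison against a real value. The three rows are made-up numbers.

```sql
SELECT right_diff
     , left_diff
     , CASE WHEN abs(right_diff) < abs(left_diff)
            THEN right_diff
            ELSE left_diff
       END AS min_diff
FROM  (
   VALUES (60::float8        , 'infinity'::float8)  -- no Loc_B after this row
        , ('infinity'::float8, -120::float8)        -- no Loc_B before this row
        , (300::float8       , -180::float8)        -- Loc_B on both sides
   ) AS v(right_diff, left_diff);

--  right_diff | left_diff | min_diff
-- ------------+-----------+----------
--          60 |  Infinity |       60
--    Infinity |      -120 |     -120
--         300 |      -180 |     -180
```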

Related answer with a detailed explanation of how groups (right_tree and left_tree here) are formed:
Select longest continuous sequence
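
To see the grouping for this particular data, the innermost subquery can be run on its own. The sketch below inlines the sample rows from the question as a CTE named `travel` so that it is self-contained. Against a real table the CTE would simply be dropped.

```sql
WITH travel (id, start_time, end_time, location) AS (
   VALUES (1,  1,  2, 'Loc_A')
        , (1,  3,  4, 'Loc_B')
        , (1,  5,  6, 'Loc_A')
        , (1,  7,  8, 'Loc_C')
        , (1,  9, 10, 'Loc_A')
        , (1, 13, 15, 'Loc_B')
        , (1, 16, 17, 'Loc_C')
        , (2,  1,  3, 'Loc_D')
)
SELECT *
       -- "expr OR NULL" is NULL unless expr is true, and count() skips NULLs,
       -- so each column is a running count of Loc_B rows in that direction
     , count(location = 'Loc_B' OR NULL) OVER (PARTITION BY id ORDER BY start_time)      AS right_tree
     , count(location = 'Loc_B' OR NULL) OVER (PARTITION BY id ORDER BY start_time DESC) AS left_tree
FROM   travel
ORDER  BY id, start_time;

--  id | start_time | end_time | location | right_tree | left_tree
-- ----+------------+----------+----------+------------+-----------
--   1 |          1 |        2 | Loc_A    |          0 |         2
--   1 |          3 |        4 | Loc_B    |          1 |         2
--   1 |          5 |        6 | Loc_A    |          1 |         1
--   1 |          7 |        8 | Loc_C    |          1 |         1
--   1 |          9 |       10 | Loc_A    |          1 |         1
--   1 |         13 |       15 | Loc_B    |          2 |         1
--   1 |         16 |       17 | Loc_C    |          2 |         0
--   2 |          1 |        3 | Loc_D    |          0 |         0
```

Rows that share the same right_tree value belong to the right tree of the same Loc_B row, and rows that share the same left_tree value belong to the left tree of the same Loc_B row. A value of 0 means there is no Loc_B on that side.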

SQL Fiddle.