Postgresql – Conditional Select from Window Function

postgresqlwindow functions

I'm using lead() to calculate the difference in values between two records in the same table. This works fine until I need to not report on one of the records based on some other value in the window.

Given the following table.

>  t_id   |     update_time     |     last_reset      | bytes_tx
> ----------+---------------------+---------------------+-----------   
> 3680000 | 2019-04-28 19:58:05 | 2019-02-21 15:08:27 | 188960431  
> 3680000 | 2019-04-28 20:13:06 | 2019-02-21 15:08:27 | 188991801  
> 3680000 | 2019-04-28 20:28:08 | 2019-02-21 15:08:27 | 189022965  
> 3680000 | 2019-04-28 20:43:07 | 2019-02-21 15:08:27 | 189057066  
> 3680000 | 2019-04-28 20:58:06 | 2019-04-19 12:18:17 | 89088869  
> 3680000 | 2019-04-28 21:10:11 | 2019-04-19 12:18:17 | 89114302  
> 3680000 | 2019-04-28 21:13:03 | 2019-04-19 12:18:17 | 89120288  
> 3680000 | 2019-04-28 21:28:05 | 2019-04-19 12:18:17 | 89151994  
> 3680000 | 2019-04-28 21:43:06 | 2019-04-19 12:18:17 | 89183203  
> 3680000 | 2019-04-28 21:58:03 | 2019-04-19 12:18:17 | 89215093

I can get what I want from this using lead() to report the difference of the current record, and the upcoming record.

select 
    lead(update_time,1) over(partition by t_id order by update_time) - update_time as interval,
    lead(bytes_tx,1) over (partition by t_id order by update_time) - bytes_tx as bytes_diff
from table

 interval | bytes_tx
----------+----------
 00:15:01 |    31370
 00:15:02 |    31164
 00:14:59 |    34101

That works well right up until last_reset changes, which zeros my bytes_tx count resulting in a negative integer being returned.

I'd like to simply skip records where lead(last_reset) isn't equal to current last_reset, but I can't use a window function in the WHERE clause.

I'm essentially looking for the equivalent of

select 
    lead(update_time,1) over(partition by t_id order by update_time) - update_time as interval,
    lead(bytes_tx,1) over (partition by t_id order by update_time) - bytes_tx as bytes_diff
from 
   table
where
  lead(last_reset,1) over (partition by t_id order by update_time) = last_reset

And I'm pretty stumped right now.

Any clues for the clueless?

Best Answer

You need to put your current query into a subquery, and include an expression for the "lead last_reset" you want to put into the where clause.

Something like the following (untested):

select interval, bytes_diff
from (
  select 
    lead(update_time) over w - update_time as interval,
    lead(bytes_tx) over w - bytes_tx as bytes_diff,
    lead(last_reset) over w as lead_last_reset,
    last_reset
  from 
    table
  window w as (partition by t_id order by update_time)
)
where lead_last_reset = last_reset;

NB "lead"ing by 1 is so common that it's the default and can be left out.

NB2 All of your window functions are operating over the same window. So I made that explicit by defining the window in its own clause.