Postgresql – how to extract difference between timestamps

pgadminpostgresqlquerysyntax

SQL greenhorn alarm:
I have a table with timestamp information (44 rows) and I now would like to calculate and extract the delta between any two consecutive rows. How can I do this? I select my data with a query like:

SELECT * from "table" where "RecognitionTimestamp" >= '2018-10-03 00:00:00' and "RecognitionTimestamp" <= '2018-10-04 9:00:00'

I'm using pgAdmin4 to submit the query and look at the data.

Best Answer

You can use Windows Functions which provides calculation across a set of table rows that are somehow related to the current row. The list of functions are in General-Purpose Window Functions section of the documentation and The lead() and lag() functions are answer of your question. Your query will should like this;

SELECT *,
  "RecognitionTimestamp" - lag("RecognitionTimestamp") OVER(ORDER BY "RecognitionTimestamp") AS delta_previous,
  lead("RecognitionTimestamp") OVER(ORDER BY "RecognitionTimestamp") - "RecognitionTimestamp" AS delta_next
FROM "table" 
WHERE
  "RecognitionTimestamp" >= '2018-10-03 00:00:00' AND 
  "RecognitionTimestamp" <= '2018-10-04 9:00:00'

Related Solutions

Postgresql – Most efficient way to retrieve data by timestamps

To check if an index is working, use SELECT COUNT(*) rather than SELECT *, otherwise you're not only measuring index performance but also the time it takes to transfer all the data across your client application, which in this case is a whooping 50,000 rows!

You can significantly simplify your query and index by date datatype instead of timestamp, which is smaller and therefore it should perform better:

CREATE INDEX foo_foo_date_idx ON foo
USING btree ((foo_date::date));

select count(*) from foo
where foo_date::date = CURRENT_DATE;

You could reduce that index even further by using a smallint calculated as the difference between a set date (e.g. 1/1/2010 +- 32768 days = Apr-1920 to Sep-2099):

CREATE INDEX foo_foo_date_idx ON foo
USING btree (((foo_date::date - '2010-01-01'::date)::int2));

select count(*) from foo
where (foo_date::date - '2010-01-01'::date)::int2 = ('today'-'2010-01-01'::date);

Mysql – Calculating duration between two events in a MySQL database (timeseries)

Approach 1 (works in any version)

Add a sequence number to the table (or a copy of the columns of interest).
Do a "self join" ON a.seq = b.seq-1 to match up adjacent rows.
Filter down to the status changes: WHERE a.Value != b.Value

The table will contain, for example:

  3  13:44:13  off   4  13:47:27   on
 12  14:27:27   on  13  14:34:48  off

Assuming you want the timespan between pairs of rows in this table, again add a sequence number, and do the self join to extract and subtract the next to last column.

Approach 2 (plug for upgrading)

If you have MariaDB 10.2 or MySQL 8.0, then there are "Windowing functions" that make this task immensely simpler. See LAG(); it obviates the need for the self joins and the sequencing.

In the first pass, look for LAG(Value) != Value while using ORDER BY DateTime. In the second pass, diff the LAG of the time with the current time.

Approach 3 (possibly the best for 5.5 without windowing)

Since you are stuck in the nearly-antique 5.5, Consider using a Stored Procedure, not a VIEW. It will involve two extra tables. They cannot be CREATE TEMPORARY TABLE since you can't use a temp table twice for the self-join.

So, a better approach (without windowing) would be to use @variables to simulate LAG().

Approach 4 (somewhere between 1 and 3)

This involves a single pass of looking at consecutive rows. If the first of a consecutive (based on datetime) pair of rows says ON, then calculate the "on time". Then SUM the "on times".

Details (Maybe I will give you details, but first...)

Provide sample data (not images) ready to run (CREATEs and INSERTs). Also provide expected output. (I understand ON to OFF, but what about OFF to ON; and do you want the start time of the range; etc.)

A hiccup

Since you want the "on time" to stop at the top of the hour, extra code is needed to stop the summation, etc. One artificial way to do this is to add 2 rows, one second apart at the top of each hour. The first is OFF, the second (1 second later) is ON. This will create a 1-second error in the results, but has no impact on the Approaches except for simplifying the "top of hour" requirement.

Best Answer

Related Solutions

Postgresql – Most efficient way to retrieve data by timestamps

Mysql – Calculating duration between two events in a MySQL database (timeseries)

Related Question