Postgresql – Join 2 tables by closest time, PostgreSQL 9.6

join;postgresqltime

I have 2 tables: tbl1, tbl2.

CREATE TABLE tbl1(time_1)
AS VALUES
  ( '2017-09-06 15:26:03'::timestamp ),
  ( '2017-09-06 15:26:02' ),
  ( '2017-09-06 15:28:01' ),
  ( '2017-09-06 15:40:00' );

CREATE TABLE tbl2(time_2)
AS VALUES
  ( '2017-09-06 15:29:01'::timestamp ),
  ( '2017-09-06 15:40:00' ),
  ( '2017-09-06 15:23:59' ),
  ( '2017-09-06 15:45:58' );

I want to join the table, thus for every row in tbl1 match closest time in tbl2. The output is :

     time_1                     time_2
---------------------      --------------------
 2017-09-06 15:26:03      2017-09-06 15:23:59      
 2017-09-06 15:26:02      2017-09-06 15:23:59 
 2017-09-06 15:28:01      2017-09-06 15:29:01
 2017-09-06 15:40:00      2017-09-06 15:45:58

I know how to find single value for closest time:

SELECT * from tbl1 where time_1=INPUT_TIME ORDER BY case when time_1 > INPUT_TIME then time_1 - INPUT_TIME else INPUT_TIME - time_1 end limit 1;

Best Answer

Using btree_gist and <->

You really want <-> which will actually work on GiST indexes through the internal tstz_dist. This works because you don't really care about the interval. First you need to add the extension,

CREATE EXTENSION btree_gist;

then

SELECT DISTINCT ON (time_1) time_1, time_2
FROM tbl1
CROSS JOIN tbl2
ORDER BY time_1, time_1 <-> time_2;

If there are too many rows to do a cross-join, you may have better luck with the lateral solution.

SELECT time_1, time_2
FROM tbl1
CROSS JOIN LATERAL (
  SELECT time_2
  FROM tbl2
  ORDER BY time_1<->time_2
  LIMIT 1
) AS tbl2;

Would be interested in seeing a report back on the benchmarks. =) Especially if you have a GIST index on time_1 and time_2.