SQL Server – SELECT TOP 1 DATEDIFF for each hour group

greatest-n-per-group, sql-server

Does anybody know how I could select only the first entry for each hour of each day from the example below?

Table

Created ¦ Updated ¦ DATEDIFF(minute, Created, Updated)
2016-11-11 18:00:49.590 ¦ 2016-11-11 22:57:35.153 ¦ 297
2016-11-11 18:00:49.590 ¦ 2016-11-11 22:57:34.897 ¦ 297
2016-11-11 18:01:49.433 ¦ 2016-11-11 22:57:34.467 ¦ 297
2016-11-11 18:01:49.433 ¦ 2016-11-11 22:57:33.397 ¦ 297
2016-11-11 18:02:49.373 ¦ 2016-11-11 22:57:33.127 ¦ 297
2016-11-11 18:02:49.373 ¦ 2016-11-11 22:57:33.457 ¦ 297
...
2016-11-11 17:00:10.833 ¦ 2016-11-11 20:35:31.020 ¦ 275
2016-11-11 17:00:10.833 ¦ 2016-11-11 20:35:31.210 ¦ 275
2016-11-11 17:01:10.723 ¦ 2016-11-11 20:35:30.807 ¦ 275
2016-11-11 17:01:10.723 ¦ 2016-11-11 20:35:30.527 ¦ 275
2016-11-11 17:02:10.180 ¦ 2016-11-11 20:35:30.293 ¦ 275
2016-11-11 17:02:10.180 ¦ 2016-11-11 20:35:30.423 ¦ 275
...
2016-11-11 16:00:10.473 ¦ 2016-11-11 20:17:31.020 ¦ 257
2016-11-11 16:00:10.473 ¦ 2016-11-11 20:17:31.210 ¦ 257
2016-11-11 16:01:10.754 ¦ 2016-11-11 20:17:30.807 ¦ 257
2016-11-11 16:01:10.754 ¦ 2016-11-11 20:17:30.527 ¦ 257
2016-11-11 16:02:10.342 ¦ 2016-11-11 20:17:30.293 ¦ 257
2016-11-11 16:02:10.342 ¦ 2016-11-11 20:17:30.423 ¦ 257

Data expected

2016-11-11 18:00 ¦ 297
2016-11-11 17:00 ¦ 275
2016-11-11 16:00 ¦ 257

Best Answer

Assuming this is SQL Server, you can truncate datetime values to hours using this technique:

DATEADD(HOUR, DATEDIFF(HOUR, 0, @YourDateTimeValue), 0)
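As a quick illustration of why this works, here is a minimal sketch using one Created value from the question. The integer 0 is implicitly converted to the datetime base date 1900-01-01 00:00, so DATEDIFF counts the whole hour boundaries since then and DATEADD adds that many hours back to the base date, dropping minutes, seconds and milliseconds. The column aliases are illustrative only.

```sql
-- Sample value taken from the question's data
DECLARE @YourDateTimeValue datetime = '2016-11-11T18:01:49.433';

SELECT
  -- number of hour boundaries crossed since 1900-01-01 00:00
  HoursSinceBase  = DATEDIFF(HOUR, 0, @YourDateTimeValue),
  -- base date plus that many whole hours = start of the hour
  TruncatedToHour = DATEADD(HOUR, DATEDIFF(HOUR, 0, @YourDateTimeValue), 0);
-- TruncatedToHour: 2016-11-11 18:00:00.000
```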

Using that expression as the grouping criterion, you can then use the FIRST_VALUE analytic function (assuming you are on SQL Server 2012 or later) to get the first DATEDIFF(MINUTE, Created, Updated) per group:

SELECT DISTINCT
  x.CreatedHour,
  -- DATEDIFF of the earliest-created row within each hour
  DiffMinutes = FIRST_VALUE(DATEDIFF(MINUTE, t.Created, t.Updated))
                OVER (PARTITION BY x.CreatedHour
                      ORDER BY t.Created ASC
                      ROWS UNBOUNDED PRECEDING)
FROM
  dbo.YourTable AS t
  CROSS APPLY
  (
    -- truncate Created to the start of its hour
    SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Created), 0)
  ) AS x (CreatedHour)
;

Because FIRST_VALUE is a window function, it returns a result for every row of your table, so all entries created in the same hour get identical values, which produces a lot of duplicates. The query above uses DISTINCT to suppress those duplicates.
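If FIRST_VALUE is not available, or you would rather not rely on DISTINCT, one possible alternative is to number the rows within each hour with ROW_NUMBER and keep only the first one. The sketch below is not part of the original answer. It uses a hypothetical table variable @YourTable, filled with four rows from the question, as a stand-in for dbo.YourTable, and it adds Updated as a tie-breaker because the sample data contains duplicate Created values.

```sql
-- Hypothetical stand-in for dbo.YourTable, filled with a few rows from the question
-- (multi-row VALUES needs SQL Server 2008 or later)
DECLARE @YourTable TABLE (Created datetime, Updated datetime);

INSERT INTO @YourTable (Created, Updated) VALUES
  ('2016-11-11T18:00:49.590', '2016-11-11T22:57:35.153'),
  ('2016-11-11T18:01:49.433', '2016-11-11T22:57:34.467'),
  ('2016-11-11T16:00:10.473', '2016-11-11T20:17:31.020'),
  ('2016-11-11T16:01:10.754', '2016-11-11T20:17:30.807');

WITH ranked AS
(
  SELECT
    x.CreatedHour,
    DiffMinutes = DATEDIFF(MINUTE, t.Created, t.Updated),
    -- 1 = earliest row in the hour; Updated breaks ties on Created
    rn = ROW_NUMBER() OVER (PARTITION BY x.CreatedHour
                            ORDER BY t.Created ASC, t.Updated ASC)
  FROM
    @YourTable AS t
    CROSS APPLY
    (
      -- truncate Created to the start of its hour
      SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Created), 0)
    ) AS x (CreatedHour)
)
SELECT CreatedHour, DiffMinutes
FROM ranked
WHERE rn = 1              -- one row per hour, so no DISTINCT is needed
ORDER BY CreatedHour DESC;
-- 2016-11-11 18:00:00.000 | 297
-- 2016-11-11 16:00:00.000 | 257
```

Because the filter keeps exactly one row per hour, this form also makes it easy to return other columns from that same row, which FIRST_VALUE with DISTINCT would need one extra window expression per column to do.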