Sql-server – Calculate average days between orders

sql serversql server 2014t-sql

I would like to calculate the average days between several order dates from a table called Orders. For each CustomerID, what is the average days between orders. The sample table is as below (img):

CREATE TABLE #Orders(CustomerID int, OrderDate datetime);

INSERT #Orders(CustomerID, OrderDate) VALUES
    (100,'20170114'),(100,'20170123'),(100,'20170129'),
    (101,'20170202'),(101,'20170212');

I tried this query:

SELECT  CustomerID, AVG(OrderDate - PriorDate)
FROM (SELECT CustomerID, OrderDate
           , LAG(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) as PriorDate
      FROM #Orders where CustomerID = 100)

However, this yields:

Msg 102, Level 15, State 1

Incorrect syntax near ')'.

Best Answer

Several issues:

  • You are trying to take an average per customer, but you're not grouping by customer.
  • Implicit math for datetimes (OrderDate-PriorDate) is not a good idea (try that with date or datetime2) - use DATEDIFF.
  • Integer division will not give you an average like you think - try SELECT 5/2;. You need to convert at least one input to a decimal, either implicitly (*1.0) or explicitly (CONVERT(), TRY_CONVERT(), CAST(), etc.). Explicit allows you to control decimal places in some cases.
  • LAG() was not complete - while the default is 1, I think it is good to be explicit that you want the previous row.
  • The incorrect syntax error is actually coming from the end of the query - when you have something like SELECT ... FROM (<subquery>), that subquery needs to be named, so you need to use something like SELECT ... FROM (<subquery>) AS x; for example.

Try the following:

SELECT 
  CustomerID, 
  AvgLag = AVG(CONVERT(decimal(7,2), DATEDIFF(DAY, PriorDate, OrderDate)))
FROM
(
  SELECT CustomerID, OrderDate, PriorDate = LAG(OrderDate,1) 
    OVER (PARTITION BY CustomerID ORDER BY OrderDate)
  FROM #Orders
  WHERE CustomerID = 100
) AS lagged
GROUP BY CustomerID;

Results:

CustomerID    AvgLag
----------    ------
100           7.50

Of course, if you want the average for all customers, just leave out the WHERE clause. But if you really only ever want a single customer, you don't really need the customer in the output, so you can adjust the query a little bit to get rid of the GROUP BY (let's parameterize the customer ID while we're at it):

DECLARE @CustomerID int = 100;

SELECT AvgLag = AVG(Lag), CustomerID = @CustomerID -- you may not need this
FROM 
(
  SELECT Lag = CONVERT(decimal(7,2), DATEDIFF(DAY, LAG(OrderDate,1) 
    OVER (PARTITION BY CustomerID ORDER BY OrderDate), OrderDate))
  FROM #Orders
  WHERE CustomerID = @CustomerID
) AS Lagged;