Sql-server – SQL to choose a price between two points in time

running-totalssql servertime

Given a time-series of datetime price changes and another disparate time-series of N events, how do I query the last known price at the time of each event without running N queries? Essentially, I need to write some kind of CROSS-JOIN BETWEEN query that matches the most recent known price.

For example, price change history:

Changed At (time) | Price (money)
                1 |    10
                5 |    20
               10 |    30
               20 |    40

Events:

Event Time | Nearest Matched Price (from above)
         0 | n/a
         3 | 10
         6 | 20
        10 | 30
        15 | 30

The real-world use-case is that I tracked a time series of stock movements with costs and prices, but neglected to store costs alongside my invoice lines. My stock movements usually occurred right before the time of invoice.

I have a similar calendaring report query that reports sales across calendar days/weeks/months, but only works because the CROSS JOIN condition (> now && < next-day) will only ever return one row. Pardon the noisy query (will clean it up):

SELECT [t3].[FirstDateOfWeek] AS [Date], [t3].[value] AS [Total], [t3].[value2] AS [Count]
FROM (
    SELECT SUM([t0].[Total]) AS [value], COUNT(*) AS [value2], [t2].[FirstDateOfWeek]
    FROM [dbo].[vw_Invoices] AS [t0]
    LEFT OUTER JOIN [dbo].[Cases] AS [t1] ON ([t1].[CaseId]) = [t0].[CaseId]
    CROSS JOIN [dbo].[Calendar] AS [t2]
    WHERE ([t0].[Date] >= [t2].[CalendarDate]) AND ([t0].[Date] < [t2].[NextDayDateTime]) AND ([t0].[Date] >= @p0) AND ([t0].[Date] < @p1) AND (NOT ([t0].[IsVoided] = 1))
    GROUP BY [t2].[FirstDateOfWeek]
    ) AS [t3]
ORDER BY [t3].[FirstDateOfWeek]  

Is this even possible in SQL, perhaps with a "running total" query grouped by extracted hour?

Best Answer

Quick-and-dirty, not tested on a live instance.

For each row of Event data we need one row from Price - the one which happened most recently, but before the Event's timestamp. TSQL supports the top 1 .. order by notation. By embedding the Price lookup as a sub-query in a SELECT list it will be executed once per row in Event. Predicating Price on Event's values will ensure the most recent is returned. Something like this:

select
    e.event_id,
    e.event_time,
    ( select top 1     -- return one value
        p.price
      from Prices as p
      where p.price_time <= e.event_time -- ensure the price change happened at or before the event
      order by p.price_time desc  -- top ensure "top 1" picks the price with the gretest i.e. most recent, timestamp
    ) as price
from Events as e;

The sub-query will run once per row in Events, so performance may be compromised for very large sets. Make sure there are indexes on Price.price_time.