Given the following data:
create table #histories
(
username varchar(10),
account varchar(10),
assigned date
);
insert into #histories
values
('PHIL','ACCOUNT1','2017-01-04'),
('PETER','ACCOUNT1','2017-01-15'),
('DAVE','ACCOUNT1','2017-03-04'),
('ANDY','ACCOUNT1','2017-05-06'),
('DAVE','ACCOUNT1','2017-05-07'),
('FRED','ACCOUNT1','2017-05-08'),
('JAMES','ACCOUNT1','2017-08-05'),
('DAVE','ACCOUNT2','2017-01-02'),
('PHIL','ACCOUNT2','2017-01-18'),
('JOSH','ACCOUNT2','2017-04-08'),
('JAMES','ACCOUNT2','2017-04-09'),
('DAVE','ACCOUNT2','2017-05-06'),
('PHIL','ACCOUNT2','2017-05-07') ;
… which represents when a given user was assigned to an account.
I am looking to establish who owned a given account on the last day of each month (the assigned date is the date that the account transferred ownership), with any missing month-ends populated (possibly created from a handy dates
table that I have available, with useful columns DateKey
, Date
and LastDayOfMonth
, [courtesy of @AaronBertrand])1.
The desired results would be:
PETER, ACCOUNT1, 2017-01-31
PETER, ACCOUNT1, 2017-02-28
DAVE, ACCOUNT1, 2017-03-31
DAVE, ACCOUNT1, 2017-04-30
FRED, ACCOUNT1, 2017-05-31
FRED, ACCOUNT1, 2017-06-30
FRED, ACCOUNT1, 2017-07-31
JAMES, ACCOUNT1, 2017-08-31
PHIL, ACCOUNT2, 2017-01-31
PHIL, ACCOUNT2, 2017-02-28
PHIL, ACCOUNT2, 2017-03-31
JAMES, ACCOUNT2, 2017-04-30
PHIL, ACCOUNT2, 2017-05-31
Doing the initial part of this with a windowing function is trivial, it's adding the "missing" rows that I'm struggling with.
Best Answer
One approach to this problem is to do the following:
LEAD
on SQL Server 2008. You can useAPPLY
or a suquery for this.I modified your test data a little bit to make the results deterministic. Also added an index:
Here's the laziest date dimension table of all time:
For step 1, there are plenty of ways to emulate
LEAD
. Here's one method:For step 2, we need to change the NULL values to something else. You want to include the final month for each account, so adding one month to the starting date suffices:
For step 3, we can join to the date dimension table. The column from the dimension table is exactly the column you need for the result set:
I didn't like the query that I got when I put it all together. There can be issues with join order when combining
OUTER APPLY
andINNER JOIN
. To get the join order I wanted I rewrote it with a subquery:I don't know how much data you have so it might not matter for you. But the plan looks how I want it to:
The results match yours: