Separate the data from the processing. A database contains data; application code does the processing.
In your situation there are effectively two sets of data:
- Raw data (especially for recurring events, such as "4pm every 3rd Tuesday of the month until next June")
- Calendar events, such as the dozens of entries that the above item implies.
Consider starting with two tables, one for each of those. Then write the messy code that translates the definition of a recurring event into the individual entries. (Note: That is code, not SQL.)
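As a rough illustration of that translation step, here is a minimal Python sketch that expands a "3rd Tuesday of every month at 4pm" rule into individual entries. The function names, the parameters, and the 2024 date range are all made up for the example; a real recurrence engine would need to handle many more rule shapes.

```python
from datetime import date, datetime, time, timedelta

def nth_weekday_of_month(year, month, weekday, n):
    """Return the n-th given weekday (Monday=0) of a month.

    Assumes n is between 1 and 4, so the result always stays inside the month.
    """
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first such weekday
    return first + timedelta(days=offset + 7 * (n - 1))

def expand_monthly_rule(start, until, weekday, n, at):
    """List every occurrence from start through until (both inclusive)."""
    occurrences = []
    year, month = start.year, start.month
    while (year, month) <= (until.year, until.month):
        day = nth_weekday_of_month(year, month, weekday, n)
        if start <= day <= until:
            occurrences.append(datetime.combine(day, at))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return occurrences

# Hypothetical rule: 4pm every 3rd Tuesday, January through June 2024.
events = expand_monthly_rule(date(2024, 1, 1), date(2024, 6, 30),
                             weekday=1, n=3, at=time(16, 0))
for event in events:
    print(event)
```

Each returned datetime would become one row in the second table, tagged with the id of the recurring definition it came from.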
Obviously, if a recurring event can be changed, there needs to be a way to make potentially massive changes to the second table. This implies an id for the recurring event that is stored in each row of the second table. A FOREIGN KEY might be the way to go. (I prefer to enforce that relationship myself, in application code.)
There are probably other tables, such as for 'normalizing' users, etc. (Do not normalize dates or datetimes.)
"What's scheduled for this afternoon?" That turns into a range query.
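To make the range query concrete, here is a small sketch using Python's built-in sqlite3 module. The table name `calendar_event`, its columns, and the sample rows are assumptions for illustration, not your actual schema, and the SQL would need minor adjustment for another database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE calendar_event (
        id           INTEGER PRIMARY KEY,
        recurring_id INTEGER,        -- id of the recurring definition; NULL for one-offs
        title        TEXT NOT NULL,
        start_dt     TEXT NOT NULL,  -- ISO 8601 text, so string order matches time order
        end_dt       TEXT NOT NULL
    )
""")
# An index on the start time lets the range query avoid a full table scan.
conn.execute("CREATE INDEX idx_event_start ON calendar_event (start_dt)")
conn.executemany(
    "INSERT INTO calendar_event (recurring_id, title, start_dt, end_dt) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "Board meeting", "2024-01-16 16:00:00", "2024-01-16 17:00:00"),
        (None, "Dentist", "2024-01-16 09:30:00", "2024-01-16 10:00:00"),
        (1, "Board meeting", "2024-02-20 16:00:00", "2024-02-20 17:00:00"),
    ],
)

def events_between(conn, range_start, range_end):
    """Titles of events starting in the half-open range [range_start, range_end)."""
    rows = conn.execute(
        "SELECT title FROM calendar_event "
        "WHERE start_dt >= ? AND start_dt < ? ORDER BY start_dt",
        (range_start, range_end),
    )
    return [row[0] for row in rows]

afternoon = events_between(conn, "2024-01-16 12:00:00", "2024-01-16 18:00:00")
print(afternoon)
```

The half-open range (start inclusive, end exclusive) is a deliberate choice here: adjacent ranges such as morning and afternoon never count the same event twice.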
Hint: If something is ongoing (e.g., vacationing for 2 weeks), break it into multiple daily events (14 of them, each midnight to midnight) for the second table. Otherwise the SQL can get messy and inefficient.
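One possible way to do that split in application code is sketched below. The function name and the vacation dates are hypothetical.

```python
from datetime import date, datetime, time, timedelta

def split_into_daily(first_day, num_days):
    """Return (start, end) pairs, one per day, each running midnight to midnight."""
    pieces = []
    for i in range(num_days):
        start = datetime.combine(first_day + timedelta(days=i), time.min)
        pieces.append((start, start + timedelta(days=1)))
    return pieces

# Hypothetical 2-week vacation starting 1 July 2024.
vacation = split_into_daily(date(2024, 7, 1), 14)
for start, end in vacation:
    print(start, "to", end)
```

With the event stored as 14 single-day rows, "what is happening on day X" stays a simple range test on the start time instead of an overlap test.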
How does all that feel?
I think your best bet in this case, since you are on SQL Server 2012, is to use window functions, specifically the LEAD function.
Here is an example to try and match your data.
Say I create a table with the columns you have here and then insert rows into it, like this:
CREATE TABLE #AnimalBehavior
(
ID int identity(1,1) primary key clustered
,AnimalID int
,DateID int
,Behavior1 int
,Behavior2 int
,Behavior3 int
);
GO
insert into #AnimalBehavior (AnimalID, DateID, Behavior1, Behavior2, Behavior3)
Values (1, 20150101, 0, 1, 1)
,(1, 20150201, 0, 1, 1)
,(1, 20150301, 0, 1, 1)
,(1, 20150401, 0, 1, 1)
,(2, 20150101, 0, 1, 1)
,(2, 20150301, 0, 1, 1)
,(2, 20150501, 0, 1, 1);
GO
From here I want to look only at the rows where Behavior3 is greater than 0 (so that we know it occurred during that period). For each animal, I then take the number of days between the first occurrence and the second (the LEAD by 1 row), and between the second and the third (the LEAD by 2 rows). You can do that in this way:
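-- Keep only rows where Behavior3 occurred, then pair each row with the next two dates.
WITH ab AS
(
SELECT AnimalID
, DateID
, LEAD(DateID, 1) OVER (PARTITION BY AnimalID ORDER BY DateID) AS DateID_1
, LEAD(DateID, 2) OVER (PARTITION BY AnimalID ORDER BY DateID) AS DateID_2
, ROW_NUMBER() OVER (PARTITION BY AnimalID ORDER BY DateID) AS rown
FROM #AnimalBehavior
WHERE Behavior3 > 0
)
-- LEAD without a default returns NULL when there is no later row, so a missing
-- 2nd or 3rd occurrence yields a NULL gap (a default of 0 would not convert to a date).
SELECT AnimalID
, CONVERT(date, CONVERT(varchar(8), DateID)) AS FirstDate
, DATEDIFF(dd, CONVERT(date, CONVERT(varchar(8), DateID)), CONVERT(date, CONVERT(varchar(8), DateID_1))) AS DaysFirstToSecond
, DATEDIFF(dd, CONVERT(date, CONVERT(varchar(8), DateID_1)), CONVERT(date, CONVERT(varchar(8), DateID_2))) AS DaysSecondToThird
FROM ab
WHERE rown = 1;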
I believe this covers what you are looking for. But feel free to ask questions if not.
SUGGESTED QUERY
SAMPLE DATA
Data would look like this
SUGGESTED QUERY EXECUTED
Why is the query designed this way? Look at the first subquery. I performed a LEFT JOIN of this to the counts for a reason. Notice that the dataset has nothing in the 50 - 59 range. Its count would not show up in the second query. I am sure you wanted the range 50 - 59 to show up, so the query is designed to catch all ranges. Any missing range is essentially defaulted to zero. GIVE IT A TRY !!!
CAVEAT: Unfortunately, the first subquery requires you to hardcode the ranges. It will be the only hardcoding required to make it work.
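To illustrate the general pattern (a hardcoded list of ranges LEFT JOINed to the grouped counts), here is a small self-contained sketch using Python's sqlite3 module. The table `scores`, its column, the sample values, and the bucket boundaries are all invented for the example; this is not the query from the answer above, only one way the same idea could look.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score INTEGER NOT NULL)")  # hypothetical table
conn.executemany(
    "INSERT INTO scores (score) VALUES (?)",
    [(42,), (47,), (63,), (68,), (69,), (71,)],  # note: nothing in 50 - 59
)

query = """
SELECT r.label, COALESCE(c.cnt, 0) AS cnt
FROM (
    -- First subquery: the hardcoded list of every range that must appear.
    SELECT 40 AS lo, '40 - 49' AS label
    UNION ALL SELECT 50, '50 - 59'
    UNION ALL SELECT 60, '60 - 69'
    UNION ALL SELECT 70, '70 - 79'
) AS r
LEFT JOIN (
    -- Second subquery: counts only for ranges that actually contain rows.
    SELECT (score / 10) * 10 AS lo, COUNT(*) AS cnt
    FROM scores
    GROUP BY (score / 10) * 10
) AS c ON c.lo = r.lo
ORDER BY r.lo
"""
result = conn.execute(query).fetchall()
for label, cnt in result:
    print(label, cnt)
```

The LEFT JOIN keeps every hardcoded range, and COALESCE turns the missing count for 50 - 59 into zero. An inner join would silently drop that range.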