I would use a Calendar
table. This table simply has a list of dates for several decades.
CREATE TABLE [dbo].[Calendar](
[dt] [date] NOT NULL,
CONSTRAINT [PK_Calendar] PRIMARY KEY CLUSTERED
(
[dt] ASC
))
In my system it has few extra columns, such as [IsLastDayOfMonth]
, [IsLastDayOfQuarter]
, which are useful in some reports, but in your case you need just the date column. There are many ways to populate such table.
For example, 100K rows (~270 years) from 1900-01-01:
INSERT INTO dbo.Calendar (dt)
SELECT TOP (100000)
DATEADD(day, ROW_NUMBER() OVER (ORDER BY s1.[object_id])-1, '19000101') AS dt
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
Sample data
DECLARE @Stocks TABLE (
StockId int
, ProductId int
, AvailableFromDate date);
INSERT INTO @Stocks(StockId, ProductId, AvailableFromDate) VALUES
(1, 1, '2016-01-01'),
(2, 1, '2016-01-01');
DECLARE @RentStockOrders TABLE (
RentStockOrderId int
, StockId int
, BeginRentDate date
, EndRentDate date);
INSERT INTO @RentStockOrders (RentStockOrderId, StockId, BeginRentDate, EndRentDate) VALUES
(1, 1, '2016-01-15', '2016-02-14'),
(2, 2, '2016-01-30', '2016-02-20'),
(3, 2, '2016-02-26', '2016-03-07'),
(4, 1, '2016-02-29', '2016-03-14');
Parameters
DECLARE @ParamProductID int = 1;
DECLARE @ParamDays int = 14;
DECLARE @ParamStartDate date = '2015-12-01';
DECLARE @ParamEndDate date = '2017-01-01';
-- these dates define some reasonable limit
First variant
It turns out that window functions accept only literal constant as a size of a window, not a variable. Alas. Still, I'll show this query, because it illustrates the approach and shows how simple it would be if SQL Server supported variables as the size of the window. It also gives the correct answer that we can use to verify the second variant.
WITH
CTE_AllDays
-- list all days and stock IDs between @ParamStartDate and @ParamEndDate
-- for each day and stock indicate whether it is available based on AvailableFromDate
AS
(
SELECT
S.StockId
,dbo.Calendar.dt
,CASE WHEN dbo.Calendar.dt >= S.AvailableFromDate
THEN 1 ELSE 0 END AS AvailableStockDay
-- 1 - available
-- 0 - not available
FROM
@Stocks AS S
INNER JOIN dbo.Calendar ON
dbo.Calendar.dt >= @ParamStartDate
AND dbo.Calendar.dt <= @ParamEndDate
WHERE
S.ProductId = @ParamProductID
)
,CTE_BookedDays
-- list all booked (unavailable) days for each stock ID
AS
(
SELECT
S.StockId
,CA.dt
,0 AS AvailableStockDay
-- 0 - not available
FROM
@RentStockOrders AS R
INNER JOIN @Stocks AS S ON S.StockId = R.StockId
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >= @ParamStartDate
AND dbo.Calendar.dt <= @ParamEndDate
AND dbo.Calendar.dt >= R.BeginRentDate
AND dbo.Calendar.dt <= R.EndRentDate
) AS CA
WHERE
S.ProductId = @ParamProductID
)
,CTE_Daily
-- combine individual availability flags
-- first: multiply to get the final availability for a stock ID and day
-- second: group further by day and SUM flags
AS
(
SELECT
CTE_AllDays.dt
,CASE WHEN
SUM(
CTE_AllDays.AvailableStockDay * ISNULL(CTE_BookedDays.AvailableStockDay, 1)
) = 0
THEN 0 ELSE 1 END AS AvailableDay
-- day is available, if any stock is available
-- SUM=0 - not available
-- SUM>0 - available
FROM
CTE_AllDays
LEFT JOIN CTE_BookedDays ON
CTE_BookedDays.StockId = CTE_AllDays.StockId AND
CTE_BookedDays.dt = CTE_AllDays.dt
GROUP BY
CTE_AllDays.dt
)
,CTE_Sum
-- rolling sum of flags with 14 days window
AS
(
SELECT
dt
,SUM(CTE_Daily.AvailableDay) OVER (ORDER BY CTE_Daily.dt
ROWS BETWEEN CURRENT ROW AND 13 FOLLOWING) AS AvailableConsecutive
-- we can't put @ParamDays here instead of constant
FROM CTE_Daily
)
-- If a rolling sum = 14,
-- it means that all 14 consecutive days are available
-- If a rolling sum <> 14,
-- it means that at least one of the 14 consecutive days is not available
SELECT dt
FROM CTE_Sum
WHERE AvailableConsecutive <> 14
-- we can put @ParamDays here instead of 14, but not above
ORDER BY dt;
Result
Note, I started the range of dates from 2015-12-01
, before the general availability.
+------------+
| dt |
+------------+
| 2015-12-01 |
| 2015-12-02 |
| 2015-12-03 |
| 2015-12-04 |
| 2015-12-05 |
| 2015-12-06 |
| 2015-12-07 |
| 2015-12-08 |
| 2015-12-09 |
| 2015-12-10 |
| 2015-12-11 |
| 2015-12-12 |
| 2015-12-13 |
| 2015-12-14 |
| 2015-12-15 |
| 2015-12-16 |
| 2015-12-17 |
| 2015-12-18 |
| 2015-12-19 |
| 2015-12-20 |
| 2015-12-21 |
| 2015-12-22 |
| 2015-12-23 |
| 2015-12-24 |
| 2015-12-25 |
| 2015-12-26 |
| 2015-12-27 |
| 2015-12-28 |
| 2015-12-29 |
| 2015-12-30 |
| 2015-12-31 |
| 2016-01-17 |
| 2016-01-18 |
| 2016-01-19 |
| 2016-01-20 |
| 2016-01-21 |
| 2016-01-22 |
| 2016-01-23 |
| 2016-01-24 |
| 2016-01-25 |
| 2016-01-26 |
| 2016-01-27 |
| 2016-01-28 |
| 2016-01-29 |
| 2016-01-30 |
| 2016-01-31 |
| 2016-02-01 |
| 2016-02-02 |
| 2016-02-03 |
| 2016-02-04 |
| 2016-02-05 |
| 2016-02-06 |
| 2016-02-07 |
| 2016-02-08 |
| 2016-02-09 |
| 2016-02-10 |
| 2016-02-11 |
| 2016-02-12 |
| 2016-02-13 |
| 2016-02-14 |
| 2016-02-16 |
| 2016-02-17 |
| 2016-02-18 |
| 2016-02-19 |
| 2016-02-20 |
| 2016-02-21 |
| 2016-02-22 |
| 2016-02-23 |
| 2016-02-24 |
| 2016-02-25 |
| 2016-02-26 |
| 2016-02-27 |
| 2016-02-28 |
| 2016-02-29 |
| 2016-03-01 |
| 2016-03-02 |
| 2016-03-03 |
| 2016-03-04 |
| 2016-03-05 |
| 2016-03-06 |
| 2016-03-07 |
| 2016-12-20 |
| 2016-12-21 |
| 2016-12-22 |
| 2016-12-23 |
| 2016-12-24 |
| 2016-12-25 |
| 2016-12-26 |
| 2016-12-27 |
| 2016-12-28 |
| 2016-12-29 |
| 2016-12-30 |
| 2016-12-31 |
| 2017-01-01 |
+------------+
Second variant
The first part of the query until CTE_Daily
is the same. Then, I'll use a gaps-and-islands approach to find the islands of available dates and calculate their sizes.
WITH
CTE_AllDays
-- list all days and stock IDs between @ParamStartDate and @ParamEndDate
-- for each day and stock indicate whether it is available based on AvailableFromDate
AS
(
SELECT
S.StockId
,dbo.Calendar.dt
,CASE WHEN dbo.Calendar.dt >= S.AvailableFromDate
THEN 1 ELSE 0 END AS AvailableStockDay
-- 1 - available
-- 0 - not available
FROM
@Stocks AS S
INNER JOIN dbo.Calendar ON
dbo.Calendar.dt >= @ParamStartDate
AND dbo.Calendar.dt <= @ParamEndDate
WHERE
S.ProductId = @ParamProductID
)
,CTE_BookedDays
-- list all booked (unavailable) days for each stock ID
AS
(
SELECT
S.StockId
,CA.dt
,0 AS AvailableStockDay
-- 0 - not available
FROM
@RentStockOrders AS R
INNER JOIN @Stocks AS S ON S.StockId = R.StockId
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >= @ParamStartDate
AND dbo.Calendar.dt <= @ParamEndDate
AND dbo.Calendar.dt >= R.BeginRentDate
AND dbo.Calendar.dt <= R.EndRentDate
) AS CA
WHERE
S.ProductId = @ParamProductID
)
,CTE_Daily
-- combine individual availability flags
-- first: multiply to get the final availability for a stock ID and day
-- second: group further by day and SUM flags
AS
(
SELECT
CTE_AllDays.dt
,CASE WHEN
SUM(
CTE_AllDays.AvailableStockDay * ISNULL(CTE_BookedDays.AvailableStockDay, 1)
) = 0
THEN 0 ELSE 1 END AS AvailableDay
-- day is available, if any stock is available
-- SUM=0 - not available
-- SUM>0 - available
FROM
CTE_AllDays
LEFT JOIN CTE_BookedDays ON
CTE_BookedDays.StockId = CTE_AllDays.StockId AND
CTE_BookedDays.dt = CTE_AllDays.dt
GROUP BY
CTE_AllDays.dt
)
,CTE_RowNumbers
-- calculate two sets of row numbers to isolate consecutive rows with 0s and 1s
-- (gaps and islands)
AS
(
SELECT
dt
,AvailableDay
,ROW_NUMBER() OVER (ORDER BY dt) AS rn1
,ROW_NUMBER() OVER (PARTITION BY AvailableDay ORDER BY dt) AS rn2
FROM CTE_Daily
)
,CTE_Groups
-- each gaps and island will have the same GroupNumber
-- count the size of the Group
-- number the rows within each Group to find if @ParamDays rows fit into the Group
AS
(
SELECT
dt
,AvailableDay
,GroupNumber
,COUNT(*) OVER (PARTITION BY GroupNumber) AS GroupSize
,ROW_NUMBER() OVER (PARTITION BY GroupNumber ORDER BY dt) AS GroupRN
FROM
CTE_RowNumbers
CROSS APPLY (SELECT rn1 - rn2 AS GroupNumber) AS CA
)
SELECT
dt
--,CASE WHEN AvailableDay = 1 AND GroupSize - GroupRN + 1 >= @ParamDays
--THEN 1 ELSE 0 END AS AvailableConsecutive
FROM CTE_Groups
WHERE
CASE WHEN AvailableDay = 1 AND GroupSize - GroupRN + 1 >= @ParamDays
THEN 1 ELSE 0 END = 0
-- AvailableConsecutive = 0 to list all unavailable days
ORDER BY dt;
The result is the same as in the first variant, but this variant uses parameter @ParamDays
.
To understand how it works, run the query starting with the first CTE, examine the results, then add the next CTE, examine the results and so on.
The date
type can't store partial dates, like only year, or only day and month.
In one project I had a need to store incomplete dates and I used three separate nullable columns for day (tinyint)
, month (tinyint)
and year (smallint)
. I don't see in what cases you'd need to use varchar
instead of integer types.
In my case it was information about customer birth dates and quite often salespeople didn't know the year, but knew month or day. With this approach it is very flexible and allows the user to enter any bit of information he has. It was also important to know that certain part of the date is missing, such as "we don't know the exact birth date, but we know that it is in June".
Obviously, there were a bunch of custom checks in the application (not at the database level) trying to prevent entering things like 31st of June. In that particular project it was OK to perform the checks in the application. Most likely it would have been possible to implement them in T-SQL as well. In general, it is better to have constraints implemented at the database level, because applications come and go, but data remains.
It was more difficult to use these incomplete dates in queries as well. Different queries/reports required different approaches to dealing with the missing bits.
30K rows is not much at all and performance should not be a problem. So, I'd focus on smart parsing and validation of your Excel data and storing all information that you can extract.
Best Answer
Check out my answer to the following question on StackOverflow
https://stackoverflow.com/questions/3093924/select-all-months-within-given-date-span-including-the-ones-with-0-values/3111441#3111441
You could easily apply this method to your problem, and return data for all the days between a range, including those with no data content.
Dave