Sql-server – Joining tables with all rows for each unique column value

join;sql server

I have 2 tables as follows:

    | A        |
    +----------+
    | ty_date  |
    | ly_date  |
    +----------+

and

    | B           |
    +-------------+
    | locationid  |
    | ty_date     |
    | value       |
    +-------------+

Let's say Table A has ty_date from 1st January till 31st December while Table B has ty_date from 1st January till, say, 1st August for each locationid.

I am trying to join the table in such a way that in Table B each locationid has all the dates from Table A with the field value having null values for all future dates.

I tried FULL OUTER JOIN ON A.ty_date = B.ty_date but I am still missing the grouping logic for each locationid

Any help on what approach I can try?

Best Answer

You can use a recursive CTE to produce the complete list of days for each locationid then join it back to table_b to get your values

Setup the tables:

CREATE TABLE dbo.table_a (
    ty_date DATE NULL,
    ly_date DATE NULL
);

CREATE TABLE dbo.table_b (
    locationid INT NULL,
    ty_date DATE NULL,
    value INT NULL
);

Add 365 entries to table_a, one for each day:

;WITH Dates AS (
    SELECT 1 AS [RowID], CAST('2019-01-01' AS DATETIME) AS ty_date
    UNION ALL
    SELECT r1.RowID + 1 AS RowID, DATEADD(DAY, 1, r1.ty_date) AS ty_date
    FROM Dates r1
    WHERE r1.RowID <= 364
)

INSERT INTO dbo.table_a(ty_date)
SELECT ty_date
FROM Dates
OPTION (MAXRECURSION 400)

Add entries in table_b for 3 unique locationid's. One entry for each day until Aug:

;WITH Dates AS (
    SELECT 1 AS [RowID], CAST('2019-01-01' AS DATETIME) AS ty_date
    UNION ALL
    SELECT r1.RowID + 1 AS RowID, DATEADD(DAY, 1, r1.ty_date) AS ty_date
    FROM Dates r1
    WHERE r1.RowID <= 242
)

INSERT INTO dbo.table_b(locationid, ty_date, value)
SELECT 1,
    ty_date, 
    RowID
FROM Dates
UNION
SELECT 2,
    ty_date, 
    RowID
FROM Dates
UNION
SELECT 3,
    ty_date, 
    RowID
FROM Dates
OPTION (MAXRECURSION 400)

Use the below query to return the full 365 days, one row per day per locationid. The Value column is only filled for values that exist in table_b, otherwise they are NULL

DECLARE @MinDate DATE,
    @MaxDate DATE

SELECT @MinDate = MIN(ty_date), @MaxDate = DATEADD(DAY, -1, MAX(ty_date)) FROM table_a

;WITH AllData AS (
    SELECT DISTINCT 1 AS RowID, locationid, @MinDate AS ty_date
    FROM table_b
    UNION ALL
    SELECT r1.RowID + 1 AS RowID, r1.locationid, DATEADD(DAY, 1, r1.ty_date) AS ty_date
    FROM AllData r1
    WHERE r1.ty_date BETWEEN @MinDate AND @MaxDate
)

SELECT ad.ty_date, ad.locationid, b.value
FROM AllData ad
LEFT JOIN table_b b ON b.locationid = ad.locationid AND b.ty_date = ad.ty_date
ORDER BY ad.ty_date, ad.locationid
OPTION (MAXRECURSION 1100)

Note that MAXRECURSION must be equal to or higher than 365 * the number of distinct locationid values in table_b minus 1

The results:

ty_date     locationid  value
-----------------------------
2019-01-01  1           1
2019-01-01  2           1
2019-01-01  3           1
2019-01-02  1           2
2019-01-02  2           2
2019-01-02  3           2
2019-01-03  1           3
2019-01-03  2           3
2019-01-03  3           3
2019-01-04  1           4
2019-01-04  2           4
2019-01-04  3           4
.....
2019-08-31  1           243
2019-08-31  2           243
2019-08-31  3           243
2019-09-01  1           NULL
2019-09-01  2           NULL
2019-09-01  3           NULL
2019-09-02  1           NULL
2019-09-02  2           NULL
2019-09-02  3           NULL
2019-09-03  1           NULL
2019-09-03  2           NULL
2019-09-03  3           NULL
.....
2019-12-28  1           NULL
2019-12-28  2           NULL
2019-12-28  3           NULL
2019-12-29  1           NULL
2019-12-29  2           NULL
2019-12-29  3           NULL
2019-12-30  1           NULL
2019-12-30  2           NULL
2019-12-30  3           NULL
2019-12-31  1           NULL
2019-12-31  2           NULL
2019-12-31  3           NULL

See here for a db<>fiddle demonstrating this.