SQL Server – Create Employee Rows for Each Date in Range

sql server

I have a list of employees.
I need to create output which is one row per date in a date range for each employee. The date range will always be the last 15 days, so it is expected to have 15 rows for each employee. A "few thousand" employees. This is SQL Server.

SAMPLE SOURCE EMP TABLE:  
EMP1  
EMP2 


RESULT:  
EMP1, JAN 1  
EMP1, JAN 2  
EMP1, JAN 3  
EMP2, JAN 1  
EMP2, JAN 2  
EMP2, JAN 3

Best Answer

Here is an example of using recursion to build a common expression date table and then cross join to the employee table.

--demo setup
declare @EmpTable table (EmpName varchar(50))
insert into @EmpTable(EmpName) values('emp1'),('emp2')

--use recursion to build a common table expression date table
;WITH dates ([Date])
AS (
    SELECT convert(DATE, '2019-01-01') AS [Date] -- Put the start date here

    UNION ALL

    SELECT dateadd(day, 1, [Date])
    FROM dates
    WHERE [Date] < '2019-01-15' -- Put the end date here 
    )
SELECT e.empname
    ,substring(DATENAME(month, ([date])), 1, 3) + ' ' + DATENAME(day, ([date])) AS [DateDescription]
FROM dates d
CROSS JOIN @EmpTable e
ORDER BY empname, [date]
OPTION (MAXRECURSION 32767) -- Don't forget to use the maxrecursion option!

| empname | DateDescription |
|---------|-----------------|
| emp1    | Jan 1           |
| emp1    | Jan 2           |
| emp1    | Jan 3           |
| emp1    | Jan 4           |
| emp1    | Jan 5           |
| emp1    | Jan 6           |
| emp1    | Jan 7           |
| emp1    | Jan 8           |
| emp1    | Jan 9           |
| emp1    | Jan 10          |
| emp1    | Jan 11          |
| emp1    | Jan 12          |
| emp1    | Jan 13          |
| emp1    | Jan 14          |
| emp1    | Jan 15          |
| emp2    | Jan 1           |
| emp2    | Jan 2           |
| emp2    | Jan 3           |
| emp2    | Jan 4           |
| emp2    | Jan 5           |
| emp2    | Jan 6           |
| emp2    | Jan 7           |
| emp2    | Jan 8           |
| emp2    | Jan 9           |
| emp2    | Jan 10          |
| emp2    | Jan 11          |
| emp2    | Jan 12          |
| emp2    | Jan 13          |
| emp2    | Jan 14          |
| emp2    | Jan 15          |

Related Solutions

Sql-server – the best way to get all data for a date range, plus the last event just before the range

I am going to assume that there isn't an index on the date columns, otherwise I think that the query would have been structured differently. If there is, you can probably find a better performing one than this.

The advantage of this query is that it can get all the data in one scan. The disadvantage is that it has to sort the data and join EventEmployee on the entire table. So as always, test with your own situation. This query also assumes that the MAX date is either unique or that equivalent rows would be acceptable.

USE AdventureWorks2012
GO
;
WITH Base AS (
   SELECT 
      TransactionHistory.*
      ,ProductVendor.BusinessEntityID
      ,MAX(CASE WHEN TransactionDate < '2008-08-01' THEN TransactionDate END) 
           OVER (PARTITION BY ProductVendor.BusinessEntityID) AS PreviousVendorTransaction
      ,COUNT(CASE WHEN TransactionDate >= '2008-08-01' THEN 1 END ) 
           OVER (PARTITION BY ProductVendor.BusinessEntityID) AS VendorAfterCutoff
   FROM
      Production.TransactionHistory
      -- Doesn't make the most sense, but I need a repeating relation
      INNER JOIN Purchasing.ProductVendor
         ON TransactionHistory.ProductID = ProductVendor.ProductID
),
Filtered AS (
   SELECT
      *
   FROM
      Base
   WHERE
      Base.TransactionDate >= '2008-08-01'
      OR (TransactionDate = PreviousVendorTransaction AND VendorAfterCutoff > 0)
)
SELECT DISTINCT
   TransactionID
   ,ProductID
   ,ReferenceOrderID
   ,ReferenceOrderLineID
   ,TransactionDate
   ,TransactionType
   ,Quantity
   ,ActualCost
   ,ModifiedDate
FROM
   Filtered

Edit:

Hmm, I think I may have to take back my comment on structuring it differently if there are indexes. The other suggestions that I have are probably fairly minor.

Make sure the query is using the indexes you're expecting it to. Start and End date to build temp table, end date to drive the previous event loop.
If the query to build the temp table is doing a lookup on the clustered index, it may be better to hold off and do that as part of the main query.
Try using a cte instead of a temp table. I think that a cte might be more competitive with the way that the query is structured below.
If you are returning a lot of events, it might be better to pull out the event table lookup to the main query to give the optimizer the option of doing a merge join.
I don't see a way of optimizing the previous event lookup short of an indexed view.

Here's a query that combines a few of those ideas.

SELECT
    e.[EventID]
INTO #EventTemp
FROM
    [Events] AS e
WHERE
    ( e.[EventStart] >= @StartDate AND e.[EventStart] <= @EndDate )
    OR ( e.[EventEnd] >= @StartDate AND e.[EventEnd] <= @EndDate )

;
WITH PrevEvent AS (
    SELECT
        EmpPrevEvent.[EventID]
    FROM
    (
        SELECT DISTINCT
            ee.[EmployeeID]
        FROM
            #EventTemp
            INNER JOIN [EventEmployee] AS ee ON
                #EventTemp.[EventID] = ee.[EventID]
    ) AS Emp
    CROSS APPLY (
        SELECT TOP 1
            e.[EventID]
        FROM
            [Events] AS e
            INNER JOIN [EventEmployee] AS ee ON
                e.[EventID] = ee.[EventID]
        WHERE
            ee.[EmployeeID] = Emp.[EmployeeID]
            AND e.[EventEnd] < @StartDate
        ORDER BY 
            e.[EventEnd] DESC
    ) AS EmpPrevEvent
)
SELECT
    e.[EventID],
    e.[EventStart],
    e.[EventEnd],
    e.[EventTypeID]
FROM
    [Events] AS e
WHERE
    e.EventID IN (
        SELECT EventID
        FROM #EventTemp
        UNION
        SELECT EventID
        FROM PrevEvent
    )

SQL Server – Calculate Days Difference Between Two Dates Per Month

Pretending this is your real table:

CREATE TABLE #Employees
(
  EmployeeID INT PRIMARY KEY,
  StartDate  SMALLDATETIME,
  EndDate    SMALLDATETIME
);

INSERT #Employees(EmployeeID, StartDate, EndDate)
  VALUES(1,'20150621','20150821');

Then this is one approach to get the results you're after (you'll need to replace #Employees with dbo.YourRealTableName of course, and this assumes you won't have anyone who has worked more months than the number of rows in your version of sys.all_columns - if you have a Numbers table that is a better approach). I added year to the output because, hopefully, you have employees that last longer than 12 months and probably some of those will last past January 1st - so being able to tell which June is which might be important.

;WITH x AS 
(
  SELECT rn1 = ROW_NUMBER() OVER (ORDER BY [object_id])-1
  FROM sys.all_columns
),
y AS 
(
  SELECT
    e.EmployeeID,
    x.rn1, 
    rn2 = ROW_NUMBER() OVER (PARTITION BY e.EmployeeID ORDER BY x.rn1 DESC), 
    e.StartDate, 
    EndDate = COALESCE(e.EndDate, DATEADD(DAY,DATEDIFF(DAY,0,GETDATE()),0)),
    sm = DATEADD(DAY, 1-DAY(StartDate), StartDate)
  FROM x INNER JOIN #Employees AS e
  ON x.rn1 <= DATEDIFF(MONTH, e.StartDate, COALESCE(e.EndDate, GETDATE()))
),
z AS
(
  SELECT 
    EmployeeID,
    s = CASE rn2 WHEN 1 THEN StartDate 
      ELSE DATEADD(MONTH, rn2-1, sm) END,
    e = CASE rn1 WHEN 0 THEN DATEADD(DAY, 1, EndDate)
      ELSE DATEADD(MONTH, rn2, sm) END
  FROM y
)
SELECT 
  EmployeeID,
  [Month] = DATENAME(MONTH, s), 
  [Year] = YEAR(s),
  WorkingDays = DATEDIFF(DAY, s, e)
FROM z
ORDER BY EmployeeID, s;
GO

This works for all employees, but you can easily add a filter inside y to limit it to a single employee or a subset. You might also want to filter for only those employees with an EndDate; it wasn't clear to me what you wanted to do with employees who still work there, so I assumed you would want to count working days up until today.

Don't forget to clean up:

DROP TABLE #Employees

Edit for updates (please stop changing the requirements):

CREATE TABLE #Employees
(
  EmployeeID INT PRIMARY KEY,
  StartDate  SMALLDATETIME,
  EndDate    SMALLDATETIME,
  Salary     INT
);

INSERT #Employees(EmployeeID, StartDate, EndDate, Salary)
  VALUES(1,'20150621','20150821',620);

Query:

;WITH x AS 
(
  SELECT rn1 = ROW_NUMBER() OVER (ORDER BY [object_id])-1
  FROM sys.all_columns
),
y AS 
(
  SELECT
    e.EmployeeID, e.Salary, x.rn1, 
    rn2 = ROW_NUMBER() OVER (PARTITION BY e.EmployeeID ORDER BY x.rn1 DESC), 
    e.StartDate, 
    EndDate = COALESCE(e.EndDate, DATEADD(DAY,DATEDIFF(DAY,0,GETDATE()),0)),
    sm = DATEADD(DAY, 1-DAY(StartDate), StartDate)
  FROM x INNER JOIN #Employees AS e
  ON x.rn1 <= DATEDIFF(MONTH, e.StartDate, COALESCE(e.EndDate, GETDATE()))
),
z AS
(
  SELECT 
    EmployeeID, Salary,
    StartDate, EndDate,
    s = CASE rn2 WHEN 1 THEN StartDate 
      ELSE DATEADD(MONTH, rn2-1, sm) END,
    e = CASE rn1 WHEN 0 THEN DATEADD(DAY, 1, EndDate)
      ELSE DATEADD(MONTH, rn2, sm) END
  FROM y
)
SELECT 
  EmployeeID, 
  [Month] = DATENAME(MONTH, s), 
  [Year] = YEAR(s),
  WorkingDays = DATEDIFF(DAY, s, e),
  Portion = Salary * 1.0 * DATEDIFF(DAY, s, e) / (1+DATEDIFF(DAY, StartDate, EndDate))
FROM z
ORDER BY EmployeeID, s;

Results I get (there are decimals because your salary won't always be nicely divisible by the number of working days):

EmployeeID  Month   Year  WorkingDays  Portion
----------  -----   ----  -----------  ----------------
1           June    2015  10           100.000000000000
1           July    2015  31           310.000000000000
1           August  2015  21           210.000000000000

Again, clean up:

DROP TABLE #Employees

Best Answer

Related Solutions

Sql-server – the best way to get all data for a date range, plus the last event just before the range

SQL Server – Calculate Days Difference Between Two Dates Per Month

Related Question