T-sql – TSQL SELECT statement to list a column calculated based on the value of a field

t-sql

I have a table named Table1, with three fields Project ID, EmployeeID, Starting Date.
Need a select statement to list all columns of the table, and an additional column to show the number of project by employee since a year. See below figure.

It should be enough to get the count(distinct EmployeeID) where StartingDate > 1/1/2016. But, I can't figure out the query. Could anyone please help?

Best Answer

If the project id is distinct this is easy to achieve with two window functions. Maybe not the best performance but just flows out of my fingers. Feel free to improve:

Setting up

CREATE TABLE #Table1
(
 ProjectID int,
 EmployeeID int,
 StartingDate DateTime
);

INSERT INTO #Table1 (ProjectID, EmployeeID, StartingDate)
VALUES (1, 1001, '1/1/2005'),
       (2, 1003, '1/1/2019'), 
       (3, 1007, '1/1/2012'),
       (4, 1001, '1/1/2017'),
       (5, 1002, '1/1/2018'), 
       (6, 1003, '1/1/2015'), 
       (7, 1001, '1/1/2016'), 
       (8, 1002, '1/1/2018'), 
       (9, 1004, '1/1/2018'), 
       (10, 10001, '1/1/2018');

The query;

WITH cte_Count
AS
(
  SELECT t.ProjectID,
         COUNT(*) OVER (PARTITION BY t.EmployeeID ORDER BY t.StartingDate) AS project_count
  FROM #Table1 t
  WHERE t.StartingDate > '1/1/2016'
)
SELECT t.ProjectID,
       t.EmployeeID,
       t.StartingDate,
       MAX(project_count) OVER (PARTITION BY t.EmployeeID) AS project_count_since_2016
FROM #Table1 t
LEFT JOIN cte_Count c ON c.ProjectID = t.ProjectID

The first part just filters the projects that are not in the range you want. The second joins this information with the source table.

Related Solutions

Sql-server – How to check for a null or empty table-valued parameter

A table can't be NULL, nor can a TVP. How do you check if a table is empty? You certainly don't say IF Sales.SalesOrderHeader IS NULL. :-)

IF EXISTS (SELECT 1 FROM @tvp)
BEGIN
  -- lots of expensive processing
END
ELSE
BEGIN
  -- a little bit of cheap processing
END
...

Sql-server – the best way to get all data for a date range, plus the last event just before the range

I am going to assume that there isn't an index on the date columns, otherwise I think that the query would have been structured differently. If there is, you can probably find a better performing one than this.

The advantage of this query is that it can get all the data in one scan. The disadvantage is that it has to sort the data and join EventEmployee on the entire table. So as always, test with your own situation. This query also assumes that the MAX date is either unique or that equivalent rows would be acceptable.

USE AdventureWorks2012
GO
;
WITH Base AS (
   SELECT 
      TransactionHistory.*
      ,ProductVendor.BusinessEntityID
      ,MAX(CASE WHEN TransactionDate < '2008-08-01' THEN TransactionDate END) 
           OVER (PARTITION BY ProductVendor.BusinessEntityID) AS PreviousVendorTransaction
      ,COUNT(CASE WHEN TransactionDate >= '2008-08-01' THEN 1 END ) 
           OVER (PARTITION BY ProductVendor.BusinessEntityID) AS VendorAfterCutoff
   FROM
      Production.TransactionHistory
      -- Doesn't make the most sense, but I need a repeating relation
      INNER JOIN Purchasing.ProductVendor
         ON TransactionHistory.ProductID = ProductVendor.ProductID
),
Filtered AS (
   SELECT
      *
   FROM
      Base
   WHERE
      Base.TransactionDate >= '2008-08-01'
      OR (TransactionDate = PreviousVendorTransaction AND VendorAfterCutoff > 0)
)
SELECT DISTINCT
   TransactionID
   ,ProductID
   ,ReferenceOrderID
   ,ReferenceOrderLineID
   ,TransactionDate
   ,TransactionType
   ,Quantity
   ,ActualCost
   ,ModifiedDate
FROM
   Filtered

Edit:

Hmm, I think I may have to take back my comment on structuring it differently if there are indexes. The other suggestions that I have are probably fairly minor.

Make sure the query is using the indexes you're expecting it to. Start and End date to build temp table, end date to drive the previous event loop.
If the query to build the temp table is doing a lookup on the clustered index, it may be better to hold off and do that as part of the main query.
Try using a cte instead of a temp table. I think that a cte might be more competitive with the way that the query is structured below.
If you are returning a lot of events, it might be better to pull out the event table lookup to the main query to give the optimizer the option of doing a merge join.
I don't see a way of optimizing the previous event lookup short of an indexed view.

Here's a query that combines a few of those ideas.

SELECT
    e.[EventID]
INTO #EventTemp
FROM
    [Events] AS e
WHERE
    ( e.[EventStart] >= @StartDate AND e.[EventStart] <= @EndDate )
    OR ( e.[EventEnd] >= @StartDate AND e.[EventEnd] <= @EndDate )

;
WITH PrevEvent AS (
    SELECT
        EmpPrevEvent.[EventID]
    FROM
    (
        SELECT DISTINCT
            ee.[EmployeeID]
        FROM
            #EventTemp
            INNER JOIN [EventEmployee] AS ee ON
                #EventTemp.[EventID] = ee.[EventID]
    ) AS Emp
    CROSS APPLY (
        SELECT TOP 1
            e.[EventID]
        FROM
            [Events] AS e
            INNER JOIN [EventEmployee] AS ee ON
                e.[EventID] = ee.[EventID]
        WHERE
            ee.[EmployeeID] = Emp.[EmployeeID]
            AND e.[EventEnd] < @StartDate
        ORDER BY 
            e.[EventEnd] DESC
    ) AS EmpPrevEvent
)
SELECT
    e.[EventID],
    e.[EventStart],
    e.[EventEnd],
    e.[EventTypeID]
FROM
    [Events] AS e
WHERE
    e.EventID IN (
        SELECT EventID
        FROM #EventTemp
        UNION
        SELECT EventID
        FROM PrevEvent
    )

Best Answer

Related Solutions

Sql-server – How to check for a null or empty table-valued parameter

Sql-server – the best way to get all data for a date range, plus the last event just before the range

Related Question