T-sql – Issue with SUM OVER PARTITION BY in conjunction with GROUP BY

cross-applygroup bypivott-sql

I am getting a typical group by error that one would see when a field is not in a group by or an aggregate in the select.

Column 'v.EDTAmount' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Below is some sample code and data. I am trying to stay away from only using group by because I have ~30 columns that needed to be included, and having 25 columns in a group by seems…unfortunate.

In the INSERT INTO data you can see that Payroll 0048 has two records, and the MULT_CHECK column has a value of 02, essentially meaning the person was paid in two different checks. Conceptually, I need those two records summed up into one for the end result.

However, things are complicated by the fact that I am using a cross apply to fill a business requirement that regular/over-time/double-time are expressed in new records instead of columns.

The first select statement is without the SUM and GROUP BY functions to show what the data kind of looks like, but without the desired grouping and sum. The second select statement attempts to get the desired sum and group by, but gives the error.

I am just having a heck of a time finding my problem. Could be an oversight or just a brain fart using CROSS APPLY wrong or something. I am also open using another method if there is a better way.

As a side note, the MAX(v.EDT) is just a hack so I do not have to group by that also.

CREATE TABLE #CHECKS(
    [EMP] [varchar](10) NULL,
    [PAYROLL] [varchar](6) NULL,
    [MULT_CHECK] [varchar](2) NULL,
    [REGULAR_GROSS] [money] NULL,
    [REGULAR_HOURS] [money] NULL,
    [Over_Time_GROSS] [money] NULL,
    [Over_Time_HOURS] [money] NULL,
    [Double_Time_GROSS] [money] NULL,
    [Double_Time_HOURS] [money] NULL
) 

GO

INSERT INTO #CHECKS
VALUES ('000102', '0042', '00', 2960.00, 173.33, 500.00, 10.00, 100.00, 1.00)
      ,('000102', '0044', '00', 2960.00, 173.33, 500.00, 10.00, 100.00, 1.00)
      ,('000102', '0048', '00', 2500.00, 140.00, 500.00, 10.00, 100.00, 1.00)
      ,('000102', '0048', '02',  460.00,  33.33,      0,     0,      0,    0)

GO



/**** Without the group by and sum, you can see that I have an extra 3 rows for payroll 0048 due to the ****/
/**** Multiple Check column. The 6 rows for payroll 0048 need to be grouped by employee ID, payroll, and EDTName ***/
SELECT     [Employee Id]
          ,[Payroll Name]
          ,v.EDT                                AS [E/D/T]                              
          ,v.EDTName                            AS [E/D/T Name]
          ,v.EDTAmount                          AS [E/D/T Amount]
          ,v.EDTHours                           AS [E Hours]
    FROM 
    (
        SELECT CK.EMP                           AS [Employee Id]
              ,CK.PAYROLL                       AS [Payroll Name]
              ,CK.REGULAR_GROSS
              ,CK.REGULAR_HOURS
              ,CK.Over_Time_GROSS
              ,CK.Over_Time_HOURS
              ,CK.Double_Time_GROSS
              ,CK.Double_Time_HOURS

        FROM #CHECKS CK
    )
        AS DerivedTable
    CROSS APPLY (
        VALUES 
              ('E', 'Regular Gross', regular_gross, REGULAR_HOURS)
             ,('E', 'Overtime Gross', OVER_TIME_GROSS, OVER_TIME_HOURS)
             ,('E', 'Doubletime_Gross', DOUBLE_TIME_GROSS, DOUBLE_TIME_HOURS)

    ) v (EDT, EDTName, EDTAmount, EDTHours)



/**** Stripped down version of my code that produces the error ********/
SELECT     [Employee Id]
          ,[Payroll Name]
          ,MAX(v.EDT)                                   AS [E/D/T]          --If deduction, then D, if earnings, then E, etc...                         
          ,v.EDTName                                    AS [E/D/T Name]
          ,SUM(v.EDTAmount) OVER (PARTITION BY [Employee Id], [payroll name], v.EDTName)    AS [E/D/T Amount]
          ,SUM(v.EDTHours)  OVER (PARTITION BY [Employee Id], [payroll name], v.EDTName)    AS [E Hours]
    FROM 
    (
        SELECT CK.EMP                                   AS [Employee Id]
              ,CK.PAYROLL                               AS [Payroll Name]
              ,CK.REGULAR_GROSS
              ,CK.REGULAR_HOURS
              ,CK.Over_Time_GROSS
              ,CK.Over_Time_HOURS
              ,CK.Double_Time_GROSS
              ,CK.Double_Time_HOURS

        FROM #CHECKS CK
    )
        AS DerivedTable
    CROSS APPLY (
        VALUES 
              ('E', 'Regular Gross', regular_gross, REGULAR_HOURS)
             ,('E', 'Overtime Gross', OVER_TIME_GROSS, OVER_TIME_HOURS)
             ,('E', 'Doubletime_Gross', DOUBLE_TIME_GROSS, DOUBLE_TIME_HOURS)

    ) v (EDT, EDTName, EDTAmount, EDTHours)

    GROUP BY [Employee Id], [Payroll Name], EDTName


    DROP TABLE #CHECKS

Best Answer

I think the group by should be applied first and only then you unpivot your results. The CROSS APPLY to unpivot multiple columns like this it's OK. I use the exact same technique without issues. You mention that you have more columns not shown here, but I think that probably they have the same values so you could include them from the beginning using MAX/MIN.

SELECT  [Employee Id],
        [Payroll Name],
        v.EDT           AS [E/D/T],
        v.EDTName       AS [E/D/T Name],
        v.EDTAmount     AS [E/D/T Amount],
        v.EDTHours      AS [E Hours]
FROM    (
            SELECT  CK.EMP                      AS [Employee Id],
                    CK.PAYROLL                  AS [Payroll Name],
                    SUM(CK.REGULAR_GROSS)       AS REGULAR_GROSS,
                    SUM(CK.REGULAR_HOURS)       AS REGULAR_HOURS,
                    SUM(CK.Over_Time_GROSS)     AS Over_Time_GROSS,
                    SUM(CK.Over_Time_HOURS)     AS Over_Time_HOURS,
                    SUM(CK.Double_Time_GROSS)   AS Double_Time_GROSS,
                    SUM(CK.Double_Time_HOURS)   AS Double_Time_HOURS
            FROM    #CHECKS CK
            GROUP BY
                    CK.EMP, CK.PAYROLL 
        ) DerivedTable
        CROSS APPLY (
            VALUES
                ('E', 'Regular Gross', regular_gross, REGULAR_HOURS),
                ('E', 'Overtime Gross', OVER_TIME_GROSS, OVER_TIME_HOURS),
                ('E', 'Doubletime_Gross', DOUBLE_TIME_GROSS, DOUBLE_TIME_HOURS)
        ) v(EDT, EDTName, EDTAmount, EDTHours)

Is that what you expected?