Sql-server – group by excluding a column

aggregatectesql serversql-server-2012t-sql

I have this code:

insert into [dbo].[NGC_Agent_Intervals]
SELECT  
    AgentID, 
    date, 
    [Hour] = DATEPART(HOUR, RSRange),
    hour_quarter = CONCAT(DATEPART(MINUTE, RSRange) ,'-', DATEPART(MINUTE, RERange)),
    DepartmentID,
    [DepartmentName], 
    [AgentState], 
    [AgentStateReason], 
    ( select AgentStateReasonDescription from RealRanges) as AgentStateReasonDescription,
    [ExtensionID],
    [WorkstationID],

    Duration = sum(CASE 
      WHEN RSRange <= aStart and aend >= RERange THEN DATEDIFF(SECOND, aStart, DATEADD(MINUTE, 15, RSRange)) 
      when  RSRange <= aStart and aend < RERange THEN DATEDIFF(SECOND, aStart, aEnd) 
      WHEN RERange >= aEnd THEN DATEDIFF(SECOND, RSRange, aEnd) 
      ELSE DATEDIFF(SECOND, RSRange, RERange)
    END)
 FROM RealRanges

 WHERE DATEDIFF(SECOND, RSRange, aEnd) > 0
   AND AgentState NOT IN ('logout', 'LOGIN')


 GROUP BY duration,RealRanges.AgentID, RealRanges.date, RealRanges.hour,
  DepartmentID,[DepartmentName], [AgentState], [AgentStateReason],
  [ExtensionID],[WorkstationID], RSRange, RealRanges.RERange

I run this code after using a cte to split the original rows from a table to a few rows by hour quarters.
What I want to do is to sum the duration value, if a few columns are matching in a few rows.

That's the original table values and I want to sum the duration:
original table
if i dont sum the duration it looks like this:
before sum
It should look like this after the sum:

new table

The only problem is that I want to group by only for the following columns:

AgentID Date hour Hour_Quarter AgentState AgentStatereason

and exclude the other columns so even if the extensionid is different (for example) I still get the sum from 2 columns.
I tried doing that with a sub-query (in the code above) but I get an error:
error

So what am I missing?

Best Answer

To avoid grouping by a specific column that returns multiple values, you can either remove it from the query, or you can explicitly tell it which value you want. You can do this using aggregate or analytic functions, like:

For numeric values you can also present new values, such as:

  • SUM(Duration)
  • AVG(Duration)

But basically, if you want to show a single row collapsed from two different ExtensionID values, you need to define how to do that. In this case, is ExtensionID important? If it is, do you want to show the most recent one? The first one? The one that occurs most frequently? What about ties? There are solutions to all of these, but you have to know what you want your query to return.

You can't just leave things out of the GROUP BY like you can with MySQL, where you get - let's say - weird results.