T-SQL – How to Count in Single SELECT Statement

countgroup byselectt-sql

I have a table called Catalog. It has multiple columns, for example CreatedByID and ModifiedByID that contain user IDs. I would like to get count of how many documents a user has created and edited.

I've spent hours looking for alternatives for this but couldn't find anything that works. This works and I get correct results but is there a simpler way to do this?

SELECT l.UserName
     , US.Used
     , US.Created
     , US.MostRecent
     , US.Edited
    FROM [Users] US
    JOIN    (SELECT UserName
                  , COUNT(UserName) AS Used
                  , MAX([TimeStart]) AS MostRecent 
             FROM [Log] 
             WHERE ReportAction = 1 
             GROUP BY UserName) AS l
        ON (Us.UserName = l.UserName)

    JOIN    (SELECT CreatedByID
                  , COUNT(CreatedByID) AS Created 
             FROM [Catalog] 
             GROUP BY CreatedByID) AS Cat
        ON (Us.UserID = Cat.CreatedByID)

    JOIN    (SELECT ModifiedByID
                  , COUNT(ModifiedByID) AS Edited 
             FROM [Catalog] 
             GROUP BY ModifiedByID) AS Cat2
        ON (Us.UserID = Cat2.ModifiedByID)

I would like to get both COUNT(CreatedByID) and COUNT(ModifiedByID) in a single select. For example I can get how many documents user has created with this:

SELECT CreatedByID, COUNT(CreatedByID) AS Created
FROM [Catalog]
GROUP BY CreatedByID

And how many documents user has edited with this:

SELECT ModifiedByID, COUNT(ModifiedByID) AS Modified
FROM [Catalog]
GROUP BY ModifiedByID

But how to do those on a single select? I've tried something like this, but the results are not correct any more for ModifiedByID.

 SELECT CreatedByID, COUNT(CreatedByID) AS Created, COUNT(ModifiedByID) AS Modified
FROM [Catalog]
GROUP BY CreatedByID

Best Answer

What you are trying to do can be resolved using two operations: an unpivot followed by a pivot.

The table that is most principal to the query you are looking for, Catalog, is document-centric. It stores data in this form:

What | Created by Whom | Edited by Whom

You, however, want a user-centric report. So, first of all, you need to transform the original set to a user-centric one:

Who | Did What | To What

The To What part is not strictly necessary in this case, perhaps, but I leave it for a better illustration of how the data are rearranged in the course of the sought-for transformation. And the transformation itself is called unpivot. Assuming the t-sql tag on your question means SQL Server rather than Sybase, an unpivot in your case can be implemented like this:

SELECT
  u.UserID,
  Action = LEFT(u.Action, LEN(u.Action) - 4)
FROM
  dbo.Catalog AS c
  UNPIVOT
  (
    UserID
    FOR Action IN (CreatedByID, ModifiedByID)
  ) AS u
;

The LEFT() trick simply cuts off the ByID bit from each name, leaving just the verb.

Another way would be like this:

SELECT
  u.UserID,
  u.Action
FROM
  dbo.Catalog AS c
  CROSS APPLY
  (
    VALUES
      (CreatedByID,  'Created'),
      (ModifiedByID, 'Modified')
  ) AS u (UserID, Action)
WHERE
  u.UserID IS NOT NULL
;

I am assuming that a document that has not been modified will have a null in ModifiedByID. Filtering out the Modified entries with a null user ID is what the WHERE clause in the above query for.

Now you could simply group the results by UserID, Action and count the rows in each group. That would give you the output values in this form:

UserID  Action    Count
------  --------  -----
100     Created   10
100     Modified  15
110     Created   8
...     ...       ...

But as you want Created and Modified as separate columns:

UserID  Created  Modified
------  -------  --------
100     10       15
...     ...      ...

you will need to group the results with a pivot. One way to do so is to use the native PIVOT syntax:

SELECT
  UserID,
  Created,
  Modified
FROM
  (
     ... /* the unpivot query (either option) */
  ) AS s
  PIVOT
  (
    COUNT(Action)
    FOR Action IN (Created, Modified)
  ) AS p
;

You could also achieve the same using the older syntax called conditional aggregation:

SELECT
  UserID,
  Created  = COUNT(CASE Action WHEN 'Created'  THEN 1 END),
  Modified = COUNT(CASE Action WHEN 'Modified' THEN 1 END)
FROM
  (
     ... /* the unpivot query (either option) */
  ) AS s
GROUP BY
  UserID
;

In the latter case the derived table can be avoided. Instead you would just need to use the FROM clause and, if necessary, the WHERE clause from either unpivoting method. That is, the final conditional aggregation query would look either like this:

SELECT
  u.UserID,
  Created  = COUNT(CASE u.Action WHEN 'Created'  THEN 1 END),
  Modified = COUNT(CASE u.Action WHEN 'Modified' THEN 1 END)
FROM
  dbo.Catalog AS c
  UNPIVOT
  (
    UserID
    FOR Action IN (CreatedByID, ModifiedByID)
  ) AS u
GROUP BY
  u.UserID
;

or like this:

SELECT
  u.UserID,
  Created  = COUNT(CASE u.Action WHEN 'Created'  THEN 1 END),
  Modified = COUNT(CASE u.Action WHEN 'Modified' THEN 1 END)
FROM
  dbo.Catalog AS c
  CROSS APPLY
  (
    VALUES
      (CreatedByID,  'Created'),
      (ModifiedByID, 'Modified')
  ) AS u (UserID, Action)
WHERE
  u.UserID IS NOT NULL
GROUP BY
  u.UserID
;

You can use the final query as a derived table to join the results to the Users table to further connect them with the results from the Log table – in much the same way as you are doing now.