Mysql – Basic question about grouping and counts

countgroup byMySQL

I'm having trouble coming up with a query in MySQL.

Suppose we have a table that logs an action by a user.

  • User1 is a Doctor and is logged several times.
  • User2 is a Doctor and is logged once.
  • User3 is an Accountant and is logged twice.
  • User4 is a Clerk and hasn't been logged.
  • User5 is an Accountant and hasn't been logged.

I would like to know what "roles" (i.e., doctor, accountant, clerk) have entries in the log file.

Here is the schema:

CREATE TABLE `test1_users` (
  `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(45) NOT NULL,
  `role` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE = InnoDB;

CREATE TABLE `test1_log` (
  `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  `user_id` INTEGER UNSIGNED NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE = InnoDB;

And test data:

INSERT INTO `test1_users` VALUES
(1, 'Franklin', 'Doctor'),
(2, 'Sally', 'Doctor'),
(3, 'Linus', 'Accountant'),
(4, 'Patty', 'Clerk'),
(5, 'Marcie', 'Accountant');

INSERT INTO `test1_log` VALUES
(1, 1),
(2, 1),
(3, 2),
(4, 1),
(5, 3),
(6, 3);

If I want to see how many times a user has been logged, it's pretty straight forward.

SELECT u.name, COUNT(*) AS total
FROM test1_log l
LEFT JOIN test1_users u ON u.id = l.user_id
GROUP BY user_id;

The query produces the following:

+----------+-------+
| name     | total |
+----------+-------+
| Franklin |     3 |
| Sally    |     1 |
| Linus    |     2 |
+----------+-------+

But how would I structure the query if I want to see how many different "roles" have been logged?
Not how many times each role is logged, just a list of all the roles that are logged.

Specifically, how many different Doctors are logged? I don't care how many times each has been logged.

Such a query I'm looking for would produce the following output:

+------------+-------+
| role       | total |
+------------+-------+
| Doctor     |     2 |
| Accountant |     1 |
| Clerk      |     0 |
+------------+-------+

Both Doctors were logged, but only one Accountant and no Clerks.

Here's the closest I've come:

SELECT role, COUNT(l.id) AS total FROM test1_users u 
LEFT JOIN test1_log l ON u.id = l.user_id 
GROUP BY role;

But this only gives me the number of log entries, and I need the number of roles that made it into the log.

Thoughts?

Thanks.

Best Answer

By my reading of your question, you want the numbers of distinct users who have been logged, broken down by each user's role.

Your query returns the number of log entries, broken down by role.

To get distinct users, we'd need to get a COUNT(DISTINCT x) on some column x, which represents users.

If we used a column from test1_users, we'd just get the count of users assigned to each role, without reference to whether or not the user had associated entries in the log.

If we use an INNER JOIN, we'll only count users who do have entries in the log - but we won't see any roles where no user exists in the log (as is true for "Clerk").

However, there's a columns in test1_log that's specific to a user - test1_log.user_id.

So, if we modify the query to be:

SELECT role, COUNT(DISTINCT l.user_id) AS total FROM test1_users u 
LEFT JOIN test1_log l ON u.id = l.user_id 
GROUP BY role;

we get the results you request (as you can see in this SQLFiddle.