MySQL – How Group By Affects Aggregate Functions

aggregategroup byMySQL

I tried to understand the following scenarios:

Table name: family
+------+-------+
| id   | name  |
+------+-------+
|    1 | alice |
|    2 |  bob  |
+------+-------+

That table will be used for illustration purposes.

Server version: 5.6.20 MySQL Community Server (GPL)

Lets try:

SELECT id, name, COUNT(name) FROM family WHERE id = 50;

The query will result in:

+------+------+-------------+
| id   | name | COUNT(name) |
+------+------+-------------+
| NULL | NULL |           0 |
+------+------+-------------+

My first question is about how I got this result?, what I think is that:

The COUNT(name) gets called after the query(internally) resolved into:

+------+------+
| id   | name |
+------+------+
| NULL | NULL |
+------+------+

Now at this point the COUNT(name) is ignoring the NULL value and that is why I get this extra column:

+-------------+
| COUNT(name) |
+-------------+
|           0 |
+-------------+

I only want to understand the steps that build the result.

The following step is taken by me so I can get Empty set and not the table above.

So to get Empty set as the result I add:

SELECT id, name, COUNT(name) FROM family WHERE id=50 GROUP BY id;

At this point I pretty desperate, not just I couldn't understand the problem I'm lost with how the solution is even working.

My second question is about the solution, again my focus is on the process, the actual steps the database has taken to achive that result.

If anyone can walk me step by step I will be very thankful.

Best Answer

There are several aspects of your query. The first is that mysql by default will give you a random answer if there are several possible answers to a group by question. SQL92 demanded that all non aggregated columns in the select clause must be part of the group by clause. SQL99 loosened this restriction and requires that all non aggregated columns are functionally dependent of the group by clause. By default mysql ignores both of these and returns a random row if there are several possible ones. You can force the SQL92 behaviour by adding ONLY_FULL_GROUP_BY to @@sql_mode, most DBMS's use such implementation. A small example:

create table t (x int not null, y int not null);
insert into t (x,y) values (1,1),(1,2);

For a query like:

select x,y, count(*) from t group by x;

There are two possible answers:

1 | 1 |        2
1 | 2 |        2

Mysql will randomly return one of them. In some sense the result is undefined.

To see what happens when there is no group by clause at all involved, let's add another row to the equation:

insert into t (x,y) values (2,3);

select x,y, count(*) from t

Since we have no group by count(*) will be applied to the whole result, for x and y there are 3 possible combinations:

1 |    1 |        3
1 |    2 |        3
2 |    3 |        3

Once again mysql will shoot you in the foot by randomly returning one of these rows, say:

1 |    1 |        3

For an empty set the aggregate function count(*) should be 0 (WHERE is evaluated before GROUP BY), but there are no x and y:

select x,y, count(*) from t where x=4;
+------+------+----------+
| x    | y    | count(*) |
+------+------+----------+
| NULL | NULL |        0 |
+------+------+----------+

so they are shown as non-existing (NULL).

The short answer is that you get a non-sense answer because you asked a non-sense question ;-)

The scenario you describe is is probably the one thing that causes the most confusion in the mysql community. I wish that ONLY_FULL_GROUP_BY where the default behaviour and that there where a setting ALLOW_PARTIAL_GROUPING that could be used for backward compability.

Related Solutions

Mysql – One to many to many

You can simplify that specific query:

SELECT COUNT(1)
FROM results r 
JOIN steps s 
    ON s.result_id=r.id
WHERE r.monitor_id = 64 
  AND r.completed >= '2014-10-01' 
  AND r.completed <= '2014-10-03'

If you don't have it already, try indexes like:

CREATE INDEX ... ON results (monitor_id, completed, id);
CREATE INDEX ... ON steps (result_id, id);

MySQL 5.1 – Fixing Binlog_Do_DB Issue in Master Configuration

In Windows, you would need to modify the my.ini file and not the my.cnf file

Locate you my.ini file

If MySQL is being run as a service

Enter "services.msc" on the Start menu search box.
Find MySQL service under Name column, for example, MySQL56.
Right click on MySQL service, and select Properties menu.
Look for "Path To Executable" under General tab, and there is your .ini file, for instance, "C:\Program Files (x86)\MySQL\MySQL Server 5.6\bin\mysqld.exe" --defaults-file="C:\ProgramData\MySQL\MySQL Server 5.6\my.ini" MYSQL56

Otherwise the my.ini file is probably in a sub-directory of D:\xampp\mysql\

Once you've added in the binlog_do_db parameter, be sure to restart the MySQL service.

Edit by RolandoMySQLDBA

You can restart mysql from services.msc window by right clicking and choosing or you can login from DOS CommandLine as Administrator as run

C:\> net stop mysql
C:\> net start mysql

Also, make sure your settings are under the [mysqld] group header in my.ini

Best Answer

Related Solutions

Mysql – One to many to many

MySQL 5.1 – Fixing Binlog_Do_DB Issue in Master Configuration

Related Question