MySQL aggregate anomaly

aggregateMySQL

I was checking selectivity of some columns for an index.
Where is this "ignore what I give you" behaviour documented?

This gives 4,851,908, 4,841,060, and 1,000,052

SELECT
     COUNT(*), 
     COUNT(DISTINCT Col1), COUNT(DISTINCT Col2)
FROM Sometable;

This gives 4,843,634 unique pairs as per MySQL extension

SELECT COUNT(DISTINCT Col1, Col2) FROM Sometable

The following are wrong: the individual COUNT(DISTINCT colx) all give the 4,843,634 unique pair count regardless of any filler column or expression order.

I expected COUNT(DISTINCT Col1) = 4,841,060, and COUNT(DISTINCT Col1) = 1,000,052.

SELECT COUNT(DISTINCT Col1), COUNT(DISTINCT Col2) FROM Sometable

SELECT COUNT(DISTINCT Col2), COUNT(DISTINCT Col1) FROM Sometable

SELECT COUNT(DISTINCT Col1), 1 AS Filler, COUNT(DISTINCT Col2) FROM Sometable

But this give correct values again with another aggregate (like with COUNT(*) above)

SELECT COUNT(DISTINCT Col1), MAX(col1) AS Filler, COUNT(DISTINCT Col2) FROM Sometable

Questions, in case it wasn't clear:

  • Why does COUNT(DISTINCT Col1), COUNT(DISTINCT Col2) behave like COUNT(DISTINCT Col1, Col2)
  • Why is another aggregate required to make it work?

Best Answer

It looks like you are hitting this regression bug:

select count(distinct N1), count(distinct N2) from test.AA" works incorrectly
...
"This bug happens when a unique index exists"

One of the suggested workarounds is to use sql_buffer_result