I was checking selectivity of some columns for an index.
Where is this "ignore what I give you" behaviour documented?
This gives 4,851,908, 4,841,060, and 1,000,052
SELECT
COUNT(*),
COUNT(DISTINCT Col1), COUNT(DISTINCT Col2)
FROM Sometable;
This gives 4,843,634 unique pairs as per MySQL extension
SELECT COUNT(DISTINCT Col1, Col2) FROM Sometable
The following are wrong: the individual COUNT(DISTINCT colx) all give the 4,843,634 unique pair count regardless of any filler column or expression order.
I expected COUNT(DISTINCT Col1) = 4,841,060
, and COUNT(DISTINCT Col1) = 1,000,052
.
SELECT COUNT(DISTINCT Col1), COUNT(DISTINCT Col2) FROM Sometable
SELECT COUNT(DISTINCT Col2), COUNT(DISTINCT Col1) FROM Sometable
SELECT COUNT(DISTINCT Col1), 1 AS Filler, COUNT(DISTINCT Col2) FROM Sometable
But this give correct values again with another aggregate (like with COUNT(*)
above)
SELECT COUNT(DISTINCT Col1), MAX(col1) AS Filler, COUNT(DISTINCT Col2) FROM Sometable
Questions, in case it wasn't clear:
- Why does
COUNT(DISTINCT Col1), COUNT(DISTINCT Col2)
behave likeCOUNT(DISTINCT Col1, Col2)
- Why is another aggregate required to make it work?
Best Answer
It looks like you are hitting this regression bug:
One of the suggested workarounds is to use sql_buffer_result