Mysql – Does creating an index on BIT(n) attribute in MySQL make sense

datatypesindexMySQL

I am working on a very specific thing and I need to use BIT(n) attributes and efficiently search through them. n is generally not a power of 2.

I've seen some mentions on the internet that BIT attributes are not index-friendly in MySQL. Without further elaboration though.

So the question is – do MySQL indexes work well with BIT(n) attributes or I better find some other ways, like restrict n to, say, 32 and use INT instead?

Best Answer

Unless you are querying for an exact match on the whole value stored in the bit column, an index will not help, because it can't generally be used. On the remote chance that you are, then it would.

The reasons an index can't be used when storing a bitmapped value and then querying based on the state of one or more bits include the fact that the commonly-available index types in MySQL are only useful for exact-value or range matching, and bitwise comparisons aren't truly either of those things.

Because of the nature of this limitation, it also holds true if you use any of the integer data types or even the SET data type.

Granted, for example, testing the state of the high bit on an 8-bit number is the same as testing whether the number is >= 128, which is indeed a range match and could be accomplished using a b-tree index, but to make use of an index for this test, the optimizer would have to understand that this what you're "really" asking when what you actually asked for is WHERE bin_col & b'10000000' ... the optimizer isn't going to realize that.

My inclination for storing bitmapped values would be to use an UNSIGNED [something]INT column, since BIT columns in MySQL are actually more analogous to CHAR/BINARY columns than they are to integer data, but this will ultimately depend on your application as well as your storage engine.

MyISAM apparently stores different bit columns together in the raw row data, so if you're not using even increments of 8/16/32/64, there might be a small storage advantage there when using BITs vs INTs -- but unless you're already using MyISAM, I wouldn't think this would be of enough of an advantage to make you consider using it.

The MEMORY and InnoDB engines allocate the smallest standard integer size that can hold the required number of bits in a BIT column.

The SET data type also stores the values as an unsigned integer of 1, 2, 3, 4, or 8 bytes, as required by the number of labels you define for the bits. It is easier on the eyeballs when you're looking at your stored bitmapped data, because if you select from it without casting the result back to an integer (either explicitly or implicitly with SELECT column_name + 0), you get back a comma-separated list of the labels of the bits that are set to 'on'... It doesn't offer any optimization in querying but it does give you the enhancement of expanding your bits in to labels without any real penalty compared to using a raw integer column.