MySQL boolean field search vs indexed varchar performance


Will a select filtering on a boolean field be faster than a select filtering on an indexed varchar(255) field?

MySQL 5.7

In this case, records of a certain table should never be deleted. Instead they should be marked as inactive if necessary.

A boolean 'is_active' field could be added to that table, or an existing 'status' indexed varchar field could be used (conceptually overloaded) to also store the value 'inactive' if necessary. The varchar is specified as 255 max length, but realistically it is much smaller.

In this case, 90+% of the rows will be considered active, so the boolean is_active would be 1, or alternatively the status would have some value other than 'inactive'; and most searches would be for is_active = 1 (ignoring inactive rows).

I have read that there is little value in indexing boolean fields, so whether the boolean option is indexed or not probably wouldn't be a factor (but if it is, please enlighten me!).

Best Answer

Short answer for bool vs varchar: Probably not much difference, but it may depend.

Short answer for bool that is true 90% of the time: An index on that column alone is useless.

Long answer...

What percentage of the rows are "is_active"? If the value you search for matches more than about 20% of the rows, the optimizer will usually ignore the index and scan the table instead.
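One way to check this against your own data is EXPLAIN. Below is a minimal sketch, assuming a hypothetical table named items with a hypothetical name column; only is_active comes from the question. The plan you get depends on the table's size and statistics, so run it against realistic data.

```sql
-- Hypothetical table; only is_active comes from the question.
-- BOOLEAN in MySQL is a synonym for TINYINT(1).
CREATE TABLE items (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name      VARCHAR(100) NOT NULL,
  is_active TINYINT(1)   NOT NULL DEFAULT 1,
  INDEX idx_is_active (is_active)
) ENGINE=InnoDB;

-- Look at the "type" and "key" columns of the output.
-- type = ALL with key = NULL means the index was skipped in favor of a table scan.
EXPLAIN SELECT * FROM items WHERE is_active = 1;
```

If the plan shows a full scan for is_active = 1 but uses the index for is_active = 0, that matches the rule of thumb above: the index only helps for the rare value.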

Use reasonable limits on VARCHAR. Even without that, 'inactive' is only 8 characters, not 255. So, WHERE status = 'inactive' is not much slower than WHERE is_active.
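As a sketch of what a tighter limit could look like, assuming a hypothetical table named items, a hypothetical new length of 20, and that the column is NOT NULL (none of these are stated in the question):

```sql
-- Hypothetical starting point: an indexed status column declared as VARCHAR(255).
CREATE TABLE items (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status VARCHAR(255) NOT NULL,
  INDEX idx_status (status)
) ENGINE=InnoDB;

-- Check the longest value actually stored before choosing a new limit.
SELECT MAX(CHAR_LENGTH(status)) AS longest_status FROM items;

-- Shrink the column. MODIFY restates the whole column definition,
-- so repeat NOT NULL, defaults, etc. as they apply to your table.
ALTER TABLE items MODIFY status VARCHAR(20) NOT NULL;
```

With strict SQL mode enabled, the ALTER should fail rather than truncate if an existing value is longer than the new limit, which is why the length check comes first.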

Is that the entire WHERE clause? I ask because optimization requires looking at the entire WHERE clause. For that matter, the entire SELECT can be important.

Often the query involves:

WHERE is_active
  AND other-stuff

In which case, this may be very beneficial, even with 90% being active:

INDEX(is_active, other-column)
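A minimal sketch of such a composite index, assuming a hypothetical table named items and a hypothetical category_id column standing in for other-column:

```sql
-- Hypothetical table; category_id stands in for "other-column".
CREATE TABLE items (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(100) NOT NULL,
  category_id INT UNSIGNED NOT NULL,
  is_active   TINYINT(1)   NOT NULL DEFAULT 1,
  INDEX idx_active_category (is_active, category_id)
) ENGINE=InnoDB;

-- Both columns are tested with "=", so the index can narrow the search
-- on is_active and category_id together instead of on is_active alone.
SELECT id, name
FROM items
WHERE is_active = 1
  AND category_id = 42;
```

Running EXPLAIN on the SELECT shows whether idx_active_category is chosen for your data.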

But, again, I need to see the details before giving a definitive answer.

And, no, the poor cardinality of is_active is not relevant in a composite index.