MySQL – Index of Non-Empty Columns

clustered-indexhashingindexMySQL

I am trying to enhance the search performance of a table based on whether certain columns contain values.

I have the following columns :

id |
link |
text |
date |
lat |
long |
type

If a record has a value in lat & long it can be shown on a map. It’s considered a “dot”

If a record has a value in date & lat & long it is considered an “event”

If a record has a value in type and text it is considered “news”

With over 1 million records in the table, what is the best way to index so I can quickly extract all “dots” or all “events” or all “news”?

To reiterate, I don’t care to index the actual values of these columns, only if there is ANY value in that column.

I tried creating an index of multiple columns but it used the values of the columns, and so many often unique values resulted in slow performance.

I considered using a “hash” based on which columns were not null, but I couldn’t extract those 3 variations beaded on the hash result.

Any advice appreciated.

Best Answer

Use binary (or some integer as binary) or SET generated encoding column. For example:

CREATE TABLE tablename 
( id SERIAL PRIMARY KEY,
  linkfield  VARCHAR(255),
  textfield TEXT,
  datefield DATE,
  lat DECIMAL(10,8),
  long DECIMAL(10,8),
  type ENUM('value1', 'value2', ... , 'valueN'),
  record_type_mask TINYINT UNSIGNED
                   AS ( (linkfield  IS NOT NULL) *  1 +
                        (textfield  IS NOT NULL) *  2 + 
                        (datefield  IS NOT NULL) *  4 + 
                        (lat + long IS NOT NULL) *  8 +
                        (type       IS NOT NULL) * 16  
                      )
);

then use either constant values.

For example, if you want to select

If a record has a value in date & lat & long it is considered an “event”

you will use

WHERE record_type_mask = 12

because the value 4 marks present lat and long, and the value 8 marks datefield.

If you need records which have datefield set you may use masking:

WHERE record_type_mask & 4
-- or 
WHERE record_type_mask IN (4,5,6,7,12,13,14,15,20,21,22,23,28,29,30,31)