Mysql – multiple MYSQL full text indexing for innodb and memory + disk efficiency

full-text-searchinnodbMySQL

I have the following table: (Create table statement)

CREATE TABLE `phppos_people` (
  `first_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `last_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `phone_number` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `address_1` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `address_2` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `city` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `state` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `zip` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `country` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `comments` text COLLATE utf8_unicode_ci NOT NULL,
  `image_id` int(10) DEFAULT NULL,
  `person_id` int(10) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`person_id`),
  KEY `phppos_people_ibfk_1` (`image_id`),
  FULLTEXT KEY `full_search` (`first_name`,`last_name`,`email`,`phone_number`),
  FULLTEXT KEY `first_name_search` (`first_name`),
  FULLTEXT KEY `last_name_search` (`last_name`),
  FULLTEXT KEY `full_name_search` (`first_name`,`last_name`),
  FULLTEXT KEY `email_search` (`email`),
  FULLTEXT KEY `phone_number_search` (`phone_number`),
  CONSTRAINT `phppos_people_ibfk_1` FOREIGN KEY (`image_id`) REFERENCES `phppos_app_files` (`file_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |

I have requirements for a fast search using first name, last name, email and phone number at once; but also being able to search each individual field. I also like the relevancy sorting of full text which cannot be done with Btree

The queries work well but according to my cloud database provider they recommending btree indexes for better memory and disk usage.

Do I have any other options/configurations or way to create indexes that don't use a lot of disk + memory? I have about 40 full text indexes per database that follow a similar pattern. Performance is great with these indexes and has really sped up product.

Is Innodb full text designed to use a lot of memory + disk? Here is my mysql variable settings

https://gist.github.com/blasto333/aa4241a4e37447961188356719ea6984

Here are some stats from my provider

Upon review was getting an idea of the actual data set size as stored within InnoDB. It was a bit of a surprise when I found your dataset size much smaller than your on disk usage as listed below:

Such a discrepancy between the on disk and tablespace usage had me concerned initially that we had something orphaned, such as old temp tables, but later found this was being caused by a heavy usage of InnoDB Full Text indexes.

PHP-POS-DB-01
  On Disk:                              18GB
  Total DataSet Size:                   4.56GB
  Unused allocated Tablespace:          3.10GB  ## Fragmentation
  Total innodb used plus fragmentation: 7.66GB

PHP-POS-DB-2-01
  On Disk:                              9.7G
  Total DataSet Size:                   923.85MB
  Unused allocated Tablespace:          363.00MB
  Total innodb used plus fragmentation: 1.29GB

Best Answer

first_name has one word, correct? FULLTEXT(first_name) is not a use case for that type of indexing. I recommend you use simpleINDEXes` for any simple columns.

This works very efficiently, probably faster than anything that could be done with FULLTEXT:

WHERE last = 'Muench' AND first LIKE 'C%'
INDEX(last, first)

You might still want FULLTEXT(first, last, email, phone) for handling the case where any of those might be entered. (However, it is unclear whether this is a good use case.)

Meanwhile, don't blindly use (255) for all string columns; use sensible limits.

It is often wise to have the UI construct the WHERE clause based on what the user does/doesn't supply. In your case, this might include switching between MATCH..AGAINST, =, and LIKE, maybe even REGEXP.

You could also look for @ to discover emails, all-digits to discover phones, etc., thereby avoiding searching lastname for foo@bar.com.

As an alternative, you could add an extra column that contains all the 'words' for name, email, etc. Then have a single FULLTEXT index on that one column; no need for 40. The drawback is the need to maintain this redundant info. With a new enough MySQL or MariaDB, it could be a 'generated' column.