Mysql – Improve MySQL Query Performance – Too many records

index, MySQL, performance, query-performance, rdbms

I am using MySQL as the RDBMS for a contact management system. My database has grown and now contains more than 1 million records. Checking for duplicate phone numbers has become a big issue during administration: server load increases drastically when I search for a phone number across the entire database. Since I don't want to keep duplicate phone records, I check whether a phone number already exists anywhere in the database, which makes my application slower. My question: how can I make this search across the entire database fast?

phone1 -> Datatype Varchar(10)

I tried indexing, but it helped only a little. Are there any other ways to improve the performance of my system?

Table Structure:

CREATE TABLE `phone_directory` (
  `lead_id` INT(9) UNSIGNED NOT NULL AUTO_INCREMENT,
  `list_id` BIGINT(14) UNSIGNED DEFAULT NULL,
  `gmt_offset_now` DECIMAL(4,2) DEFAULT '0.00',
  `first_name` VARCHAR(30) DEFAULT NULL,
  `middle_initial` CHAR(1) DEFAULT NULL,
  `last_name` VARCHAR(30) DEFAULT NULL,
  `address1` VARCHAR(100) DEFAULT NULL,
  `address2` VARCHAR(100) DEFAULT NULL,
  `address3` VARCHAR(100) DEFAULT NULL,
  `city` VARCHAR(50) DEFAULT NULL,
  `state` CHAR(2) DEFAULT NULL,
  `postal_code` VARCHAR(10) DEFAULT NULL,
  `phone1` VARCHAR(12) DEFAULT NULL,
  `phone2` VARCHAR(12) DEFAULT NULL,
  `phone3` VARCHAR(12) DEFAULT NULL,
  `email` VARCHAR(70) DEFAULT NULL,
  `fax_number` VARCHAR(255) DEFAULT NULL,
  `manager_name` VARCHAR(255) DEFAULT NULL,
  `status` VARCHAR(6) DEFAULT NULL,
  PRIMARY KEY (`lead_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

Query:

SELECT * FROM phone_directory WHERE phone1 IN ('315XXXXXXX','0315XXXXXXX');

SELECT * FROM phone_directory WHERE phone2 IN ('315XXXXXXX','0315XXXXXXX');

SELECT * FROM phone_directory WHERE phone3 IN ('315XXXXXXX','0315XXXXXXX');

Best Answer

I'd recommend not having three columns phone1, phone2 and phone3, but a related table for phones, so that your data is fully (or at least further) normalized:

 CREATE TABLE `phone_directory` (
   `lead_id` INT(9) UNSIGNED NOT NULL AUTO_INCREMENT,
   `list_id` BIGINT(14) UNSIGNED DEFAULT NULL,
   `gmt_offset_now` DECIMAL(4,2) DEFAULT '0.00',
   `first_name` VARCHAR(30) DEFAULT NULL,
   `middle_initial` CHAR(1) DEFAULT NULL,
   `last_name` VARCHAR(30) DEFAULT NULL,
   `address1` VARCHAR(100) DEFAULT NULL,
   `address2` VARCHAR(100) DEFAULT NULL,
   `address3` VARCHAR(100) DEFAULT NULL,
   `city` VARCHAR(50) DEFAULT NULL,
   `state` CHAR(2) DEFAULT NULL,
   `postal_code` VARCHAR(10) DEFAULT NULL,
   `email` VARCHAR(70) DEFAULT NULL,
   `fax_number` VARCHAR(255) DEFAULT NULL,
   `manager_name` VARCHAR(255) DEFAULT NULL,
   `status` VARCHAR(6) DEFAULT NULL,
   PRIMARY KEY (`lead_id`)
 ) ENGINE=INNODB DEFAULT CHARSET=utf8;

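 CREATE TABLE phones
 (
     lead_id INT(9) UNSIGNED NOT NULL,  -- Same type and sign as phone_directory.lead_id (needed for the foreign key)
     phone VARCHAR(12) NOT NULL,
     priority TINYINT DEFAULT 1,  -- If you need to give them priorities (1, 2, 3, ...), or sort them
     CONSTRAINT unique_phones UNIQUE(phone), -- You don't want repeated telephones. This enforces it.
     PRIMARY KEY(lead_id, phone),  -- Covering index + clustering... for the sake of efficiency
     -- MySQL (at least through 8.0) parses but ignores an inline REFERENCES on a column,
     -- so the foreign key is declared as a separate clause
     FOREIGN KEY (lead_id) REFERENCES phone_directory(lead_id)
 );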
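
If the existing table already holds data, the numbers have to be copied into phones before the old columns go away. A minimal sketch, assuming phone_directory still has its phone1, phone2 and phone3 columns at this point and that empty strings should be skipped (both are assumptions, not something stated in the question):

```sql
-- Copy every non-empty number into the new table.
-- INSERT IGNORE skips rows that would violate the primary key or unique_phones,
-- so a number that is already duplicated is stored only once.
INSERT IGNORE INTO phones (lead_id, phone, priority)
SELECT lead_id, phone1, 1 FROM phone_directory WHERE phone1 IS NOT NULL AND phone1 <> ''
UNION ALL
SELECT lead_id, phone2, 2 FROM phone_directory WHERE phone2 IS NOT NULL AND phone2 <> ''
UNION ALL
SELECT lead_id, phone3, 3 FROM phone_directory WHERE phone3 IS NOT NULL AND phone3 <> '';

-- Only after checking the copied data: remove the old columns.
ALTER TABLE phone_directory
    DROP COLUMN phone1,
    DROP COLUMN phone2,
    DROP COLUMN phone3;
```

Because INSERT IGNORE drops conflicting rows silently, which lead keeps a duplicated number is not well defined. It may be worth listing the duplicates first and resolving them by hand.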

Your check now is only:

SELECT 
    lead_id, phone 
FROM 
    phones 
WHERE 
    phone IN ('315XXXXXXX','0315XXXXXXX');

This replaces three queries with a single indexed lookup, so expect it to be roughly 3x faster on average. You can also have 0, 1, 2, 3 or any number of phones for a given lead_id. The UNIQUE constraint enforces uniqueness, so if there's an error at some point in your application, the database helps you avoid a mistake.
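
As a sketch of how the constraint behaves when the application tries to store a number twice (the lead_id values 1 and 2 are hypothetical and assumed to exist in phone_directory; the placeholder number is the one from the question):

```sql
INSERT INTO phones (lead_id, phone) VALUES (1, '315XXXXXXX');  -- succeeds

-- Same number for another lead: rejected by the unique_phones constraint
-- with ERROR 1062 (duplicate entry), so no duplicate is stored.
INSERT INTO phones (lead_id, phone) VALUES (2, '315XXXXXXX');
```

The constraint compares strings, so '315XXXXXXX' and '0315XXXXXXX' count as two different numbers. If both forms can occur, the numbers would need to be stored in one normalized form for the constraint to catch them.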

dbfiddle here

NOTE 1: It is not unusual for one company to have more than 3 telephone numbers; this design covers that case as well.

NOTE 2: As phone numbers aren't going to use non-ASCII characters, you could save some space (and gain slightly in speed, as less data is going to move around) by specifying a single-byte character set with a binary collation, such as latin1 with latin1_bin.
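
One way this could look, as a sketch that changes only the phone column of the phones table defined above and leaves the rest of the schema on utf8:

```sql
-- Single-byte character set with byte-wise comparison for the phone column only.
ALTER TABLE phones
    MODIFY phone VARCHAR(12) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;
```

The column is part of the primary key and of the unique index, so expect this statement to rebuild the table.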

Related Solutions

Mysql – Are two indexes needed

An index can seek by a subset of characters, as long as you're searching from the left. E.g., "Inter%" can seek, "%net" will not.
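
A small sketch of the difference, using a hypothetical articles table and index (neither name comes from the original question):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE articles (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    INDEX idx_title (title)
);

-- Left-anchored pattern: the index can be used to seek to the first match.
EXPLAIN SELECT id FROM articles WHERE title LIKE 'Inter%';

-- Leading wildcard: no seek is possible, so every entry has to be checked.
EXPLAIN SELECT id FROM articles WHERE title LIKE '%net';
```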

However, the first character is not necessarily the character under which the article would be sorted. "The Internet" should go under "I", not "T". You probably need two fields, DisplayTitle and SortTitle; a single-character index on the latter may be worthwhile, but most likely a full-length index will be just fine.
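
The two-field idea might look like the following sketch. The table name, column names and sizes are assumptions made for the example:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE article_titles (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    display_title VARCHAR(200) NOT NULL,   -- shown to the user, e.g. 'The Internet'
    sort_title VARCHAR(200) NOT NULL,      -- used for ordering, e.g. 'Internet, The'
    INDEX idx_sort_title (sort_title)      -- full-length index
    -- Alternative: INDEX idx_sort_initial (sort_title(1)) indexes only the first character
);

-- Everything filed under "I", in sort order.
SELECT display_title
FROM article_titles
WHERE sort_title LIKE 'I%'
ORDER BY sort_title;
```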

Indexes are typically B-trees, and a seek will jump to the right location about equally quickly whether you have 10 or 100 entries per page. Scans are another matter, but I'd start with the simplest solution and add an extra index only if performance proves inadequate in practice.

MySQL looking up more rows than needed (indexing issue)

Your indexes are fine for the two types of queries you mentioned.

This query will be satisfied by traversing the clustered index on the primary key...

[...] WHERE participant_id = x AND question_id = y AND given_answer_id = z;

...and this one is satisfied by the index on 'question_id':

[...] WHERE question_id = x;

The output of EXPLAIN SELECT is not telling you what you think it is telling you, because the value shown in rows is an estimate of the number of rows the server will need to consider, not the actual rows it will examine. For InnoDB these are based on index statistics.

rows

The rows column indicates the number of rows MySQL believes it must examine to execute the query.

For InnoDB tables, this number is an estimate, and may not always be exact.

(Source: http://dev.mysql.com/doc/refman/5.5/en/explain-output.html#explain_rows)
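
The gap between the estimate and the real figure can be checked directly. A sketch, assuming a hypothetical table name participant_answers and an arbitrary question_id of 42 (only the column name comes from the question):

```sql
-- The rows column here is the optimizer's estimate, based on index statistics.
EXPLAIN SELECT * FROM participant_answers WHERE question_id = 42;

-- The number of rows that actually match.
SELECT COUNT(*) FROM participant_answers WHERE question_id = 42;
```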

The optimizer gathers information about different possible query plans, and chooses the one with the lowest cost. The information shown in EXPLAIN is the information the optimizer gathered about the plan it selected.

When type is ref and key is not NULL, this means that the name listed in the key column is the name of the index that the optimizer has chosen to use to find the desired rows, so your query plan looks exactly as it should.

Note, sometimes you will see Using index in the Extra column and a lot of people assume that this means an index is being used, or that no index is being used when that doesn't appear, but that's not correct, either. Using index describes a special case called a "covering index" -- it does not indicate whether an index is being used to locate the rows of interest.
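
To make the distinction concrete, here is a sketch with a hypothetical table. The table name, the answered_at column and the index name are assumptions; the three key columns come from the question:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE participant_answers (
    participant_id INT UNSIGNED NOT NULL,
    question_id INT UNSIGNED NOT NULL,
    given_answer_id INT UNSIGNED NOT NULL,
    answered_at DATETIME NOT NULL,
    PRIMARY KEY (participant_id, question_id, given_answer_id),
    INDEX idx_question (question_id)
) ENGINE=InnoDB;

-- InnoDB stores the primary key columns in every secondary index, so this query
-- can be answered from idx_question alone: Extra can show "Using index".
EXPLAIN SELECT participant_id, given_answer_id
FROM participant_answers WHERE question_id = 42;

-- answered_at is not in the index. idx_question can still locate the rows,
-- but they must then be read from the table, so "Using index" is absent.
EXPLAIN SELECT answered_at
FROM participant_answers WHERE question_id = 42;
```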

It's possible that running ANALYZE [LOCAL] TABLE would cause the numbers in rows shown by EXPLAIN to differ, but this is a simple query and selecting this index is an obvious choice for the optimizer to make, so ANALYZE TABLE is unlikely to make any actual difference in performance.

It is possible, however, that your overall performance might see some marginal improvement with an occasional OPTIMIZE [LOCAL] TABLE, because you are not inserting rows in primary key order (as would be the case with an auto_increment primary key)... but on large tables this can be time-consuming because it rebuilds a new copy of the table... but, again, I wouldn't expect any significant change.
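
For reference, the two maintenance statements in their simplest form (participant_answers is a hypothetical table name):

```sql
-- Recalculates index statistics; the rows estimate in EXPLAIN may change afterwards.
ANALYZE TABLE participant_answers;

-- For InnoDB this is carried out as a table rebuild followed by an analyze,
-- which can take a long time on large tables.
OPTIMIZE TABLE participant_answers;
```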
