Mysql – Why is this query slow, it seems simple

indexMySQLvarchar

I have a very typical table of user information for my application, with an average set of columns (ints, varchars, etc).

In this table are two varchar(100) fields: emailAddress and emailAddress2.

I'm extremely perplexed at why one of my queries is so slow (5+ seconds) and in testing some alternates they are so fast (<= 0.02 seconds).

This query is fast (matching on emailAddress field):

SELECT u.userId
FROM user AS u
WHERE u.isActive=1
AND u.emailAddress = 'user@domain.com'

This query is also fast (matching on emailAddress2 field):

SELECT u.userId
FROM user AS u
WHERE u.isActive=1
AND u.emailAddress2 = 'user@domain.com'

This query is NOT fast (matching on either emailAddress or emailAddress2 field):

SELECT u.userId
FROM user AS u
WHERE u.isActive=1
AND (u.emailAddress = 'user@domain.com' OR u.emailAddress2 = 'user@domain.com')

I have 2 indexes, each on the emailAddress and emailAddress2 field.

(EDIT) Here is the EXPLAIN output:

id                   1
select_type          SIMPLE
table                u
type                 ref
possible_keys        active,emailAddress,emailAddress2
key                  active
key_len              1
ref                  const
rows                 166808
Extra                Using where

Any idea why the 3rd query above is so slow? Suggestions for doing it another way? I need to match on a handful of fields and then get all users EXCEPT ones where their email address is in emailAddress or emailAddress2 (the queries above are simplified for my post).

This is driving me crazy! (and negatively affecting my app's performance).

Thanks

Best Answer

The query is slow because first, this query has an OR condition which cannot be optimized with a single index (while conditions with only AND can) and second, because besides the emails, it has to check the isActive=1 condition. So it prefers to do a table scan. The other possible path would be to use both email indexes with the Index Merge Union Access Algorithm but for some reason the optimizer does not choose this plan.

One solution would be to alter the two indexes you have, appending the isActive column at the end:

ALTER TABLE user 
  DROP INDEX the_name_of_the existing_index,
  DROP INDEX the_name_of_the_other_index,
  ADD INDEX emailAddress_isActive_IX
    (emailAddress, isActive),
  ADD INDEX emailAddress2_isActive_IX
    (emailAddress2, isActive) ;

If it still slow (and/or does not do an index merge but prefers the table scan), you could try to rewrite the query with UNION:

SELECT u.userId
FROM user AS u
WHERE u.isActive=1
AND u.emailAddress = 'user@domain.com'

UNION 

SELECT u.userId
FROM user AS u
WHERE u.isActive=1
AND u.emailAddress2 = 'user@domain.com' ;

Another option - without necessarily adding/altering indexes - is to nest the query into 2 layers, keeping only the email conditions in the internal layer, essentially forcing MySQL to consider using index-merge with the two available indexes:

SELECT userId 
FROM 
    ( SELECT userId, isActive 
      FROM user AS u 
      WHERE emailAddress = 'user@domain.com' 
         OR emailAddress2 = 'user@domain.com'
    ) x 
WHERE isActive=1 ;