Mysql – covering index thestery

indexMySQL

So has anyone else applied a covering index in MySQL and gotten improved performance but when you run explain, it shows using where, using temporary under the extra column? Any thoughts why this is the case? I'm happy about the performance improvement but the engineer in me wants to know why the extra column is not saying "using index". If you need me to provide details, I will but this question is directed to the experts who have already experienced this with MySQL and know why. I would add that in my particular case, I applied a covering index to an update statement. All the columns in the query were included in the composite (covering) index. Here is an example of exactly what I did:

Table A has the following properties:
25,000 rows.
Has the following columns: one, two, three, four, five, six, seven, eight, and nine.

My covering index query target:

update table A set one = 'somevalue', two = 'somevalue2', four = 'somevalue3', five = 'somevalue4'
where seven = 'somevalue5' and eight = 'somevalue6'

covering index on table A:
idx_covering(seven,eight,
one,two,four,five)

Best Answer

The order of the index matters.

Your update query uses seven and eight. There must be at the left most begining of the index. Because both are with = value based where criteria the order doesn't matter.

create index udx_78 on A (seven, eigth)

Putting the updated columns into the index doesn't help an update query. If it was aselect query, then the query would return those columns from the index.

See how mysql uses indexes.

Welcome to DBA stack exchange. Please include the explain output next time along with show create table {tablename} as it conveys the message faster.

Related Solutions

Sql-server – Proper Index for WHERE X <> 1 AND Y <> 1 ORDER BY Z

In general it is not possible to use an index seek on a condition x <> 1 and y <> 1.

With an index on x,y the best you can do is convert it into two range seeks (x < 1 and x > 1) with a residual predicate on y <> 1 (and this wouldn't be able to use additional index key columns to avoid a sort)

For a bit column as it can only have three values. 0, 1, NULL logically WHERE bit_column <> 1 is equivalent to WHERE bit_column = 0 but seems SQL Server doesn't take advantage of that here and convert the <> to = conditions for you.

Adding a couple of check constraints does the job though even though these are apparently redundant in that they don't actually restrict the allowable values for the datatype in any way (for NULL if a check constraint evaluates to UNKNOWN it counts as passing)

CREATE TABLE MyTable
  (
     Foo     INT,
     IsFlag1 BIT NULL CHECK (IsFlag1 IN (0, 1)),
     IsFlag2 BIT NULL CHECK (IsFlag2 IN (0, 1)),
     SomeId  INT
  );

CREATE NONCLUSTERED INDEX ix
  ON MyTable(IsFlag1, IsFlag2, SomeId)
  INCLUDE (Foo);

The plan now does show a seek on IsFlag1 = 0 AND IsFlag2 = 0

Or alternatively this filtered index also avoids the need for a SORT

CREATE NONCLUSTERED INDEX ix 
           ON MyTable(SomeId) 
           INCLUDE (Foo,IsFlag1, IsFlag2) 
           WHERE IsFlag1 != 1 and IsFlag2 != 1

It does a scan of the filtered index (the qualifying rows ordered by SomeId) with a TOP to stop scanning after the 1,000 rows are retrieved. IsFlag1, IsFlag2 are INCLUDE-d in the index to avoid an unnecessary look up that occurs without this.

Mysql – sql indexing and performance

I took the liberty of editing your SQL so the clauses are on separate lines. I violated CodeReview group discipline by doing that. Sorry! My suggestion to you is to format your queries so the various clauses are clearly visible, as I did. This kind of formatting, in my experience, makes it easier to understand the logic of queries and to spot errors.

You've done a good job declaring your columns NOT NULL where possible. That helps performance a lot.

Your first query seems to be illogical, as it combines an aggregate function with a DISTINCT qualifier. MySQL allows a lot of sloppiness in aggregate queries unfortunately.

It looks like you're looking for a resultset with a row for each location that shows the the largest replication_number at that location. Do you want this query instead?

select  location_code, 
        Max(replication_number) as replication
   from hybrid_exp_trait_dtl  t
  where t.experiment_id='NSKOK12K1102' and t.trial_status='MLT'
  group by location_code

Your Index_4 will allow this query to be satisfied using a so-called loose index scan, which is really very fast.

Your second query will most likely be a lot faster if you change your fourth compound key (the one you've named hbrid) to have the following order.

(EXPERIMENT_ID,TRAIT_NAME,HYBRID_NUMBER)

That is because your query looks for exact values of EXPERIMENT_ID and TRAIT_NAME. When those two columns come first in the index, the query can jump to the exact position in the index and then sequentially reel off your results.

Now, you may have some other query that needs to have HYBRID_NUMBER first, but you didn't mention it in your question.

You asked what effect these indexes will have on load performance. It's hard to answer that question exactly without knowing a lot about your load process.

You don't have any indexes that call for uniqueness. That is very good for loading performance.

But, there are a few things you can do to speed up loading. They are detailed here. The most important speed trick is probably to avoid autocommit. Start your import process with this SQL command.

 SET autocommit=0;

Then do a few hundred INSERT or UPDATE statements in a row, then do

 COMMIT;

Don't try to do too many INSERT or UPDATE statements before COMMIT or you'll use up a lot of RAM with transaction buffers.

You can also turn of foreign key checks while loading. See the web page mentioned above. But be careful. If your loaded data violates foreign key constraints, when you try to turn checking back on you'll have a mess.

When you're done with a load operation that changes a large fraction of the rows, do

 OPTIMIZE LOCAL TABLE hybrid_exp_trait_dtl;

Your table will be unavailable while this OPTIMIZE operation runs: it rebuilds the table and indexes.

Best Answer

Related Solutions

Sql-server – Proper Index for WHERE X <> 1 AND Y <> 1 ORDER BY Z

Mysql – sql indexing and performance

Related Question