In general it is not possible to use an index seek on a condition x <> 1
and y <> 1
.
With an index on x,y
the best you can do is convert it into two range seeks (x < 1
and x > 1
) with a residual predicate on y <> 1
(and this wouldn't be able to use additional index key columns to avoid a sort)
For a bit
column as it can only have three values. 0
, 1
, NULL
logically WHERE bit_column <> 1
is equivalent to WHERE bit_column = 0
but seems SQL Server doesn't take advantage of that here and convert the <>
to =
conditions for you.
Adding a couple of check constraints does the job though even though these are apparently redundant in that they don't actually restrict the allowable values for the datatype in any way (for NULL
if a check constraint evaluates to UNKNOWN
it counts as passing)
CREATE TABLE MyTable
(
Foo INT,
IsFlag1 BIT NULL CHECK (IsFlag1 IN (0, 1)),
IsFlag2 BIT NULL CHECK (IsFlag2 IN (0, 1)),
SomeId INT
);
CREATE NONCLUSTERED INDEX ix
ON MyTable(IsFlag1, IsFlag2, SomeId)
INCLUDE (Foo);
The plan now does show a seek on IsFlag1 = 0 AND IsFlag2 = 0
Or alternatively this filtered index also avoids the need for a SORT
CREATE NONCLUSTERED INDEX ix
ON MyTable(SomeId)
INCLUDE (Foo,IsFlag1, IsFlag2)
WHERE IsFlag1 != 1 and IsFlag2 != 1
It does a scan of the filtered index (the qualifying rows ordered by SomeId
) with a TOP
to stop scanning after the 1,000 rows are retrieved. IsFlag1, IsFlag2
are INCLUDE
-d in the index to avoid an unnecessary look up that occurs without this.
I took the liberty of editing your SQL so the clauses are on separate lines. I violated CodeReview group discipline by doing that. Sorry! My suggestion to you is to format your queries so the various clauses are clearly visible, as I did. This kind of formatting, in my experience, makes it easier to understand the logic of queries and to spot errors.
You've done a good job declaring your columns NOT NULL
where possible. That helps performance a lot.
Your first query seems to be illogical, as it combines an aggregate function with a DISTINCT
qualifier. MySQL allows a lot of sloppiness in aggregate queries unfortunately.
It looks like you're looking for a resultset with a row for each location that shows the the largest replication_number at that location. Do you want this query instead?
select location_code,
Max(replication_number) as replication
from hybrid_exp_trait_dtl t
where t.experiment_id='NSKOK12K1102' and t.trial_status='MLT'
group by location_code
Your Index_4
will allow this query to be satisfied using a so-called loose index scan, which is really very fast.
Your second query will most likely be a lot faster if you change your fourth compound key (the one you've named hbrid
) to have the following order.
(EXPERIMENT_ID,TRAIT_NAME,HYBRID_NUMBER)
That is because your query looks for exact values of EXPERIMENT_ID
and TRAIT_NAME
. When those two columns come first in the index, the query can jump to the exact position in the index and then sequentially reel off your results.
Now, you may have some other query that needs to have HYBRID_NUMBER
first, but you didn't mention it in your question.
You asked what effect these indexes will have on load performance. It's hard to answer that question exactly without knowing a lot about your load process.
You don't have any indexes that call for uniqueness. That is very good for loading performance.
But, there are a few things you can do to speed up loading. They are detailed here. The most important speed trick is probably to avoid autocommit. Start your import process with this SQL command.
SET autocommit=0;
Then do a few hundred INSERT
or UPDATE
statements in a row, then do
COMMIT;
Don't try to do too many INSERT
or UPDATE
statements before COMMIT
or you'll use up a lot of RAM with transaction buffers.
You can also turn of foreign key checks while loading. See the web page mentioned above. But be careful. If your loaded data violates foreign key constraints, when you try to turn checking back on you'll have a mess.
When you're done with a load operation that changes a large fraction of the rows, do
OPTIMIZE LOCAL TABLE hybrid_exp_trait_dtl;
Your table will be unavailable while this OPTIMIZE
operation runs: it rebuilds the table and indexes.
Best Answer
The order of the index matters.
Your update query uses
seven
andeight
. There must be at the left most begining of the index. Because both are with= value
based where criteria the order doesn't matter.Putting the updated columns into the index doesn't help an update query. If it was a
select
query, then the query would return those columns from the index.See how mysql uses indexes.
Welcome to DBA stack exchange. Please include the
explain
output next time along withshow create table {tablename}
as it conveys the message faster.