Sql-server – Questionable duplicate index suggested by optimization

azure-sql-database, index, index-tuning, sql-server

Good day. I have the following SQL Server database table:

Please note the compound primary key. This was done for three reasons:

Prevent duplicate entries.
Improve query performance, as all queries will filter on all three of those key columns.
We needed an index, and I did not want to introduce a random ID.

Please also note that this table was designed with its size in mind: it is going to store millions and millions of rows of data.

OK, now for my actual question. I am using Azure SQL to host this database, and I have enabled automatic tuning. Strangely enough, it then went and created a new index (see below).

Now, in my mind this seems to be a duplicate index, as the same columns are being indexed.

So I now have two indexes on my table:

Original (My PK):

ALTER TABLE [dbo].[SensorDataRaw] ADD  CONSTRAINT [PK_SensorDataRaw] PRIMARY KEY CLUSTERED 
(
    [DateTime] ASC,
    [SensorId] ASC,
    [Key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

Newly Added (Auto Created by azure tuning):

CREATE NONCLUSTERED INDEX [nci_wi_SensorDataRaw_DC9789077DA75B4440AC8BFE3E2AA198] ON [dbo].[SensorDataRaw]
(
    [Key] ASC,
    [SensorId] ASC,
    [DateTime] ASC
)
INCLUDE (   [Value]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

Observations:

The order of the columns has been reversed in the new index.
The new index is NOT unique.
The new index includes the Value column.

Please note my knowledge on indexes is not advanced, hence me asking this.

So my questions are:

Can someone explain why the newly added index is better than the one I initially created?
How can I remove the two indexes and just create one that covers both cases? With this being such a massive database, I cannot afford the space that both of these indexes will take up.
Is there maybe a better design alternative?

Additional Info:

I'm assuming the type of queries becomes important here, so I have listed some examples.

All queries include DateTime, SensorId, and Key.

Simple Queries:

Select the SensorId values where the average Value for key w is greater than x, for times between y and z.
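
As a rough illustration only, the simple query described above might look something like this in T-SQL. The variable names (@KeyValue, @Threshold, @DateFrom, @DateTo), their data types, and the sample values are all assumptions, since the question does not show the column types; adjust them to match the real table.

```sql
-- Hypothetical parameters: types and values are assumed, not taken from the question.
DECLARE @KeyValue  nvarchar(50) = N'w';
DECLARE @Threshold float        = 0;
DECLARE @DateFrom  datetime2    = '2020-01-01T00:00:00';
DECLARE @DateTo    datetime2    = '2020-01-02T00:00:00';

-- Sensors whose average Value for one key exceeds a threshold within a time window.
SELECT [SensorId]
FROM [dbo].[SensorDataRaw]
WHERE [Key] = @KeyValue
  AND [DateTime] BETWEEN @DateFrom AND @DateTo
GROUP BY [SensorId]
HAVING AVG([Value]) > @Threshold;
```

Written this way, the only equality predicate is on [Key]; [SensorId] is a grouping column and [DateTime] is a range.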

Graphing Data:

SELECT AVG([Value]) AS 'AvgValue',
    DATEADD(MINUTE,
        (DATEDIFF(MINUTE, '1990-01-01T00:00:00', [dbo].[SensorDataRaw].[DateTime]) / @IntervalInMinutes) * @IntervalInMinutes,
        '1990-01-01T00:00:00'
    ) AS 'TimeGroup'
FROM [dbo].[SensorDataRaw]
WHERE
    [dbo].[SensorDataRaw].[SensorId] = @SensorId
    AND [dbo].[SensorDataRaw].[Key] = @KeyValue
    AND [dbo].[SensorDataRaw].[DateTime] BETWEEN @DateFrom AND @DateTo
    AND [dbo].[SensorDataRaw].[Value] IS NOT NULL
GROUP BY (DATEDIFF(MINUTE, '1990-01-01T00:00:00', [dbo].[SensorDataRaw].[DateTime]) / @IntervalInMinutes)

Best Answer

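The index suggested by the system is a much better fit for the query you have shown. You should aim to have columns with equality predicates as the leading columns. In your graphing query, [SensorId] and [Key] are compared with =, while [DateTime] is a range (BETWEEN), so [DateTime] belongs last in the key.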

Consider a phone book ordered by lastname, firstname. If your requirement is to find all people with surnames between "Brown" and "Yates" and a first name of "John", then you need to read most of the phone book. If the phone book were instead ordered by firstname, lastname, you could easily find the "John" section and the first "Brown" in that section; then all you need to do is read names until the lastname is after "Yates" or a new firstname is encountered.

It might not be the ideal index, though. Rather than keeping a second index, you could potentially just change the key columns of the clustered index to this order. You need to evaluate this based on knowledge of your workload.
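
If you decide a single index is enough, one way to reorder the clustered key is sketched below. Treat it as an outline under assumptions rather than a tested migration script: it assumes no foreign keys reference PK_SensorDataRaw, that you can afford the table being rebuilt, and that dropping the automatically created index by hand is acceptable in your setup. The index and constraint names are the ones from the question.

```sql
-- Drop the nonclustered index first so it is not rebuilt
-- while the clustered index underneath it is being changed.
DROP INDEX [nci_wi_SensorDataRaw_DC9789077DA75B4440AC8BFE3E2AA198]
    ON [dbo].[SensorDataRaw];
GO

-- Dropping the clustered primary key turns the table into a heap.
ALTER TABLE [dbo].[SensorDataRaw] DROP CONSTRAINT [PK_SensorDataRaw];
GO

-- Recreate the primary key with the equality columns first and the range column last.
-- Uniqueness is still enforced on the same three columns.
ALTER TABLE [dbo].[SensorDataRaw] ADD CONSTRAINT [PK_SensorDataRaw]
PRIMARY KEY CLUSTERED
(
    [Key] ASC,
    [SensorId] ASC,
    [DateTime] ASC
);
GO
```

Because a clustered index contains every column of the table, [Value] is available without an INCLUDE clause. On a table with millions of rows, both ALTER statements rewrite the whole table, so test the duration and log usage first. Inserts arriving in [DateTime] order would also no longer all land at the end of the index; whether that matters depends on your workload.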

Related Solutions

Sql-server – Is Index causing timeouts

You must first get the execution plan of each slow query and survey which indexes it requires.

To change the timeout, use the following setting in SSMS:

Tools\Options\Designers\Transaction time-out after

The default timeout is 30 s; you can raise it to about 6000 s and then create your own index. The above path is from SQL Server Management Studio 2008 R2.
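
As a complement to reading individual execution plans (not a replacement for it), the missing-index DMVs can give a first overview of which indexes the optimizer has wished for. The following is a minimal sketch; it assumes you have permission to view server or database state, and the TOP (20) cutoff and the ordering expression are arbitrary choices.

```sql
-- Missing-index suggestions collected since these statistics were last reset
-- (for example by a restart).
SELECT TOP (20)
    mid.[statement]        AS table_name,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    migs.user_seeks,
    migs.avg_user_impact
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig
    ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs
    ON migs.group_handle = mig.index_group_handle
ORDER BY migs.user_seeks * migs.avg_user_impact DESC;
```

These suggestions are hints only; they do not account for overlap with existing indexes, so each one still needs to be checked against the actual plans.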

Sql-server – SQL server indexing foreign keys, covering indexes included columns

If FKs do not have a dedicated index on them but are part of wider indexes used for covering queries, should they have a dedicated index created?

It depends on the table's access patterns. If the column is being searched a lot (and, ideally, is highly selective), then yes, you absolutely should have an index on that column, with the column as the first key column in the definition.

Should I be removing some of these indexes and combining them with included columns instead, and then have dedicated indexes for my foreign keys?

What was given in the question is somewhat unclear, and the question you've asked is a bit... confused, so let's take a step back for a second.

In SQL Server 2005+, the three most important parts of an index definition are:

The key columns, which determine the index sort order. This means the order of the key columns is very important, because SQL Server uses an index by searching for a value in the first key column, then in the second key column, etc.
The included columns, which are copies of row data tagged onto the index structure. The order in which included columns are specified is irrelevant.
Is the index unique? This means that the index key can contain only unique combinations of column values.

(While this is not relevant to the discussion at hand, for completeness I will mention it here: SQL Server 2008 introduced the concept of filtered indexes, which include only the rows that satisfy a predicate.)

The first thing you should do is index consolidation. This involves using the points above to combine indexes that share commonalities.

For example, consider the following two indexes:

CREATE INDEX IX_1 ON [dbo].[t1](C1) INCLUDE(C3, C4);
CREATE INDEX IX_2 ON [dbo].[t1](C1, C2) INCLUDE(C5);

These indexes share the leading key column, C1. Included columns can be specified in any order, so these two indexes could be combined as follows:

CREATE INDEX IX_3 ON [dbo].[t1](C1, C2) INCLUDE(C3, C4, C5);
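
To finish the consolidation, the two original indexes would then be dropped. A short sketch, assuming IX_3 has already been created as above and that nothing (such as an index hint) refers to IX_1 or IX_2 by name:

```sql
-- IX_3 must exist before these run, so queries always have a usable index.
-- Remove the two indexes that IX_3 now covers.
DROP INDEX IX_1 ON [dbo].[t1];
DROP INDEX IX_2 ON [dbo].[t1];
```

Creating the replacement before dropping the originals is one reasonable ordering, not a requirement; it costs extra space temporarily in exchange for never leaving the table without a suitable index.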

Where index keys differ in their composition or other properties, you have to be very careful. Consider these indexes:

CREATE INDEX IX_4 ON [dbo].[t1](C1, C3) INCLUDE(C4);
CREATE UNIQUE INDEX IX_5 ON [dbo].[t1](C1, C4) INCLUDE(C5);

Now the decision is not as easy. You have to determine what to do based on your workload, which queries hit the table, and the selectivity of the data itself.

So to answer the question more directly: if you currently have one or more indexes where the column of interest is the first key column in those indexes, you don't have to add more indexes, because the indexes you have are useful.

If the column is searched frequently and there isn't an index with that column as the first key column, you should create an index with that column as the first key column. (Depending on query requirements, you may want to specify other columns as well, for either the key or the included columns.)

If the column is not searched frequently, you can potentially get away with having it contained in another index (not the first key column): the query may be satisfied by scanning the index that contains the column. This is not as efficient as an index seek (for many reasons), but if this operation doesn't happen too often, and the performance in this case is acceptable, you may be okay.
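
The difference between the two cases can be sketched with the example table from earlier. This assumes an index like IX_3 ON [dbo].[t1](C1, C2) INCLUDE(C3, C4, C5) exists and that C1 and C2 are integer columns; the literal 42 is arbitrary.

```sql
-- C1 is the leading key column, so this predicate can be answered with an index seek.
SELECT C1, C2, C3
FROM [dbo].[t1]
WHERE C1 = 42;

-- C2 is only the second key column. Without a predicate on C1 the index
-- cannot be seeked on C2, so the optimizer has to scan (this index or the table).
SELECT C1, C2, C3
FROM [dbo].[t1]
WHERE C2 = 42;
```

Comparing the actual execution plans of the two statements shows which operator was chosen in each case.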

Remember that creating indexes isn't free -- they take up data space, log space, cache memory, and can potentially slow down INSERT/UPDATE/DELETE activity (having said that, there can be other advantages to creating indexes). It's a balance you have to strike for your environment.
