Sql-server – How to speed up query on table with millions of rows

clustered-indexindexperformancequery-performancesql server

The Issue:

I'm working on a big table that consists about 37mln rows. Data include measurements of many devices made in certain time e.g. '2013-09-24 10:45:50'. Each day all of those devices are sending many measurements in different intervals on different times. I want to make a query which selects all the most actual ( 'actual' I mean the latest from all measurements made in each day) measurement of each day for 2 months e.g from 2013-01-01 to 2013-02-01.

The problem is that this query takes so much time to go, despite all of the indexes i've made on different columns. I've also created auxiliary table that contains max(MeterDate) and MeasurementsId when the measurement was given. I've noticed that index can't be made on MeterDate because it contains date and time which is not useful for making an index on it. So i converted the MeterDate -> CONVERT(date, MeterDate). I though that after joining The auxiliary table with [dbo].[Measurements] the query would be faster but still query takes more than 12s which is too long for me.

The structure of table:

Create table [dbo].[Measurements]

[Id] [int] IDENTITY(1,1) NOT NULL,
[ReadType_Id] [int] NOT NULL,
[Device_Id] [int] NULL,
[DeviceInterface] [tinyint] NULL,
[MeterDate] [datetime] NULL,
[MeasureValue] [decimal](18, 3) NULL

Every row of Measurements table include measurement value on direct MeterDate e.g. "2008-04-04 13:28:44.473"

Direct select structure:

DECLARE @startdate datetime= '2013-07-01'; 
DECLARE @enddate datetime = '2013-08-01';

SELECT *
FROM [dbo].[Measurements] 
WHERE [MeterDate] BETWEEN @startdate and @enddate 

Does anyone knows how to rebuilt table or add new or add indexes on which column that speed up query a bit ? Thanks in advance for any info.

Edit:

The table that I used was created by this query

with t1 as
(
    Select  [Device_Id], [DeviceInterface],  CONVERT(date,  MeterDate) as OnlyDate, Max(MeterDate) as MaxMeterDate
    FROM [dbo].[Measurements] 
    GROUP BY [Device_Id], [DeviceInterface], CONVERT(date,  MeterDate)
)
Select t1.[Device_Id], t1.[DeviceInterface],t1.[OnlyDate], r.Id  
INTO [dbo].[MaxDatesMeasurements]
FROM t1
JOIN [dbo].[Measurements] as r ON r.Device_Id = t1.Device_Id AND r.DeviceInterface = t1.DeviceInterface AND r.MeterDate = t1.MaxMeterDate

Then I wanted to join the new created table [dbo].[MaxDatesMeasurements] with old [dbo].[Measurements] and select direct rows

DECLARE @startdate datetime= '2013-07-01'; 
DECLARE @enddate datetime = '2013-08-01'; 


Select *
From [dbo].[MaxDatesMeasurements] as t1 
Join [dbo].[Measurements] as t2 on t1.[Id] = t2.[Id] 
WHERE t1.[OnlyDate] BETWEEN @startdate AND @enddate

Best Answer

On reason this can happen is that you're using local variables.

The problem is that this query takes so much time to go, despite all of the indexes i've made on different columns.

Here's an example using a similar setup. In the Stack Overflow schema there's a narrow-ish table called Votes that looks like this.

NUTS

With no index on CreationDate, our only option would be to scan the Clustered Index. But if we create one only on CreationDate, the optimizer can choose to use that index if it thinks doing a Key Lookup for the rest of the columns is cheaper than scanning the Clustered Index and applying a predicate.

CREATE INDEX ix_yourmom ON dbo.Votes(CreationDate)

For this query:

DECLARE @StartDate DATETIME = '2010-07-01';
DECLARE @EndDate DATETIME = '2010-07-02';

SELECT *
FROM   dbo.Votes AS v
WHERE  v.CreationDate BETWEEN @StartDate AND @EndDate;
GO

NUTS

The cardinality estimate for unknown variables using between is 16.4317%. That leads to a clustered index scan and a missing index request for an index that covers the entire query.

If you run the query with RECOMPILE, you allow for the parameter embedding optimization.

DECLARE @StartDate DATETIME = '2010-07-01';
DECLARE @EndDate DATETIME = '2010-07-02';

SELECT *
FROM   dbo.Votes AS v
WHERE  v.CreationDate BETWEEN @StartDate AND @EndDate
OPTION ( RECOMPILE );

Which gives us a different query plan, and a more accurate estimate.

NUTS

Hope this helps!