Based on these questions and the answers given:
SQL 2008 Server – performance loss possibly connected with a very large table
I have a table in a database SupervisionP defined like this:
CREATE TABLE [dbo].[PenData](
[IDUkazatel] [smallint] NOT NULL,
[Cas] [datetime2](0) NOT NULL,
[Hodnota] [real] NULL,
[HodnotaMax] [real] NULL,
[HodnotaMin] [real] NULL,
CONSTRAINT [PK_Data] PRIMARY KEY CLUSTERED
(
[IDUkazatel] ASC,
[Cas] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
ALTER TABLE [dbo].[PenData] WITH NOCHECK ADD CONSTRAINT [FK_Data_Ukazatel] FOREIGN KEY([IDUkazatel])
REFERENCES [dbo].[Ukazatel] ([IDUkazatel])
ALTER TABLE [dbo].[PenData] CHECK CONSTRAINT [FK_Data_Ukazatel]
It contains cca 211 milion rows.
I run following statement:
DECLARE @t1 DATETIME;
DECLARE @t2 DATETIME;
SET @t1 = GETDATE();
SELECT min(cas) from PenData p WHERE IDUkazatel=24
SELECT min(cas) from PenData p WHERE IDUkazatel=25
SET @t2 = GETDATE();
SELECT DATEDIFF(millisecond,@t1,@t2) AS elapsed_ms;
SET @t1 = GETDATE();
SELECT min(cas) from PenData p WHERE IDUkazatel=24 OR IDUkazatel=25
SET @t2 = GETDATE();
SELECT DATEDIFF(millisecond,@t1,@t2) AS elapsed_ms;
The result is shown here:
The third SELECT also loads much more data into SQL Server memory cache.
Why is the third SELECT so much slower (8.5 s) then the first two SELECTs(16 ms)?
How can I improve performance of the third select with OR? I want to run following SQL command but it seems to me that creating cursor and running separate queries is much faster than a single select in this case.
SELECT MIN(cas) from PenData p WHERE IDUkazatel IN (SELECT IDUkazatel FROM ...)
EDIT
As David suggested I have hovered over the fat arrow:
Best Answer
For the first two queries all it has to do is scan in the clustered index to the first entry for that value of
IDUkazatel
- because of the order of the index that row will be the lowest value for cas for that value ofIDUkazatel
.In the second query this optimisation is not value and it is probably seeking to the first row for
IDUkazatel=24
then scanning down the index until the last row withIDUkazatel=25
to find the minimum value ofcas
over all those rows.If you hover over that fat arrow you'll see it is reading many rows (certainly all those for 24, probably all those for 25 too), whereas the thin arrows in the plan output for the other two show the
top
action causing it to only consider one row.You could try run each query and then get the minimum for the minimums found:
That said, it seems you have a table with
IDUkazatel
values rather than an explicitOR
clause. The code below will work with that arrangement, simply replace the table name@T
with the name of the table containingIDUkazatel
values:In an ideal world, the SQL Server query optimizer would perform this rewrite for you, but it does not always consider this option today.