I used the following setup to run some tests against.
CREATE TABLE GeographyPoint (
ID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
GeoPoint GEOGRAPHY NOT NULL,
LegendTypeID INTEGER NOT NULL
);
INSERT INTO GeographyPoint (GeoPoint, LegendTypeID)
SELECT TOP 1000000
Geography::Point(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,4326),
CAST(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 25 AS INTEGER)
FROM Tally;
CREATE INDEX GP_IDX1 ON GeographyPoint(LegendTypeID) INCLUDE (ID, GeoPoint);
CREATE SPATIAL INDEX GP_SIDX ON GeographyPoint(GeoPoint) USING GEOGRAPHY_AUTO_GRID;
This gives a table of 1,000,000 random points with a 2 x 2 degree spread.
After trying a few different options on it, the best performance I could get was forcing it to use the spatial index. There was a couple of ways to achieve this. Dropping the index on LegendTypeID or using a hint.
You will need to decide which is best for your situation. Personally I don't like using index hints and would drop the other index if it is not required for other queries.
The queries stacked up against each other
DECLARE @point geography;
SET @point = geography::Point(1,1,4326);
/*
Clustered index scan (PK)
SQL Server Execution Times:
CPU time = 641 ms, elapsed time = 809 ms
*/
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
OPTION(MAXDOP 1)
/*
Index Seek NonClustered (GP_IDX1)
SQL Server Execution Times:
CPU time = 2250 ms, elapsed time = 2806 ms
*/
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
/*
For the next 2 queries
Clustered Index Seek (Spatial)
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 11 ms
*/
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint] WITH(INDEX(GP_SIDX))
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
DROP INDEX GP_IDX1 ON [GeographyPoint]
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
Summarizing some of the main points from our chat room discussion:
Generally speaking, SQL Server caches a single plan for each statement. That plan must be valid for all possible future parameter values.
It is not possible to cache a seek plan for your query, because that plan would not be valid if, for example, @productid is null.
In some future release, SQL Server might support a single plan that dynamically chooses between a scan and a seek, depending on runtime parameter values, but that is not something we have today.
General problem class
Your query is an example of a pattern variously referred to as a "catch all" or "dynamic search" query. There are various solutions, each with their own advantages and disadvantages. In modern versions of SQL Server (2008+), the main options are:
IF
blocks
OPTION (RECOMPILE)
- Dynamic SQL using
sp_executesql
The most comprehensive work on the topic is probably by Erland Sommarskog, which is included in the references at the end of this answer. There is no getting away from the complexities involved, so it is necessary to invest some time in trying each option out to understand the trade-offs in each case.
IF
blocks
To illustrate an IF
block solution for the specific case in the question:
IF @productid IS NOT NULL AND @priceid IS NOT NULL
BEGIN
SELECT
T.productID,
T.priceID
FROM dbo.Transactions AS T
WHERE
T.productID = @productid
AND T.priceID = @priceid;
END;
ELSE IF @productid IS NOT NULL
BEGIN
SELECT
T.productID,
T.priceID
FROM dbo.Transactions AS T
WHERE
T.productID = @productid;
END;
ELSE IF @priceid IS NOT NULL
BEGIN
SELECT
T.productID,
T.priceID
FROM dbo.Transactions AS T
WHERE
T.priceID = @priceid;
END;
ELSE
BEGIN
SELECT
T.productID,
T.priceID
FROM dbo.Transactions AS T;
END;
This contains a separate statement for the four possible null-or-not-null cases for each of the two parameters (or local variables), so there are four plans.
There is a potential problem there with parameter sniffing, which might require an OPTIMIZE FOR
hint on each query. Please see the references section to explore these types of subtleties.
Recompile
As noted above an in the question, you could also add an OPTION (RECOMPILE)
hint to get a fresh plan (seek or scan) on each invocation. Given the relatively slow frequency of calls in your case (once every ten seconds on average, with a sub-millisecond compilation time) it seems likely this option will be suitable for you:
SELECT
T.productID,
T.priceID
FROM dbo.Transactions AS T
WHERE
(T.productID = @productid OR @productid IS NULL)
AND (T.priceID = @priceid OR @priceid IS NULL)
OPTION (RECOMPILE);
It is also possible to combine features from the above options in creative ways, to make the most of the advantages of each method, while minimizing the downsides. There really is no shortcut to understanding this stuff in detail, then making an informed choice backed by realistic testing.
Further reading
Best Answer
Yes, that is safe, meaning that it cannot damage your database and that the index will be correct once the statement succeeds.
Creating an index uses I/O and CPU resources, that is unavoidable.
If
CREATE INDEX CONCURRENTLY
fails, it will leave behind an invalid index that you should drop, since it uses space.CREATE INDEX CONCURRENTLY
will require anACCESS EXCLUSIVE
lock on the table, but differently from a normalCREATE INDEX
which locks the table for the whole duration of the operation, it takes only a very short lock at the very end. This is normally not a problem, but it can be a problem if you have long running transactions that use the table:CREATE INDEX CONCURRENTLY
will have to queue behind these transactions, and all later statements will have to queue behind it.So make sure you have no long running transactions.