is there any effective way to tune the query below?
I have a filtered index on OrganisationID <> 0
Primary key is on another field, not relevant for this query ItemCode varchar(20).
I have tagged sql server 2005 because it needs to work there, but if there is anything else in any other sql server version, please bring it up.
--------------------------------------
-- START CLEAR
--------------------------------------
begin try
drop table #T
end try
begin catch
end catch
--------------------------------------
-- THE TABLE
--------------------------------------
CREATE TABLE #T
(
a1 int,
a2 int,
ID INT,
OrganisationID INT,
Distance INT,
constraint PKt1 PRIMARY KEY CLUSTERED (A1,A2,ID)
)
--------------------------------------
-- ADDING SOME DATA
--------------------------------------
INSERT INTO #T
SELECT 1,1,0,10,100
UNION ALL
SELECT 1,1,1,10,200
UNION ALL
SELECT 1,1,3,10,50
UNION ALL
SELECT 1,1,4,20,80
UNION ALL
SELECT 1,2,5,20,300
UNION ALL
SELECT 1,2,6,0,100
UNION ALL
SELECT 1,3,7,0,100
UNION ALL
SELECT 1,3,4,10,100
GO
--------------------------------------
-- INDEX CREATION
--------------------------------------
create index idx_T_OrganisationID
on #T (id)
WHERE OrganisationID <> 0
GO
--------------------------------------
-- THE QUERY
--------------------------------------
--SET STATISTICS IO ON
--SET STATISTICS TIME ON
SELECT
PID.ID,
PID.OrganisationID
FROM #t AS PID
WHERE id IN (
SELECT MIN(id) FROM dbo.#t
WHERE OrganisationID <> 0
GROUP BY OrganisationID
)
After Comprehensive Testing – The Results
out of these 3 queries:
-- query 1
SELECT
PID.ID,
PID.OrganisationID
FROM #t AS PID
WHERE id IN (
SELECT MIN(id) FROM dbo.#t
WHERE OrganisationID <> 0
GROUP BY OrganisationID
)
-- query 2
SELECT rn.ID, --... other columns go here
rn.OrganisationID
FROM (
SELECT *, n = ROW_NUMBER() OVER(PARTITION BY OrganisationID ORDER BY id)
FROM #t
) rn WHERE n= 1
AND OrganisationID <> 0
-- query 3
SELECT OrganisationID, MIN(ID)
FROM #t T
WHERE OrganisationID <> 0
GROUP BY OrganisationID ;
the first one does not even bring the most correct results,
the query 3 (as suggested in the comments by spaghettidba and ypercubeᵀᴹ ) is the one with best performance in my live environment, with my real table and data, as you can see below, the query I had originally and the one based on query 3:
for this exercise in particular, using the table #T query 2 and query 3 perfom more of less equally as per the query plan below (picture):
Best Answer
This query usually perform better: