The myth goes back to before SQL Server 6.5, which added row level locking. And hinted at here by Kalen Delaney.
It was to do with "hot spots" of data page usage and the fact that a whole 2k page (SQL Server 7 and higher use 8k pages) was locked, rather then an inserted row
Edit, Feb 2012
Found authoritative article by Kimberly L. Tripp
"The Clustered Index Debate Continues..."
Hotspots were something that we greatly tried to avoid PRIOR to SQL Server 7.0 because of page level locking (and this is where the term hot spot became a negative term). In fact, it doesn't have to be a negative term. However, since the storage engine was rearchitected/redesigned (in SQL Server 7.0) and now includes true row level locking, this motivation (to avoid hotspots) is no longer there.
Edit, May 2013
The link in lucky7_2000's answer seems to say that hotspots can exist and they cause issues.
However, the article uses a non-unique clustered index on TranTime. This requires a uniquifier to be added. Which means the index in not strictly monotonically increasing (and too wide). The link in that answer does not contradict this answer or my links
On a personal level, I have woked on databases where I inserted tens of thousands of rows per second into a table that has a bigint IDENTITY column as the clustered PK.
I think in a million records query, you have to avoid things like OUTER JOINS
. I suggest you use UNION ALL
Instead of LEFT JOIN
.
As long as I think CROSS APPLY
is more efficient than sub-query in the select clause I will modify the query written by Conard Frix, which I think is correct.
now: when I started to modify your query I noticed that you have a WHERE clause saying: JoinedTable.WhereColumn IN (1, 3)
. in this case, if the field is null the condition will become false. then why are you using LEFT JOIN while you are filtering null valued rows?
just replace LEFT JOIN
With INNER JOIN
, I guarantee that it will become faster.
about INDEX:
please note that when you have an index on a table, say
table1(a int, b nvarchar)
and your index is :
nonclustered index ix1 on table1(a)
and you want to do something like this:
select a,b from table1
where a < 10
in your index you have not included the column b
so what happens?
if sql-server uses your index, it will have to search in the index, called "Index Seek" and then refer to main table to get column b
, called "Look Up". This procedure might take much longer than scanning the table itself: "Table Scan".
but based on the statistics that sql-server has, in such situations, it might not use your index at all.
so first of all check the Execution Plan
to see if the index is used at all.
if yes or no both, alter your index to include all columns that you are selecting. say like:
nonclustered index ix1 on table1(a) include(b)
in this case Look Up will not be needed, and your query will execute so much faster.
Best Answer
If you post schema and query etc, that would be useful, in addition to the other 3 answers.
Some links anyway, from Simple-Talk. Good stuff.