Is there a SQL Server 2008 equivalent of the USING INDEX clause in Oracle?
No. When you create a primary key or unique constraint in SQL Server, a unique index to support that constraint is created automatically, with the same keys.
Which seems to imply that there is a way of specifying what index should be used for a Primary Key.
No. The documentation is only attempting to explain whether the automatic supporting index will be created as clustered or nonclustered, if you do not specify. It is confusingly worded, I agree.
To clarify, when you add a primary key constraint to an existing table without expressing a preference, the supporting index will be clustered if there is no pre-existing clustered index on the table. The supporting index will be created
as nonclustered if there is already a clustered index
You can specifically request a clustered or nonclustered primary key using: PRIMARY KEY CLUSTERED
or PRIMARY KEY NONCLUSTERED
.
In fairness, the documentation is much clearer on the subject at:
table_constraint (Transact-SQL)
What does Fragmentation Means in a Heap
The fragmentation value in Heap which you get from column avg_fragmentation_in_percent
by querying sys.dm_db_index_physical_stats
DMV states that
Logical fragmentation for indexes, or extent fragmentation for heaps in the IN_ROW_DATA allocation unit.
Further the same BOL says that
This is the percentage of out-of-order extents in the leaf pages of a heap. An out-of-order extent is one for which the extent that contains the current page for a heap is not physically the next extent after the extent that contains the previous page.
So you can see it is not the free space present in pages allocated to Heap but the varying sequence of pages that creates the fragmentation.
This can be demonstrated by small test. Let us create a Heap Table and insert some records in it and then check the fragmentation.
create table dbo.HeapTest
(
Id INT not NULL Default (1),
Col1 char(5000) Not null Default ('Heaps Are Cool')
)
SET NOCOUNT ON
Insert into dbo.Heaptest default values
go 50
select index_type_desc,avg_fragmentation_in_percent,fragment_count,
avg_page_space_used_in_percent,record_count
from sys.dm_db_index_physical_stats(db_id(),object_id('dbo.HeapTest','U'),0,default,'detailed')
So Heap table is created with 50 records in it. Below is what fragmentation looks like after query DMV sys.dm_db_index_physical stats
You can see avg_fragmentation_in_percent
column value is 33 %. Now let us see how are pages arranged. This can be done by using undocumented query %%lockres%%
. The query would be
SELECT %%lockres%%, * FROM dbo.HeapTest;
And below is what output looks like. Attaching only relevant portion of it. The query produced 50 rows since we inserted 50 rows in our dbo.HeapTest table.
What it says is the first page has ID 197
the next page has ID 242
subsequent pages has continuous ID till we reach page ID 264
because after that we get page ID 280
. So this jump in page ID numbers is what actually causing fragmentation.
Now lest rebuild the heap and run The command again to see the fragmentation and how pages are arranged. We get fragmentation like
You can see fragmentation is now 14%
.
Let us see page numbers allocated
We only have one jump rest all pages are allocated page ID serially. Since just one jump fragmentation decreased considerably.
I rebuild the Heap again and now when I checked fragmentation it was completely gone. And page ID allocation is like
Why Fragmentation Increased
Now regarding what could have caused fragmentation to rise we can corroborate it to fact that when pages were getting allocated to the heap they would not be continuous, as you saw above what caused fragmentation value to increase was jump in the PAGE ID's allocated to pages.
At the back of head you should also keep in mind that the word fragmentation for HEAP does not have any meaning, how would you define fragmentation for bunch of un-ordered pages.
Really Worried about Fragmentation
If you really face a scenario where heap table is fragmented and slowing queries it would be better creating a clustered index on table than rebuilding it. The reason is when you rebuild heap all underlying Non Clustered indexes are also rebuilt causing the rebuild process to take much longer time, utilizing lot of resources and bloating transaction log. On a production system one would always try to avoid this. Paul covered this in his Myth Section about heap.
PS: Please don't use undocumented command on production system. This was just for demonstration.
Best Answer
I agree that there is no point in creating an
IDENTITY
column clustered index here but I would just create the composite PK asCLUSTERED
rather thanNONCLUSTERED
.There is no point having 2 copies of the data, One in the heap that never gets used and one in the NCI. In fact it is downright counter productive as shown below.
Your NCI will still get page splits anyway if you are not inserting into the table in order of primary key so you have not avoided the problem. You might also consider a
UNIQUE
constraint or index on the reversed key order as well depending upon what queries you run against that table.In general NCIs can be slightly be more compact than the equivalent NCI as they do not have the status bits B or
NULL_BITMAP
(if no nullable columns) but in this case this is more than outweighed by the necessity to store the 8 byte RID pointer to the row in the heap as can be seen from the below.Script
Results
The record length for the non leaf pages in both the CI and the NCI is 15 bytes (1 byte status bits, 8 bytes for the composite key and 6 for the down page pointer) but for the leaf page NCI the rows take up 17 bytes (1 byte status bits, 8 bytes for the composite key and 8 for the row pointer) compared to 15 for the CI (2 bytes status bits, 1 byte column count offset, 8 bytes data, 2 bytes column count, 1 byte null bitmap). And as well as this less compact index structure you also have all the additional pages for the heap itself on top.