If a row is being added at the end of the index, SQL Server will simply allocate a new page for it rather than split the current last page. Experimental evidence for this is below (it uses the %%physloc%% virtual column, which requires SQL Server 2008). See also the discussion here.
CREATE TABLE T
(
id int identity(1,1) PRIMARY KEY,
filler char(1000)
)
GO
INSERT INTO T
DEFAULT VALUES
GO 7 /*Seven ~1KB rows fill the first 8KB page*/
SELECT sys.fn_PhysLocFormatter(%%physloc%%)
FROM T
GO
INSERT INTO T /*The eighth row goes to a newly allocated page*/
DEFAULT VALUES
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%)
FROM T
GO
DROP TABLE T
Returns (your results will vary):
(1:173:0) /*File:Page:Slot*/
(1:173:1)
(1:173:2)
(1:173:3)
(1:173:4)
(1:173:5)
(1:173:6)
(1:110:0) /*Final insert is on a new page*/
This only appears to apply to the leaf level, though. This can be seen by running the script below and adjusting the TOP
value. For me, 622/623
was the cut-off point between requiring one and two first-level (intermediate) pages (this might vary if you have snapshot isolation enabled). At this level it does split the page in a balanced manner, leading to wasted space.
USE tempdb;
CREATE TABLE T2
(
id int identity(1,1) PRIMARY KEY CLUSTERED,
filler char(8000)
)
INSERT INTO T2(filler)
SELECT TOP 622 'A' /*One leaf page per row; adjust to find the intermediate-page cut-off*/
FROM master..spt_values v1, master..spt_values v2
DECLARE @index_info TABLE
(PageFID VARCHAR(10),
PagePID VARCHAR(10),
IAMFID tinyint,
IAMPID int,
ObjectID int,
IndexID tinyint,
PartitionNumber tinyint,
PartitionID bigint,
iam_chain_type varchar(30),
PageType tinyint,
IndexLevel tinyint,
NextPageFID tinyint,
NextPagePID int,
PrevPageFID tinyint,
PrevPagePID int,
PRIMARY KEY (PageFID, PagePID));
INSERT INTO @index_info
EXEC ('DBCC IND ( tempdb, T2, -1)' );
DECLARE @DynSQL nvarchar(max) = 'DBCC TRACEON (3604);'
SELECT @DynSQL = @DynSQL + '
DBCC PAGE(tempdb, ' + PageFID + ', ' + PagePID + ', 3); '
FROM @index_info
WHERE IndexLevel = 1
SET @DynSQL = @DynSQL + '
DBCC TRACEOFF(3604); '
EXEC(@DynSQL)
DROP TABLE T2
If you really modify this data only rarely, then you can simply store the result of the CTE in a table and run your queries against that table. You can define indexes based on your typical queries. Then TRUNCATE
and repopulate (and ANALYZE
) as necessary.
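A minimal sketch of that approach, assuming a hypothetical nodes(id, parent_id) table (all names here are illustrative, not taken from the question):

```sql
-- Materialize the recursive CTE result once
CREATE TABLE tree_cache AS
WITH RECURSIVE tree AS (
    SELECT id, parent_id, 1 AS depth
    FROM nodes
    WHERE parent_id IS NULL              -- root nodes
    UNION ALL
    SELECT n.id, n.parent_id, t.depth + 1
    FROM nodes n
    JOIN tree t ON n.parent_id = t.id
)
SELECT * FROM tree;

-- Index to match your typical queries
CREATE INDEX ON tree_cache (parent_id);

-- Refresh cycle, run whenever the source data changes:
TRUNCATE tree_cache;
INSERT INTO tree_cache SELECT ...;       -- re-run the same recursive query
ANALYZE tree_cache;
```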
On the other hand, if you can put the CTE in separate stored procedures rather than a view, you can easily put your conditions in the CTE part rather than the final SELECT
(which is basically what you do when querying against tree_view_1
), so that far fewer rows are involved in the recursion. From the query plan it looks like PostgreSQL estimates row counts based on some far-from-true assumptions, probably producing suboptimal plans; this effect can be reduced somewhat with the stored-procedure solution.
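A hedged sketch of the stored-procedure variant (the function name, parameter, and nodes table are assumptions for illustration): by pushing the condition into the non-recursive term, the recursion only ever visits the relevant subtree.

```sql
CREATE FUNCTION subtree(start_id int)
RETURNS TABLE (id int, parent_id int) AS
$$
    WITH RECURSIVE tree AS (
        SELECT n.id, n.parent_id
        FROM nodes n
        WHERE n.id = start_id            -- condition inside the CTE, not the final SELECT
        UNION ALL
        SELECT n.id, n.parent_id
        FROM nodes n
        JOIN tree t ON n.parent_id = t.id
    )
    SELECT * FROM tree
$$ LANGUAGE sql STABLE;

SELECT * FROM subtree(42);
```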
EDIT: I may be missing something, but I just noticed that you don't filter the rows in the non-recursive term. Possibly you want to include only root nodes there (WHERE parent_id IS NULL
); I'd expect far fewer rows and recursions that way.
EDIT 2: As it slowly became clear to me from the comments, I had misread the recursion in the original question as going the other way. Here I mean starting from the root nodes and going deeper in the recursion.
Best Answer
In the original description of a B-tree the internal nodes held not just the index keys but all columns of the table. In the B+-tree only the leaf nodes store all columns' values. As each node in the tree is a fixed size, removing non-key data from internal nodes frees space for more keys to be held.
All operators can be made to work with the B-tree/B+-tree structure. Some are more efficient than others, so in practice the query optimizer may choose not to use the index if one of the inefficient operators is involved.
The best case is equality (=). Here the matching key intervals can be followed from root to leaf and the corresponding row(s) returned. It may even be possible to determine from the root node alone that the query will return an empty set so tree walking can stop at that point.
Range queries (<=, >=) can be efficient. The tree can be walked to a leaf for the given predicate value. If the leaf pages form a doubly-linked list, the remainder of the range can be found by following these leaf-to-leaf links.
Wildcard searches can use the tree, depending on where the wildcard character is. If it is in the middle of the predicate or at the end then the search can proceed as for a range query, with additional checking of each value read. If the predicate starts with the wildcard the tree cannot be used because a B+Tree is an ordered set and a leading wildcard gives no indication of where in that ordering to start.
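To illustrate (table, column, and index names here are hypothetical), all of these predicates except the last can be served by seeks on an index over name:

```sql
-- Assuming: CREATE INDEX ix_people_name ON people (name);

SELECT * FROM people WHERE name = 'Ann';      -- equality: single root-to-leaf seek
SELECT * FROM people WHERE name >= 'Ann'
                       AND name <  'Bob';     -- range: seek, then leaf-to-leaf scan
SELECT * FROM people WHERE name LIKE 'An%';   -- trailing wildcard: behaves like a range seek
SELECT * FROM people WHERE name LIKE '%nn';   -- leading wildcard: no start point, full scan
```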
An inequality operator (!=) can be treated as two range conditions:
(value < predicate) OR (value > predicate)
. If the distribution of values in the data is very skewed, it may still be efficient to use the tree. In typical cases, however, it is more likely the optimizer will scan the leaf nodes from start to end.
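The rewrite reads like this in SQL (names are illustrative):

```sql
-- The optimizer can treat this...
SELECT * FROM people WHERE name <> 'Ann';

-- ...as the union of two open-ended ranges, although with a typical
-- value distribution it will usually just scan the whole leaf level:
SELECT * FROM people WHERE name < 'Ann'
UNION ALL
SELECT * FROM people WHERE name > 'Ann';
```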