Sql-server – What happens at the data page level when the record length changes

data-pages, database-internals, sql-server, sql-server-2008

I've been digging around the net at work when I can for a few days now trying to make sense of how a DBMS (SQL Server 2008 R2 and others) handles adding a column to the end of a huge table so quickly.

At a high level you could think: I can just put a pointer at the end that points to the new column. However, at the page level aren't data pages filled with individual records? Wouldn't adding a column mean that every page that was already full would require a split?

Wouldn't even pages that weren't full require a lot of data juggling to add that column to the end of each record, update all the slot arrays, and then cascade all the pointer changes through any existing indexes and/or the IAM and GAM pages?

The only thing I can think of is that all new column data is added to new pages, without the rest of the record, and pointers are added throughout the table tree structure to reference the new column pages. However, this seems like it would ruin spatial locality. If this is it, does the DBMS juggle data behind the scenes even when we don't specifically request a REBUILD?

I'm asking about the bit level of DBMS storage management with pages: how are DBMSs able to add a column (with or without allowing NULL values) to a set of existing records so quickly, even though those records already exist as a set of bits in a data page?

Best Answer

how a DBMS (SQL Server 2008 R2 and others) handles adding a column to the end of a huge table so quickly.

Well, there is a false assumption here: that adding a new column is always done quickly. That is not a true statement.

Now, adding a column that allows NULLs can be done quickly because only the metadata of the table definition gets updated; the NULL isn't physically added to the data pages at that moment. SQL Server can still return the correct NULL to queries since it is logically obvious what the "value" is. When rows are inserted or updated, the records written to the data pages do include the NULL (for fixed-length columns, unless the SPARSE option was used for the new column or Data Compression is enabled on the Clustered Index). But the rows that have not been inserted or updated since the change will not have the NULL physically added until an index REBUILD.
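To make that concrete, here is a minimal sketch against a hypothetical table (dbo.BigTable and the database name are made up for illustration). The ALTER completes almost instantly regardless of row count, and the undocumented DBCC IND / DBCC PAGE commands can be used on a test system to confirm that the pre-existing records were not rewritten:

    -- Hypothetical table, for illustration only.
    CREATE TABLE dbo.BigTable
    (
        Id  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Val VARCHAR(50) NOT NULL
    );
    INSERT INTO dbo.BigTable (Val) VALUES ('row 1'), ('row 2');

    -- Metadata-only: no existing data page is modified, so this is effectively
    -- instantaneous even on a huge table.
    ALTER TABLE dbo.BigTable ADD AddedCol INT NULL;

    -- Optional verification (undocumented commands; test systems only):
    DBCC TRACEON(3604);
    DBCC IND ('MyTestDb', 'dbo.BigTable', 1);   -- list the pages of the clustered index
    -- DBCC PAGE ('MyTestDb', 1, <PagePID>, 3); -- dump one of those pages: the records
    --                                             written before the ALTER have not grown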

However, when adding a column marked as NOT NULL, then prior to SQL Server 2012 (and even on later versions, whenever the new value is not a runtime constant), the actual value was written physically to the data pages at that moment, and that operation could take a looooooooooong time, depending on how many rows and/or how much data was in the table. You can find plenty of questions and articles about trying to overcome this issue, as tables with many GBs of data and/or hundreds of millions of rows could take hours to add a new NOT NULL column.
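As an illustration of the kind of statement that triggers a size-of-data operation, and the batched workaround those questions and articles typically describe, here is a sketch against the same hypothetical dbo.BigTable (column name, default value, and batch size are made up):

    -- The straightforward statement below is a size-of-data operation on SQL Server
    -- 2008 R2: every existing row is rewritten to physically hold the value 0, which
    -- is why it can run for hours on a very large table.
    --
    --   ALTER TABLE dbo.BigTable
    --       ADD StatusCode INT NOT NULL CONSTRAINT DF_BigTable_StatusCode DEFAULT (0);
    --
    -- Common workaround: add the column as NULLable (metadata-only), backfill it in
    -- small batches to keep locking and log growth under control, then make it NOT NULL.
    ALTER TABLE dbo.BigTable ADD StatusCode INT NULL;

    WHILE 1 = 1
    BEGIN
        UPDATE TOP (5000) dbo.BigTable
        SET    StatusCode = 0
        WHERE  StatusCode IS NULL;

        IF @@ROWCOUNT = 0 BREAK;
    END;

    ALTER TABLE dbo.BigTable
        ADD CONSTRAINT DF_BigTable_StatusCode DEFAULT (0) FOR StatusCode;

    -- This still scans the table to validate there are no NULLs, but it does not
    -- have to rewrite every row.
    ALTER TABLE dbo.BigTable ALTER COLUMN StatusCode INT NOT NULL;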

Then came a truly wonderful new feature in SQL Server 2012 (Enterprise Edition only, which also implies Developer Edition) whereby adding a new NOT NULL column with a default value could be an instantaneous, metadata-only operation, just like adding a column marked as NULL. The only caveats are that the datatype not be a LOB (e.g. MAX types, XML, etc.) or a CLR-based type, and that the value be a runtime constant (i.e. mainly literal values). Something like NEWID() would not be instantaneous since it would need a different value for each row. But for values that are runtime constants, SELECT operations can easily get the correct value by just looking at the metadata of the DEFAULT, which gives the logically obvious value.
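Sketched against the same hypothetical dbo.BigTable on SQL Server 2012+ Enterprise/Developer Edition, the contrast looks like this (column and constraint names are made up):

    -- Metadata-only: the literal 'N' is a runtime constant, so it is stored with the
    -- column's DEFAULT metadata and no existing page is rewritten.
    ALTER TABLE dbo.BigTable
        ADD IsArchived CHAR(1) NOT NULL
            CONSTRAINT DF_BigTable_IsArchived DEFAULT ('N');

    -- NOT metadata-only: NEWID() yields a different value per row, so SQL Server must
    -- physically write a GUID into every existing record (a size-of-data operation).
    ALTER TABLE dbo.BigTable
        ADD RowGuid UNIQUEIDENTIFIER NOT NULL
            CONSTRAINT DF_BigTable_RowGuid DEFAULT (NEWID());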

The MSDN page for ALTER TABLE, in the Locks and ALTER TABLE section (under "Adding NOT NULL Columns as an Online Operation"), talks about this behavior.