You should only need to drop and recreate the affected NC index.
That said, on a test server you can compare this approach against your strategy of dropping all the indexes.
I reckon dropping and recreating the single index would be quicker overall, because otherwise the data will be shifted around twice: once to convert char to varchar, and again to rebuild the clustered index. On top of that you have the NC index creation overhead.
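As a rough sketch of that single-index approach (the table, column, and index names below are placeholders, not from your schema):

```sql
-- Placeholder names; adjust to your actual table, column, and index.
DROP INDEX IX_BigTable_SomeCol ON dbo.BigTable;

ALTER TABLE dbo.BigTable
    ALTER COLUMN SomeCol varchar(50) NOT NULL;  -- the char -> varchar change

CREATE NONCLUSTERED INDEX IX_BigTable_SomeCol
    ON dbo.BigTable (SomeCol);
```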
You will want to load your data into a new table, doing this in small batches, then drop the existing table. I put together a quick example using the Sales.Customer table in AdventureWorks, something similar should work for you also.
First, create your new table, complete with the new datatype you want to use:
CREATE TABLE [Sales].[Currency_New](
[CurrencyCode] [nchar](4) NOT NULL,
[Name] [varchar](128) NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_Currency_New_CurrencyCode] PRIMARY KEY CLUSTERED
(
[CurrencyCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
Then, insert your records and define your batch size. I am using 10 here, but you will likely want something larger, say 10,000 rows at a time. For 30 million rows I'd even suggest a batch size of 100,000 rows; that's the upper limit I typically use with larger tables:
DECLARE @RowsInserted INT, @InsertVolume INT
SET @RowsInserted = 1
SET @InsertVolume = 10 --Set to the number of rows per batch

WHILE @RowsInserted > 0
BEGIN
    INSERT INTO [Sales].[Currency_New] ([CurrencyCode]
        ,[Name]
        ,[ModifiedDate])
    SELECT TOP (@InsertVolume)
        SC.[CurrencyCode]
        ,SC.[Name]
        ,SC.[ModifiedDate]
    FROM [Sales].[Currency] AS SC
    LEFT JOIN [Sales].[Currency_New] AS SCN
        ON SC.[CurrencyCode] = SCN.[CurrencyCode]
    WHERE SCN.[CurrencyCode] IS NULL --only rows not yet copied

    SET @RowsInserted = @@ROWCOUNT --loop ends when no rows remain
END
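Depending on your recovery model, you may also want to give the transaction log a chance to clear between batches. A hedged addition at the end of each loop iteration (values are illustrative):

```sql
-- Optional additions at the end of each loop iteration:
CHECKPOINT;               -- in SIMPLE recovery, lets log space be reused
WAITFOR DELAY '00:00:01'; -- brief pause to reduce blocking of other sessions
```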
I usually do a sanity check and verify the rowcounts are the same before cleaning up:
SELECT COUNT(*) FROM [Sales].[Currency]
SELECT COUNT(*) FROM [Sales].[Currency_New]
Once you are confident you have migrated your data, you can drop the original table:
DROP TABLE [Sales].[Currency]
Last step, rename the new table so that users don't have to change any code. Note that sp_rename's second argument must be the bare new name, not schema-qualified; otherwise the schema and brackets become part of the table name:
EXEC sp_rename 'Sales.Currency_New', 'Currency';
GO
I don't know how long this will take. I'd suggest you try doing this when you have a clear maintenance window and users aren't connected.
HTH
Best Answer
There are certain circumstances where dropping a column can be a metadata-only operation. The column definitions for a table are not stored in each page where rows are stored; they exist only in the database metadata (sys.sysrowsets, sys.sysrscols, etc.).
When dropping a column that is not referenced by any other object, the storage engine simply marks the column definition as no longer present by deleting the pertinent details from various system tables. The action of deleting the meta-data invalidates the procedure cache, necessitating a recompile whenever a query subsequently references that table. Since the recompile only returns columns that currently exist in the table, the column details for the dropped column are never even asked for; the storage engine skips the bytes stored in each page for that column, as if the column no longer exists.
When a subsequent DML operation occurs against the table, the pages that are affected are re-written without the data for the dropped column. If you rebuild a clustered index or a heap, all the bytes for the dropped column are naturally not written back to the page on disk. This effectively spreads the load of dropping the column over time, making it less noticeable.
There are circumstances where you cannot drop a column, such as when the column is included in an index, or when you've manually created a statistics object for the column. I wrote a blog post showing the error that is presented when attempting to alter a column with a manually created statistics object. The same semantics apply when dropping a column - if the column is referenced by any other object, it cannot simply be dropped. The referencing object must be altered first, then the column can be dropped.
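To illustrate the indexed-column case (the table and index names here are made up for the demo, not from the original post):

```sql
-- Illustrative names only.
CREATE TABLE dbo.IndexedColumnDemo (id int NOT NULL, val int NOT NULL);
CREATE NONCLUSTERED INDEX IX_IndexedColumnDemo_val ON dbo.IndexedColumnDemo (val);

-- This fails with an error stating the index is dependent on the column:
--ALTER TABLE dbo.IndexedColumnDemo DROP COLUMN val;

-- Drop the referencing object first, then the column can be dropped:
DROP INDEX IX_IndexedColumnDemo_val ON dbo.IndexedColumnDemo;
ALTER TABLE dbo.IndexedColumnDemo DROP COLUMN val;
```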
This is fairly easy to show by looking at the contents of the transaction log after dropping a column. The code below creates a table with a single char(8000) column. It adds a row, then drops the column, and displays the contents of the transaction log applicable to the drop operation. The log records show modifications to various system tables where the table and column definitions are stored. If the column data were actually being deleted from the pages allocated to the table, you'd see log records recording the actual page data; there are no such records.
(The output is too big to show here, and dbfiddle.uk won't allow me to access fn_dblog)
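Since the demo code itself isn't reproduced here, a minimal sketch along the lines described might look like this (the table and column names are assumptions inferred from the surrounding text - dbo.DropColumnTest, rid, and REPLICATE('Z', 8000)):

```sql
-- Hedged reconstruction of the demo described above; names are assumptions.
CREATE TABLE dbo.DropColumnTest
(
    rid     int        NOT NULL,
    big_col char(8000) NOT NULL
);
INSERT INTO dbo.DropColumnTest (rid, big_col)
VALUES (1, REPLICATE('Z', 8000));

-- First set of output: log records generated by the DDL
ALTER TABLE dbo.DropColumnTest DROP COLUMN big_col;
SELECT [Current LSN], Operation, AllocUnitName, [Log Record Length]
FROM fn_dblog(NULL, NULL);

-- Second set of output: log records after a DML touch of the row
UPDATE dbo.DropColumnTest SET rid = rid + 1;
SELECT [Current LSN], Operation, AllocUnitName, [Log Record Length]
FROM fn_dblog(NULL, NULL);
```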
The first set of output shows the log generated by the DDL statement dropping the column. The second set of output shows the log after running the DML statement where we update the rid column. In the second result set, we see log records indicating a delete against dbo.DropColumnTest, followed by an insert into dbo.DropColumnTest. Each Log Record Length is 8116, indicating the actual page was updated.

As you can see from the output of the fn_dblog command in the test above, the entire operation is fully logged. This goes for simple recovery as well as full recovery. The terminology "fully logged" may be misinterpreted to mean the data modification is not logged. That is not what happens - the modification is logged, and can be fully rolled back. The log simply records only the pages that were touched, and since none of the table's data pages were logged by the DDL operation, both the DROP COLUMN and any rollback that might occur will happen extremely quickly, regardless of the size of the table.

For science, the following code will dump the data pages for the table included in the code above, using DBCC PAGE, style "3". Style "3" indicates we want the page header plus detailed per-row interpretation. The code uses a cursor to display the details for every page in the table, so you may want to make sure you don't run this on a large table.

Looking at the output for the first page from my demo (after the column is dropped, but before the column is updated), I see this:
I've removed most of the raw page dump from the output shown above for brevity. At the end of the output, you'll see this for the rid column:

The last line above, rid = 1, returns the name of the column, and the current value stored in the column on the page.

Next, you'll see this:
The output shows that Slot 0 contains a deleted column, by virtue of the DELETED text where the column name would normally be. The value of the column is returned as NULL since the column has been deleted. However, as you can see in the raw data, the 8,000-character value, REPLICATE('Z', 8000), for that column still exists on the page. This is a sample of that part of the DBCC PAGE output:
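For completeness, the page-dump cursor mentioned earlier might be sketched like this (a sketch, not the original code; it assumes SQL Server 2012+ for sys.dm_db_database_page_allocations, an undocumented DMF):

```sql
-- Dump every data page of the demo table with per-row detail (style 3).
DBCC TRACEON(3604);  -- send DBCC PAGE output to the client

DECLARE @FileId int, @PageId int;
DECLARE PageCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT allocated_page_file_id, allocated_page_page_id
    FROM sys.dm_db_database_page_allocations(
        DB_ID(), OBJECT_ID('dbo.DropColumnTest'), NULL, NULL, 'DETAILED')
    WHERE is_allocated = 1 AND page_type = 1;  -- data pages only

OPEN PageCursor;
FETCH NEXT FROM PageCursor INTO @FileId, @PageId;
WHILE @@FETCH_STATUS = 0
BEGIN
    DBCC PAGE(0, @FileId, @PageId, 3);  -- 0 = current database
    FETCH NEXT FROM PageCursor INTO @FileId, @PageId;
END
CLOSE PageCursor;
DEALLOCATE PageCursor;
```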