Sql-server – What happens when I change a table column’s data type to increase its data length in SQL Server

sql server

I believe SQL Server would create a new column under the hood and copy the original data to the newly created column. And the original space for the old column cannot be reclaimed before the table is rebuilt.

But I can't come up with a general picture in my head about what exactly is going on.

Say I have a simple table defintion:

CREATE TABLE ExampleTbl
(
 ID INT
 ,Bool BIT
)

And a certain page is completely filled with ExampleTbl's data rows. What happens if I change Bool column data type to int?

Since the page is full, does it mean the the newly created column will have to be stored in some other data page? I originally thought it should be in the same page because SQL Server will always try to use in-row allocation when the data row fits into a page(less then 8000 bytes).

But then it may involve a lot of moving around of the data rows to allocate space so the new column can be inserted into the data page, and some rows will have to be popped out the page and stored in another one, if SQL Server needs to stick to in-row allocation. But this behavior is obviously detrimental to system performance.

So exactly what happens behind the scenes when I change a column data type to one with a wider domain of values?

Thanks!

Best Answer

Changing a column data type to a wider type will require rows to be moved to other pages once a page is full. This behavior is similar to DML updates of variable length columns to a wider value. This can be observed with this script that uses the undocumented sys.dm_db_database_page_allocations TVF available in SQL Server 2012 and later:

CREATE TABLE ExampleTbl
(
  ID INT
 ,Bool BIT
);
WITH 
    t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
    ,t1000 AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
INSERT INTO ExampleTbl WITH(TABLOCKX)
SELECT num, 1
FROM t1000
WHERE num <= 578;

--single page
DECLARE 
      @database_id int = DB_ID()
    , @object_id int = OBJECT_ID(N'dbo.ExampleTbl')
    , @allocated_page_file_id int
    , @allocated_page_page_id int;
SELECT *
FROM sys.dm_db_database_page_allocations(@database_id, @object_id, 0, 1, 'DETAILED')
WHERE 
    page_type_desc = 'DATA_PAGE';
GO

ALTER TABLE dbo.ExampleTbl
    ALTER COLUMN Bool INT;
GO

--3 pages, with 2 100% full and last 50% full
DECLARE 
      @database_id int = DB_ID()
    , @object_id int = OBJECT_ID(N'dbo.ExampleTbl')
    , @allocated_page_file_id int
    , @allocated_page_page_id int;
SELECT *
FROM sys.dm_db_database_page_allocations(@database_id, @object_id, 0, 1, 'DETAILED')
WHERE 
    page_type_desc = 'DATA_PAGE';
GO

Heaps are particularly nasty when DML/DDL increases row length because the original row is retained with a forwarding pointer. Consequently, more space will be required that the equivalent change against a table with clustered index.