SQL Server – Fast ALTER COLUMN NVARCHAR(4000) to NVARCHAR(260)

Tags: alter-table, sql-server, sql-server-2017

I have a performance problem with very large memory grants when handling this table, which has a couple of NVARCHAR(4000) columns. The thing is, the data in these columns is never longer than 260 characters.

Using

ALTER TABLE [table] ALTER COLUMN [col] NVARCHAR(260) NULL

results in SQL Server rewriting the entire table (and using 2x the table size in log space). With billions of rows, rewriting everything only to change nothing isn't an option. Increasing the column width doesn't have this problem, but decreasing it does.

I have tried first creating a constraint CHECK (DATALENGTH([col]) <= 520) or CHECK (LEN([col]) <= 260), but SQL Server still decides to rewrite the entire table.
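For reference, the attempt looked along these lines (the constraint name is mine; 520 bytes corresponds to 260 NVARCHAR characters):

ALTER TABLE [table] WITH CHECK
ADD CONSTRAINT [CK_col_max_len] CHECK (DATALENGTH([col]) <= 520);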

Is there any way to alter the column data type as a metadata-only operation, without the expense of rewriting the entire table? I'm using SQL Server 2017 (14.0.2027.2 and 14.0.3192.2).

Here is a sample DDL table to use to reproduce:

CREATE TABLE [table](
    id INT IDENTITY(1,1) NOT NULL,
    [col] NVARCHAR(4000) NULL,
    CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED (id ASC)
);

And then run the ALTER:
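ALTER TABLE [table] ALTER COLUMN [col] NVARCHAR(260) NULL;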

Best Answer

Is there any way to alter the column data type as a metadata-only operation?

I don't think so, this is how the product works right now. There are some really great workarounds to this limitation proposed in Joe's answer.

...results in SQL Server rewriting the entire table (and using 2x table size in log space)

I'm going to respond to the two parts of that statement separately.

Rewriting the Table

As I mentioned before, there's not really any way to avoid this. That seems to be the reality of the situation, even if it doesn't make complete sense from our perspective as customers.

Looking at DBCC PAGE before and after changing the column from 4000 to 260 shows that all of the data is duplicated on the data page (my test table had 'A' 260 times in the row):

[Screenshot: data portion of the DBCC PAGE output, before and after the ALTER]
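If you want to reproduce the page inspection, the sequence is roughly this (a sketch; DBCC PAGE is undocumented, and the page the query picks up will vary):

DBCC TRACEON(3604);  -- route DBCC PAGE output to the client session

DECLARE @file int, @page int;

-- grab any data page of the clustered index (DETAILED mode populates page_type_desc)
SELECT TOP (1)
    @file = allocated_page_file_id,
    @page = allocated_page_page_id
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'dbo.[table]'), 1, NULL, 'DETAILED')
WHERE page_type_desc = 'DATA_PAGE';

-- print option 3 = dump each row with per-column interpretation
DBCC PAGE(N'248749', @file, @page, 3);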

At this point, there are two copies of the exact same data on the page. The "old" column is essentially deleted (its column id changes from 2 to 67108865), and the "new" version of the column is updated to point to the new offset of the data on the page:

[Screenshot: column metadata portion of the DBCC PAGE output, before and after the ALTER]
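The same thing is visible without a page dump by querying the undocumented sys.system_internals_partition_columns view (a sketch; after the ALTER, the dropped original shows up with is_dropped = 1 and no matching sys.columns row):

SELECT c.name,
       ipc.partition_column_id,
       ipc.max_inrow_length,
       ipc.leaf_offset,
       ipc.is_dropped
FROM sys.partitions AS p
JOIN sys.system_internals_partition_columns AS ipc
    ON ipc.partition_id = p.partition_id
LEFT JOIN sys.columns AS c
    ON c.object_id = p.object_id
    AND c.column_id = ipc.partition_column_id
WHERE p.object_id = OBJECT_ID(N'dbo.[table]')
  AND p.index_id = 1;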

Using 2x Table Size in Log Space

Adding WITH (ONLINE = ON) to the end of the ALTER statement reduces the logging activity by about half, so this is one improvement you could make to reduce the amount of writes to disk / disk space needed.
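That is (online ALTER COLUMN requires SQL Server 2016 or later):

ALTER TABLE dbo.[table] ALTER COLUMN [col] NVARCHAR(260) NULL WITH (ONLINE = ON);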

I used this test harness to try it out:

USE [master];
GO
DROP DATABASE IF EXISTS [248749];
GO
CREATE DATABASE [248749] 
ON PRIMARY 
(
    NAME = N'248749', 
    FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017\MSSQL\DATA\248749.mdf', 
    SIZE = 2048000KB, 
    FILEGROWTH = 65536KB
)
LOG ON 
(
    NAME = N'248749_log', 
    FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017\MSSQL\DATA\248749_log.ldf', 
    SIZE = 2048000KB, 
    FILEGROWTH = 65536KB
);
GO
USE [248749];
GO

CREATE TABLE dbo.[table]
(
    id int IDENTITY(1,1) NOT NULL,
    [col] nvarchar (4000) NULL,

    CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED (id ASC)
);

-- load 1,000,000 rows of 260 N'A' characters each
INSERT INTO dbo.[table]
SELECT TOP (1000000)
    REPLICATE(N'A', 260)
FROM master.dbo.spt_values v1
    CROSS JOIN master.dbo.spt_values v2
    CROSS JOIN master.dbo.spt_values v3;
GO

I checked sys.dm_io_virtual_file_stats(DB_ID(N'248749'), DEFAULT) before and after running the ALTER statement.
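The snapshot query was along these lines (showing only the columns discussed here):

SELECT file_id,
       num_of_writes,
       num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(DB_ID(N'248749'), DEFAULT);

Here are the differences between the two snapshots: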

Default (Offline) ALTER

  • Data file writes / bytes written: 34,809 / 2,193,801,216
  • Log file writes / bytes written: 40,953 / 1,484,910,080

Online ALTER

  • Data file writes / bytes written: 36,874 / 1,693,745,152 (22.8% fewer bytes written, despite slightly more writes)
  • Log file writes / bytes written: 24,680 / 866,166,272 (41% fewer bytes written)

As you can see, there was a moderate drop in the bytes written to the data file, and a major drop in the volume of log file writes.