Sql-server – Will altering an nvarchar(max) column to nvarchar(50) lock the table

alter-tableazure-sql-databasedatatypessql server

I have a table in SQL Server (Azure) with 1.4 million rows.

I'd like to index one of the columns; however, its datatype is nvarchar(max) (defaulted to such by using EF I imagine). I'm using Entity Framework and can make a migration that will alter this datatype to nvarchar(50).

My concern is how long this will take, and whether it will cause issues with locking. I've only ever expanded columns, so I'm not sure what to expect. I do know the data in this column will fit fine in the new size.

I was planning to create a copy of the database and try it out. I have also considered migrating to a new column, but would much prefer to update the current column if it's not overly painful.

Best Answer

There's no way for us to fully answer your question. Depends on way too many factors, including the table definition, other columns in the table, if the data is off page, the T-SQL generated by your ORM, and so on. The documentation is pretty good though:

WITH ( ONLINE = ON | OFF) Applies to: SQL Server 2016 through SQL Server 2017 and Azure SQL Database.

Allows many alter column actions to be performed while the table remains available. Default is OFF. Alter column can be performed on line for column changes related to data type, column length or precision, nullability, sparseness, and collation.

Online alter column allows user created and auto statistics to reference the altered column for the duration of the ALTER COLUMN operation. This allows queries to perform as usual. At the end of the operation, auto-stats that reference the column are dropped and user-created stats are invalidated. The user must manually update user-generated statistics after the operation is completed. If the column is part of a filter expression for any statistics or indexes then you cannot perform an alter column operation.

While the online alter column operation is running, all operations that could take a dependency on the column (index, views, etc.) will block or fail with an appropriate error. This guarantees that online alter column will not fail because of dependencies introduced while the operation was running.

Online alter column has similar requirements, restrictions, and functionality as online index rebuild. This includes:

Online index rebuild is not supported when the table contains legacy LOB or filestream columns or when the table has a columnstore index. The same limitations apply for online alter column.

An existing column being altered requires twice the space allocation; for the original column and for the newly created hidden column.

The locking strategy during an alter column online operation follows the same locking pattern used for online index build.

In addition it's also pretty straightforward to test. I'm on SQL Server 2017 testing with the default read committed isolation level. First I'll create the table:

DROP TABLE IF EXISTS dbo.CONVERT_ME;

CREATE TABLE dbo.CONVERT_ME (
    ID BIGINT NOT NULL,
    OVERWEIGHT_COLUMN VARCHAR(MAX) NOT NULL,
    PRIMARY KEY (ID)
);

INSERT INTO dbo.CONVERT_ME WITH (TABLOCK)
SELECT TOP (1500000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

Then I'll alter the column OFFLINE, which is the default:

SET STATISTICS IO, TIME ON;

BEGIN TRANSACTION;

ALTER TABLE dbo.CONVERT_ME
ALTER COLUMN OVERWEIGHT_COLUMN VARCHAR(50) NOT NULL WITH (ONLINE = OFF);

My machine this takes about three seconds. The operation takes a schema modification lock during the entire transaction which prevents any other lock from being taken on it, including querying it with NOLOCK.

enter image description here

Now I'll try it with WITH (ONLINE = ON). The operation is now eligible for parallelism, so it may run faster on your server. A schema modification lock is still taken on the object, but it occurs near the end instead of throughout the whole transaction. Here's an example snapshot of the lock table to show the schema modification locks held during most of the transaction:

enter image description here

Note that an object level schema modification lock has not yet been taken. Here is what it looks like after the operation has finished but before the transaction has committed:

enter image description here

A schema modification lock on the object, along with a few others, are taken briefly near the end.

So great, there should be no problem if you use WITH (ONLINE = ON), right? As I said before the time the operation takes is going to depend on a lot of factors. Consider a different table which has many more data pages:

DROP TABLE IF EXISTS dbo.CONVERT_ME;

CREATE TABLE dbo.CONVERT_ME (
    ID BIGINT NOT NULL,
    OTHER_COLUMN VARCHAR(8000) NOT NULL,
    OVERWEIGHT_COLUMN VARCHAR(MAX) NOT NULL,
    PRIMARY KEY (ID)
);

INSERT INTO dbo.CONVERT_ME WITH (TABLOCK)
SELECT TOP (1500000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), REPLICATE('Z', 4200), REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

On my machine the same ALTER TABLE now takes 39635 ms when run OFFLINE and 117877 ms when run ONLINE.

My advice to you is to read the documentation until you feel confident that you understand the locking considerations and to test with your real table's schema populated with real data before doing the switch in production.

PLEASE NOTE:

Not sure about Azure SQL Database, but in regular SQL Server the ONLINE option is only for Enterprise Edition. Attempting to do this in another edition will result in the following error:

Msg 1712, Level 16, State 1, Line XXXXX
Online index operations can only be performed in Enterprise edition of SQL Server.