I have a table with about 100 million rows that stores information about users.
CREATE TABLE [dbo].[UserData](
    [UserDataID] [int] IDENTITY(1,1) NOT NULL,
    [UserID] [int] NOT NULL,
    [DataId] [int] NOT NULL,
    [DataValue] [nvarchar](4000) NOT NULL,
    [EncryptedDataValue] [varbinary](max) NULL)
I need to be able to store data in DataValue that is longer than nvarchar(4000) allows, but only a very small percentage of the rows actually need nvarchar(max). My understanding is that as soon as a DataValue exceeds 4,000 characters, SQL Server will internally store it as a LOB (off-row), which could substantially increase the time it takes to make this change. (I'm not sure how read/write time will be affected afterward, either.)
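Before deciding, it may be worth verifying the "small percentage" assumption. A query along these lines (using the column names from the table above; the 3,900-character threshold is an arbitrary illustration) would show how many rows are actually close to the current limit:

```sql
-- Hypothetical check: how many rows are near the nvarchar(4000) ceiling?
-- LEN counts characters; DATALENGTH would count bytes (2 per nvarchar character).
SELECT
    COUNT(*) AS TotalRows,
    SUM(CASE WHEN LEN(DataValue) > 3900 THEN 1 ELSE 0 END) AS RowsNearTheLimit
FROM dbo.UserData;
```

On a 100-million-row table this is a full scan, so it is best run off-hours or against a restored copy.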
I thought of a few potential options:

1. Change DataValue from nvarchar(4000) to nvarchar(max) and just eat the time it takes to make the change, not worrying that only ~1% of the rows actually need MAX?
2. Alongside DataValue, add a DataValueXL column that is nvarchar(max), and introduce application logic to save into the appropriate column depending on the size of the data (marking both columns NULLable)?
3. Create a new table, FKed to UserDataID, to store only large DataValues (> 4,000 characters)?
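For reference, option 3 could be sketched roughly as follows. The table and constraint names here are illustrative, not part of the original schema, and this assumes UserDataID is (or can be made) the primary key of UserData:

```sql
-- Sketch of option 3: a separate overflow table for oversized values only.
-- Names (UserDataLarge, DataValueXL, constraint names) are hypothetical.
CREATE TABLE dbo.UserDataLarge (
    UserDataID int NOT NULL
        CONSTRAINT PK_UserDataLarge PRIMARY KEY
        CONSTRAINT FK_UserDataLarge_UserData
            REFERENCES dbo.UserData (UserDataID),
    DataValueXL nvarchar(max) NOT NULL
);
```

The application would then have to check both tables on read and route writes by length, which is the complexity the accepted answer argues against.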
Which, if any, of these should I go with?

Thanks.
Best Answer
This is incorrect: SQL Server can use off-row storage for both max and non-max types. In other words, your NVARCHAR(4000) values may also be stored off-row. The actual storage location depends on whether the row fits on a page, and on the exact combination of sp_tableoption values for the "large value types out of row" and "text in row" settings.

Your best bet is to change the column to nvarchar(max). SQL Server will store the value in-row whenever possible, offering fast access, and this makes application programming much simpler because you don't have to figure out which column to use based on size. The only reason for concern would be if you objected to adding the first max-type column to a table (e.g. it can prevent online rebuild operations), but you are already willing to add one, so that should not be an issue.
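The recommended change could look like the following sketch. The ALTER statement follows the schema in the question; the sp_tableoption call is optional and shown only to illustrate the storage setting mentioned above:

```sql
-- Widen the column in place, as the answer recommends.
ALTER TABLE dbo.UserData
    ALTER COLUMN DataValue nvarchar(max) NOT NULL;

-- Optional: control where max-type values are stored.
-- 0 (the default) keeps values in-row when the row fits on a page;
-- 1 forces all max-type values off-row, leaving a 16-byte pointer in-row.
EXEC sp_tableoption 'dbo.UserData', 'large value types out of row', 0;
```

On a table this size, test the ALTER on a copy first to gauge duration and log growth before running it in production.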