I am aware of when adding new fields to large tables, it is recommended to add them to the end of the fields rather than somewhere in the middle, and wondering if something like this applies when changing field types?
I have a table with about a million records that has several VARCHAR type fields. I would like to change these to NVARCHAR, but as I understand it, this will take some time and resources, as the fields are in the middle of the table, and SQL Server has to do a bunch of copying/re-ordering.
What is an efficient way of accomplishing this?
Best Answer
Directly answering the question, there are two ways to execute the operation.
Note on large tables: If the table has few thousands of records or less, you could do the operation at once. In the case of million-records tables, it is more practical to perform in batches (let's say 1000's or 100s of records each time).
Pseudo-temp columns (I forgot if there is another, more appropriate name) are columns used to store the result of a conversion. In this case, they would be also the final columns after the process.
This is the same process detailed in Aaron's answer.
When the modification is in more than an handful of columns, it is more practical to create a new table, based on the schema of the old one.
Note on step 4: If you have any duplicate indexes detected (detecting duplicate indexes are a very long subject, see Kimberly Tripp's blog on SQLSkills.com), that is your chance to get rid of them if is that the case.
Changing from VARCHAR to NVARCHAR have some implications on performace, for any SQL Server below 2008R2, at least. For SQL 2008 R2, Aaron Bertrand has some blog posts on the Unicode compression feature- which can counter balance when NVarchar columns are used to store content that can be stored on VARCHAR columns. I didn't read them completely as the articles deserve, but the subject is interesting.
NVARCHAR columns tipically (IOW, before 2008R2) store all the chars in the columns with 2 bytes per char. For example, the string 'MSSQL' will be stored in 5 bytes on a VARCHAR column and 10 on a NVARCHAR one. Since non-LOB string columns are limited to store a maximum of 8000 bytes, it means that VARCHAR can store 8000 characters, while NVARCHR are limited to 4000.
Implications of that facts:
EDIT: as gbn stated, does not worth create something just to use VARCHARs when you have a clear requisite that needs NVARCHAR columns to be fullfilled.