SQL Server – Is Altering Collation on Existing Databases Safe?

collationsql serversql-server-2012

I'm new to SQL Server, and wasn't aware of the possibility that T-SQL may be case insensitive.

I have been normalizing entities by splitting tables and constraints. I'm scared that if I change the collation now, that some of those constraints may be invalid if, for example, a 'case insensitive' foreign key value points to a PK value with different case.

If I were to run this query on the DB (tested on a basic database):

ALTER DATABASE <db_name>
COLLATE SQL_Latin1_General_Cp1_CS_AS ;
GO

What would happen if some constraints were violated?

Is there any other reason that I should be careful making a database-wide change such as changing the collation?

Best Answer

Changing a database's Collation doesn't affect existing columns. It affects new non-XML string columns created that don't specify the COLLATE clause (including table variables), string literals, and variable values (not variable name resolution, which is determined by the Instance-level Collation). Meaning, something like the following will be affected:

IF (@Variable = 'string')
BEGIN
    ...
END;

This change will also affect Database-level meta-data, such as names of schemas, objects, columns, indexes, etc. Meaning, the following two scenarios will be affected:

SELECT ...
FROM   sys.indexes si
WHERE  si.[name] = N'somename'; -- real name = SomeName

and:

SELECT ...
FROM   dbo.sometable st -- real name = SomeTable

In both of those examples, they would work in a case-insensitive Collation, but return nothing or error, respectively, in a case-sensitive Collation.

Finally, as @JonathanFite was kind enough to remind me of, changing the DB Collation may impact queries involving temporary tables. The default Collation for string columns in temporary tables (not table variables) is the default Collation for [tempdb] (which should be the same as [model], which should be the Instance default, unless someone restored [model] from a server that had a different default Collation), not the Collation of the local Database. Meaning, even though the temp tables get created each time and thus you might expect them to act like "newly created tables" with the new Collation, they will in fact act like "existing tables" and will continue to behave as they did prior to the Collation change. If you need string columns in temporary tables to use the new Collation, you will need to explicitly set their Collation using COLLATE DATABASE_DEFAULT in the CREATE TABLE statements.

Hence you really need to do a lot of testing!

If you want to change existing columns then you will need to drop existing constraints, issue an ALTER TABLE ... ALTER COLUMN and then recreate the Constraints. You will also need to rebuild indexes that use any column that has its Collation changed as the sort order might be different.

Also, it is best to not use Collations starting with SQL_. Instead use Latin1_General_100_CS_AS. The Collations starting with SQL_ have been obsolete (even if not officially deprecated) since SQL Server 2000 was released. Their handling of VARCHAR / 8-bit data is obsolete and is not inline with newer behavior. Unfortunately, for backwards compatibility reasons, the default Collation for US English installations used to be the SQL_Latin1 Collations, as noted in the Using SQL Server Collations MSDN page:

For backward compatibility, the default English-language (US) collation is SQL_Latin1_General*.

This is also noted in the default Collation chart in the Collation Settings in Setup MSDN page (hit Control-F and paste in sql_latin). I believe this default changed to a Windows Collation starting in SQL Server 2014, but the documentation, even for the SQL Server 2016 setup, still points to the 2008 R2 setup page for Collations.


Below is a script to see some of the behavior differences when changing a Database's Collation:

USE [master];
GO

IF (DB_ID(N'ChangeDatabaseCollationTest') IS NULL)
BEGIN
    CREATE DATABASE [ChangeDatabaseCollationTest] COLLATE Latin1_General_100_CI_AS;
END;
GO

USE [ChangeDatabaseCollationTest];
GO
-- Current DB Collation: Latin1_General_100_CI_AS

EXEC sp_help 'sys.objects';
-- Collation for [name] = Latin1_General_100_CI_AS

IF ('A' = 'a')
BEGIN
    SELECT 'Case INsensitive comparison works.';
END;
ELSE
BEGIN
    SELECT 'Case INsensitive comparison did NOT work.';
END;
-- Case INsensitive comparison works.


CREATE TABLE dbo.CaseTest_a (ID INT); -- success
SELECT * FROM dbo.CaseTest_A; -- success


CREATE TABLE dbo.CaseTest_A (ID INT); -- error:
-- Msg 2714, Level 16, State 6, Line 5
-- There is already an object named 'CaseTest_A' in the database.




ALTER DATABASE [ChangeCollationTest] COLLATE Latin1_General_100_CS_AS; -- success

IF ('A' = 'a')
BEGIN
    SELECT 'Case INsensitive comparison works.';
END;
ELSE
BEGIN
    SELECT 'Case INsensitive comparison did NOT work.';
END;
-- Case INsensitive comparison did NOT work.


SELECT * FROM dbo.CaseTest_A; -- error:
-- Msg 208, Level 16, State 1, Line 56
-- Invalid object name 'dbo.CaseTest_A'.


CREATE TABLE dbo.CaseTest_A (ID INT); -- success

EXEC sp_help 'sys.objects';
-- Collation for [name] = Latin1_General_100_CS_AS


ALTER DATABASE [ChangeCollationTest] COLLATE Latin1_General_100_CI_AS; -- error:
-- Msg 1505, Level 16, State 1, Line 23
-- The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for
--    the object name 'dbo.sysschobjs' and the index name 'nc1'. The duplicate key
--    value is (0, 1, CaseTest_A).
-- Msg 5072, Level 16, State 1, Line 23
-- ALTER DATABASE failed. The default collation of database 'ChangeCollationTest'
--    cannot be set to Latin1_General_100_CI_AS.