Sql-server – SQL Server 2016 Collation and Internal table Glitch

collationsql serversql-server-2016

My Steps:

  1. installed SQL 2016 with collation SQL_Latin1_General_CP850_CI_AS
  2. Restored Database to Stage instances with same collation as Database.(original Backup from 2012)
  3. Ran > "C:\Program Files\Microsoft SQL Server\MSSQL13.STAGE\MSSQL\Binn\sqlservr.exe" -m -T4022 -T3659 -s"STAGE" -q"SQL_Latin1_General_CP1_CI_AS"
    this was to Changed the Instances and all database collation to SQL_Latin1_General_CP1_CI_AS. (what doesn't get changed User Type Tables)
  4. Run Query

    Select i.name, c.collation_name,i.internal_type,i.internal_type_desc
    from sys.columns c inner join sys.internal_tables i on c.object_id=i.object_id
    WHERE c.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
    AND object_name(c.object_id) NOT LIKE 'sys%'
    AND object_name(c.object_id) NOT LIKE 'queue%'
    AND object_name(c.object_id) NOT LIKE 'file%'
    AND object_name(c.object_id) NOT LIKE 'spt%'
    AND object_name(c.object_id) NOT LIKE 'MSrep%'
    

    Will return the internal tables with Old collation.

    name         collation_name                  internal_type internal_type_desc
    sqlagent_jobs SQL_Latin1_General_CP850_CI_AS   220         CONTAINED_FEATURES
    

Test 2:

  1. Create New database on 2016 when the Instances collation is set to SQL_Latin1_General_CP850_CI_AS.
    new database collation has SQL_Latin1_General_CP850_CI_AS.

  2. Run this Query where collation = SQL_Latin1_General_CP1_CI_AS

    Select i.name, c.collation_name,i.internal_type,i.internal_type_desc
    from sys.columns c inner join sys.internal_tables i on c.object_id=i.object_id
    WHERE c.collation_name = 'SQL_Latin1_General_CP1_CI_AS'
    AND object_name(c.object_id) NOT LIKE 'sys%'
    AND object_name(c.object_id) NOT LIKE 'queue%'
    AND object_name(c.object_id) NOT LIKE 'file%'
    AND object_name(c.object_id) NOT LIKE 'spt%'
    AND object_name(c.object_id) NOT LIKE 'MSrep%'
    

    this will return results that don't match the current collation.
    name collation_name internal_type internal_type_desc
    sqlagent_jobs SQL_Latin1_General_CP1_CI_AS 220 CONTAINED_FEATURES

I don't think the application will be affected by this internal_table.
But still strange that the internal_table collation does not match, on Creating and after converting the instances to a new collation.

If I do the same process but this time on a 2012 Staging instances.

  1. restore Database to a 2012 Stagging instances with collation SQL_Latin1_General_CP850_CI_AS
  2. Ran > "C:\Program Files\Microsoft SQL Server\MSSQL11.STAGE12\MSSQL\Binn\sqlservr.exe" -m -T4022 -T3659 -s"STAGE" -q"SQL_Latin1_General_CP1_CI_AS"
  3. fix User Type Tables
  4. Backup converted database.
  5. restore database on SQL 2016 with collation SQL_Latin1_General_CP1_CI_AS.
  6. Run Query

    Select i.name, c.collation_name,i.internal_type,i.internal_type_desc
    from sys.columns c inner join sys.internal_tables i on c.object_id=i.object_id
    WHERE c.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'
    AND object_name(c.object_id) NOT LIKE 'sys%'
    AND object_name(c.object_id) NOT LIKE 'queue%'
    AND object_name(c.object_id) NOT LIKE 'file%'
    AND object_name(c.object_id) NOT LIKE 'spt%'
    AND object_name(c.object_id) NOT LIKE 'MSrep%'
    

Nothing returns because I converted it in 2012.

If anyone can explain what is going? Is this a bug?
Anyone know how to fix the internal table collation?

Best Answer

I have looked at the internal table columns collations across most of my test instances and they are all either SQL_Latin1_General_CP1_CI_AS or Latin1_General_BIN. Some instances are SQL Server Express LocalDB, which are always SQL_Latin1_General_CP1_CI_AS at the instance level (unfortunately), but I have a 2017 Express installed with Korean_100_CS_AS_KS_WS_SC at the instance-level and a 2019 CTP 2.3 installed with UTF8_BIN2 at the instance-level. I have DBs restored into the 2019 instance (from CTP 2.2, not an earlier version as in your case) with collations of French_100_CI_AS_SC_UTF8 and French_100_CI_AS.

I have another SQL Server 2017 Express instance currently using Estonian_100_CS_AS_SC at the instance and DBs levels, and that instance has been changed using the -q option many times for testing said option. It even has a Contained DB (containment_type = PARTIAL), and all columns in Internal Tables are either SQL_Latin1_General_CP1_CI_AS or Latin1_General_BIN, no exceptions.

I would not worry about these columns. They are clearly all MS Shipped objects that desire to keep their specific collation. And you will never have a time when 100% of collations are what you set the instance to. There are collations used for meta-data that have their own static collation. There are collations coming from expressions in system catalog views and functions with explicit collations set, etc. You can use the following query to see all of the collations that are not what you set the instance and database to:

SELECT OBJECT_SCHEMA_NAME(col.[object_id]) AS [SchemaName],
       ISNULL(obj.[name], OBJECT_NAME(col.[object_id])) AS [ObjectName],
       col.[name] AS [ColumnName],
       col.[collation_name] AS [Collation],
       obj.[type_desc],
       ISNULL(obj.[is_ms_shipped],
              OBJECTPROPERTY(col.[object_id], 'IsMSShipped')) AS [IsMSShipped],
       col.[object_id]
FROM   sys.[all_columns] col
LEFT JOIN sys.[all_objects] obj
       ON obj.[object_id] = col.[object_id]
WHERE  col.[collation_name] IS NOT NULL
AND    col.[collation_name] NOT IN (
                         CONVERT([sysname], SERVERPROPERTY('collation')),
                         CONVERT([sysname], DATABASEPROPERTYEX(DB_NAME(), 'collation')))
ORDER BY [SchemaName], [ObjectName], [ColumnName];

Some objects are not in sys.all_objects, such as sys.pdw_nodes_partitions, which is the same object_id everywhere (across instances and even versions of SQL Server, but did not exist in SQL Server 2012):

SELECT OBJECT_SCHEMA_NAME(-1046288262) AS [SchemaName],
       OBJECT_NAME(-1046288262) AS [ObjectName],
       OBJECTPROPERTY(-1046288262, 'IsExecuted') AS [IsExecuted],
       OBJECTPROPERTY(-1046288262, 'IsView') AS [IsView],
       OBJECTPROPERTY(-1046288262, 'IsMSShipped') AS [IsMSShipped];

Still, I am unsure how to explain the results you saw for Test 1. You said:

Will return the internal tables with Old collation.

Based on everything I found across all of those versions, etc in the top part of this answer, it should not ever be possible to get a column in sqlagent_jobs to be anything other than SQL_Latin1_General_CP1_CI_AS. This could be a fluke with the database upgrade process when creating those internal tables. The "CONTAINED_FEATURES" tables did not exist in SQL Server 2012, so when you restored into SQL Server 2016, they had to be created at that point. And your 2016 collation is SQL_Latin1_General_CP850_CI_AS, which is the only connection I see so far.

I WAS ABLE TO REPRODUCE THIS!! I backed up a DB from SQL Server 2012 LocalDB that had a collation of Latin1_General_100_CS_AS_KS_SC. I then restored that into SQL Server 2019 (the instance-level collation being UTF8_BIN2). Checking the internal tables I saw that the 3 queue_messages_* tables and the filestream_tombstone_2073058421 table (all of which existed in SQL Server 2012) still had a collation of Latin1_General_BIN. BUT, the sqlagent_* tables and plan_persist_* tables (none of which existed in SQL Server 2012), all had a collation of Latin1_General_100_CS_AS_KS_SC (same as the DB that was restored).

I then backed up a DB from SQL Server 2014 LocalDB that had a collation of Latin1_General_100_CS_AS_KS_WS_SC. I then restored that into SQL Server 2019 (the instance-level collation being UTF8_BIN2). Checking the internal tables I saw that the 3 queue_messages_* tables and the filestream_tombstone_2073058421 table (all of which existed in SQL Server 2012) still had a collation of Latin1_General_BIN. BUT, this time the sqlagent_* tables (which did exist in SQL Server 2014), all had a collation of SQL_Latin1_General_CP1_CI_AS. The plan_persist_* tables (only 2 of the 6 existed in SQL Server 2014), all had a collation of Latin1_General_100_CS_AS_KS_WS_SC (again, same as the DB that was restored).

I changed the compatibility mode of the DB that was restored from SQL Server 2012 to be "SQL Server 2019 (150)", but that did not fix the collation issue.

This is clearly a problem with the Database Upgrade process, and has nothing to do with the sqlservr.exe -q method of changing all collations on a server (which is officially undocumented and unsupported, but I did document it here: Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong?). I have reported this bug to Microsoft via the barely functional, substandard-even-by-2008-2010-standards reporting site, UserVoice: Restoring databases into newer versions of SQL Server will create missing Internal Tables with incorrect collation.

Regarding Test 2:

collation = SQL_Latin1_General_CP1_CI_AS

Yes, it is to be expected that those internal tables will use that old, should-have-been-removed-by-now, collation.

Regarding Test 3:

Nothing returns because I converted it in 2012.

Not true. Nothing returns because:

  1. You filtered out the 3 queue tables via AND object_name(c.object_id) NOT LIKE 'queue%', and those use Latin1_General_BIN.
  2. You filtered out all of the other tables via WHERE c.collation_name <> 'SQL_Latin1_General_CP1_CI_AS'.