SQL Server – Fixing Msg 6355 XML to Collation Conversion Error

collationencodingexecution-plansql serverxml

I like to find missing indexes on the go, looking at the execution plans!

It can potentially give me an indication where further to look at if I want to improve something that is currently running.

For doing this I use the following query:

SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED


SELECT 
    er.session_id,
    er.blocking_session_id,
    er.start_time,
    er.status,
    dbName = DB_NAME(er.database_id),
    er.wait_type,
    er.wait_time,
    er.last_wait_type,
    er.granted_query_memory,
    er.reads,
    er.logical_reads,
    er.writes,
    er.row_count,
    er.total_elapsed_time,
    er.cpu_time,
    er.open_transaction_count,
    er.open_transaction_count,
    s.text,
    qp.query_plan,
    logDate = CONVERT(DATETIME,GETDATE()),
    logTime = CONVERT(DATETIME,GETDATE())
FROM sys.dm_exec_requests er 
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) s
CROSS APPLY sys.dm_exec_query_plan(er.plan_handle) qp
WHERE 
    CONVERT(VARCHAR(MAX), qp.query_plan) LIKE '%<missing%'

It generally works fine; however, I have recently encountered a problem with collation and XML:

enter image description here

Error message says:

Msg 6355, Level 16, State 1, Line 40
Conversion of one or more characters from XML to target collation impossible

I have already found out what is causing it:

 -- get only the applications from Italy:
exec usp_sel_outstandingItems            
                                    @startdate='2021-04-07 00:00:00',           
                                    @endDate='2021-08-15 00:00:00',           
                                    @statusDateStart=NULL,           
                                    @statusDateEnd=NULL,           
                                    @office=N'UK',           
                                    @country=N'IT ',           
                                    @userState=N'ParticipantPlaced',           
                                    @outstandingBalance=0

My question is:

What inside my query gets upset about the collation?
What can I do to possibly get over this error?

collate hints?

select @@version

Microsoft SQL Server 2016 (SP2-CU15-GDR) 
(KB4583461) - 13.0.5865.1 (X64)   Oct 31 2020 02:43:57   
Copyright (c) Microsoft Corporation  Enterprise Edition: 
Core-based Licensing (64-bit) 
on Windows Server 2012 R2 Datacenter 6.3 <X64> (Build 9600: ) 
(Hypervisor) 

Best Answer

The problem is that you are converting XML data, which is stored as Unicode (UTF-16 specifically), into 8-bit data. 8-bit data requires a code page to know which set of up to 256 characters to use (or up to approximately 24k characters if using a Double-Byte Character Set, but the vast majority of the time it's 256 characters max). The issue here is that there is a character (or possibly several) in the XML data that either a) does not exist in the code page specified by the default collation of the current database, or b) does not exist in any code page.

For example, the following queries show a character that does not exist in most code pages (or maybe none) and so gets the same error when converting to VARCHAR, but converting to NVARCHAR works:

SELECT NCHAR(0x1234);
-- ሴ

DECLARE @Test XML;
SET @Test = CONVERT(XML, N'<a>' + NCHAR(0x1234) + N'</a>')
SELECT @Test;
-- <a>ሴ</a>

SELECT CONVERT(NVARCHAR(MAX), @Test);
-- <a>ሴ</a>

SELECT CONVERT(VARCHAR(MAX), @Test);
/*
Msg 6355, Level 16, State 1, Line XXXXX
Conversion of one or more characters from XML to target collation impossible
*/

Please convert to NVARCHAR(MAX) instead of VARCHAR(MAX). Your WHERE predicate should be:

CONVERT(NVARCHAR(MAX), qp.query_plan) LIKE N'%<missing%'

(please note the addition of the "N" prefixing the string literal — not exactly necessary, but a good habit to get into).

Converting XML data to VARCHAR is only truly safe if the query is executing in a database that has a UTF-8 collation as its default collation, and that is only available starting in SQL Server 2019.

Finally, to answer your question of "would forcing a collation via COLLATE help?": No, not when converting from XML. You can't add the COLLATE clause to the XML data as collation does not apply to XML (you would get "Expression type xml is invalid for COLLATE clause."). And, placing the COLLATE outside of the CONVERT() doesn't help as that is applied after the conversion, which here fails. You also cannot place the COLLATE just after the VARCHAR(MAX) as that is a datatype and not data (you would get "Incorrect syntax near the keyword 'COLLATE'.").


For more info on working with collations / encodings / Unicode, please visit my site: Collations Info