SQL Server Replace Function – Removing Ones When Removing char(185)

collationsql serversql-server-2008-r2temporary-tablesunicode

Ok as the title suggests this is occuring.

When using the REPLACE function on char(185) for data in a temporary table, ones ("1") are being removed as well.

SQL SERVER 2008R2
UNICODE Data

DB collation – collate SQL_Latin1_General_CP1_CS_AS
Tempdb has default

Any other details I could provide which might be helpful, let me know.

Best Answer

CHAR(185), being VARCHAR data and using an 8-bit code page, can be a different character depending on the code page being used (which is determined by the collation of the current database when using the CHAR() function). In many code pages, including the common Windows-1252 (Latin1_General), the character is a superscript 1 (i.e. "1" ). This character, depending on the collation / code page being used, might have a "best fit" mapping to a regular "1".

Also, when using a Windows collation, case-sensitivity can affect the outcome of matches between subscript, superscript, and regular versions of the same characters (seriously! see example below). This also covers NVARCHAR / Unicode data (including UTF-8 in VARCHAR starting in SQL Server 2019).

When needing to match on specific characters you need to use a binary collation (i.e. one ending in _BIN2 unless on SQL Server 2005 or older, in which case only the _BIN collations are available).

SELECT CHAR(185)
WHERE  CHAR(185) = '1' COLLATE SQL_Latin1_General_CP1_CI_AS
-- no rows (SQL Server collation, code page 1252, case insensitive)



SELECT CHAR(185)
WHERE  CHAR(185) = '1' COLLATE SQL_Latin1_General_CP437_CI_AS
-- 1 row (SQL Server collation, code page 437, case insensitive)

SELECT CHAR(185)
WHERE  CHAR(185) = '1' COLLATE SQL_Latin1_General_CP437_CS_AS
-- 1 row (SQL Server collation, code page 437, case sensitive)



SELECT CHAR(185)
WHERE  CHAR(185) = '1' COLLATE Latin1_General_100_CI_AS
-- 1 row (Windows collation, code page 1252, case insensitive)

SELECT CHAR(185)
WHERE  CHAR(185) = '1' COLLATE Latin1_General_100_CS_AS
-- no rows (Windows collation, code page 1252, case sensitive)



SELECT CHAR(185)
WHERE  CHAR(185) = '1' COLLATE Latin1_General_100_BIN2;
-- no rows (Windows collation, code page 1252, _BIN2 collation; always works!)

As noted above, Unicode data (regardless of type of collation) works the same as VARCHAR data used with a Windows collation (because both scenarios use the same Unicode rules):

SELECT NCHAR(185)
WHERE  NCHAR(185) = N'1' COLLATE Latin1_General_100_CI_AS
-- 1 row (Windows collation, Unicode data, case insensitive)

SELECT NCHAR(185)
WHERE  NCHAR(185) = N'1' COLLATE Latin1_General_100_CS_AS
-- no rows (Windows collation, Unicode data, case sensitive)