Sql-server – Nvarchar to varchar to nvarchar characters conversion

alter-tablecollationencodingsql serverunicode

I had a table filled with some greek text rows as nvarchar(2000).

Lately, I altered the type of the column to varchar(4000) but I realized that some greek characters were shown as question marks.

So, I tried to change back to nvarchar(4000) in order to fix it as I suppose that the characters' unicode still remains the same.

I am just wondering, is there any way to fix this instead of restoring the backup I created before altering the table?

Best Answer

No, there is no way to "fix" the data because the data is no longer there. When you converted to VARCHAR, the underlying values for each character were changed into the ASCII value for ?. This is not a display issue, those characters are now physically a regular question mark. You will need to do a restore from a backup, unfortunately.

The following example code shows that once a Unicode character is converted to VARCHAR (assuming the Code Page denoted by the Collation does not support that character), that it becomes a regular 'ol question mark, and forever shall it remain as such:

DECLARE @Character NCHAR(1) = NCHAR(0x3525);
SELECT @Character AS [TheCharacter],
       UNICODE(@Character) AS [DecimalCodePoint],
       ASCII(@Character) AS [DecimalValueOfVarchar],
       UNICODE(CONVERT(VARCHAR(5), @Character)) AS [ConvertToVarchar], 
       UNICODE(CONVERT(NVARCHAR(5), CONVERT(VARCHAR(5), @Character)))
             AS [ConvertToVarcharAndBack], 
       ASCII('?') AS [VarcharQuestionMark],
       UNICODE(N'?') AS [UnicodeQuestionMark];

-- 㔥    13605   63  63  63  63  63

The following example shows an instance of a Unicode character that is highly doubtful (at least at this time) to be supported in most fonts, hence it appears as a square box, but the UNICODE built-n function shows that the underlying code is still the correct Unicode Code Point:

SELECT NCHAR(0xABBF), N'ꮿ', UNICODE(N'ꮿ');

-- ꮿ    ꮿ   43967

The actual character can be seen here: Cherokee Small Letter YA U+ABBF. This is a display issue, and many characters that are not represented in various fonts will display in the same manner without altering the actual value of the character, but they are still distinct characters.