Sql-server – Encoding Debug UTF8 & Latin 1

sql serversql-server-2008-r2utf-8

We have on our DB some international people that got some characters saved incorrectly on our DB (MS SQL-Server 2008 R2).

For example we have strings like 'WilcoxonÃ'.

We got this data into the db because we take international submissions but management didn't want to use any character set even though we tried for years to make that happen. Now we finally convinced them to implement character set and new data is saved correctly.

What we are trying to do now, is do a search and replace for existing records (around 300 or so) with the incorrect data on them.

We've taken the UTF 8 debug table from here UTF8 Debug and turned it into SQL Table.

This is the code to create the table (I've tried using SQL Fiddle but it's taking ages to create the schema)

CREATE TABLE UTF8Encoding (
    id            int NOT NULL identity,
    [unicode]     nvarchar(250) NOT NULL default '',
    [windows1252] nvarchar(250) NOT NULL default '',
    [expected]    nvarchar(250) NOT NULL default '',
    [actual]      nvarchar(250) NOT NULL default '',
    [utf8bytes]   nvarchar(250) NOT NULL default '',
    PRIMARY KEY (id)
);

INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+20AC','0x80',N'€','€','%E2 %82 %AC');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('','0x81',N'','','');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+201A','0x82',N'‚','‚','%E2 %80 %9A');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+0192','0x83',N'ƒ','Æ’','%C6 %92');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+201E','0x84',N'„','„','%E2 %80 %9E');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2026','0x85',N'…','…','%E2 %80 %A6');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2020','0x86',N'†','â€','%E2 %80 %A0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2021','0x87',N'‡','‡','%E2 %80 %A1');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+02C6','0x88',N'ˆ','ˆ','%CB %86');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2030','0x89',N'‰','‰','%E2 %80 %B0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+0160','0x8A',N'Š','Å','%C5 %A0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2039','0x8B',N'‹','‹','%E2 %80 %B9');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+0152','0x8C',N'Œ','Å’','%C5 %92');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('','0x8D',N'','','');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+017D','0x8E',N'Ž','Ž','%C5 %BD');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('','0x8F',N'','','');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('','0x90',N'','','');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2018','0x91',N'‘','‘','%E2 %80 %98');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2019','0x92',N'’','’','%E2 %80 %99');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+201C','0x93',N'“','“','%E2 %80 %9C');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+201D','0x94',N'”','â€','%E2 %80 %9D');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2022','0x95',N'•','•','%E2 %80 %A2');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2013','0x96',N'–','–','%E2 %80 %93');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2014','0x97',N'—','—','%E2 %80 %94');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+02DC','0x98',N'˜','Ëœ','%CB %9C');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+2122','0x99',N'™','â„¢','%E2 %84 %A2');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+0161','0x9A',N'š','Å¡','%C5 %A1');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+203A','0x9B',N'›','›','%E2 %80 %BA');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+0153','0x9C',N'œ','Å“','%C5 %93');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('','0x9D',N'','','');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+017E','0x9E',N'ž','ž','%C5 %BE');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+0178','0x9F',N'Ÿ','Ÿ','%C5 %B8');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A0','0xA0',N'','Â','%C2 %A0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A1','0xA1',N'¡','¡','%C2 %A1');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A2','0xA2',N'¢','¢','%C2 %A2');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A3','0xA3',N'£','£','%C2 %A3');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A4','0xA4',N'¤','¤','%C2 %A4');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A5','0xA5',N'¥','Â¥','%C2 %A5');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A6','0xA6',N'¦','¦','%C2 %A6');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A7','0xA7',N'§','§','%C2 %A7');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A8','0xA8',N'¨','¨','%C2 %A8');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00A9','0xA9',N'©','©','%C2 %A9');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00AA','0xAA',N'ª','ª','%C2 %AA');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00AB','0xAB',N'«','«','%C2 %AB');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00AC','0xAC',N'¬','¬','%C2 %AC');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00AD','0xAD',N'','­','%C2 %AD');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00AE','0xAE',N'®','®','%C2 %AE');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00AF','0xAF',N'¯','¯','%C2 %AF');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B0','0xB0',N'°','°','%C2 %B0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B1','0xB1',N'±','±','%C2 %B1');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B2','0xB2',N'²','²','%C2 %B2');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B3','0xB3',N'³','³','%C2 %B3');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B4','0xB4',N'´','´','%C2 %B4');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B5','0xB5',N'µ','µ','%C2 %B5');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B6','0xB6',N'¶','¶','%C2 %B6');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B7','0xB7',N'·','·','%C2 %B7');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B8','0xB8',N'¸','¸','%C2 %B8');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00B9','0xB9',N'¹','¹','%C2 %B9');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00BA','0xBA',N'º','º','%C2 %BA');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00BB','0xBB',N'»','»','%C2 %BB');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00BC','0xBC',N'¼','¼','%C2 %BC');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00BD','0xBD',N'½','½','%C2 %BD');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00BE','0xBE',N'¾','¾','%C2 %BE');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00BF','0xBF',N'¿','¿','%C2 %BF');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C0','0xC0',N'À','À','%C3 %80');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C1','0xC1',N'Á','Ã','%C3 %81');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C2','0xC2',N'Â','Â','%C3 %82');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C3','0xC3',N'Ã','Ã','%C3 %83');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C4','0xC4',N'Ä','Ä','%C3 %84');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C5','0xC5',N'Å','Ã…','%C3 %85');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C6','0xC6',N'Æ','Æ','%C3 %86');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C7','0xC7',N'Ç','Ç','%C3 %87');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C8','0xC8',N'È','È','%C3 %88');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00C9','0xC9',N'É','É','%C3 %89');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00CA','0xCA',N'','Ê','%C3 %8A');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00CB','0xCB',N'Ë','Ë','%C3 %8B');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00CC','0xCC',N'Ì','ÃŒ','%C3 %8C');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00CD','0xCD',N'Í','Ã','%C3 %8D');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00CE','0xCE',N'Î','ÃŽ','%C3 %8E');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00CF','0xCF',N'Ï','Ã','%C3 %8F');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D0','0xD0',N'Ð','Ã','%C3 %90');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D1','0xD1',N'Ñ','Ñ','%C3 %91');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D2','0xD2',N'Ò','Ã’','%C3 %92');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D3','0xD3',N'Ó','Ó','%C3 %93');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D4','0xD4',N'Ô','Ô','%C3 %94');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D5','0xD5',N'Õ','Õ','%C3 %95');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D6','0xD6',N'Ö','Ö','%C3 %96');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D7','0xD7',N'×','×','%C3 %97');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D8','0xD8',N'Ø','Ø','%C3 %98');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00D9','0xD9',N'Ù','Ù','%C3 %99');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00DA','0xDA',N'Ú','Ú','%C3 %9A');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00DB','0xDB',N'Û','Û','%C3 %9B');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00DC','0xDC',N'Ü','Ãœ','%C3 %9C');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00DD','0xDD',N'Ý','Ã','%C3 %9D');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00DE','0xDE',N'Þ','Þ','%C3 %9E');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00DF','0xDF',N'ß','ß','%C3 %9F');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E0','0xE0',N'à','Ã','%C3 %A0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E1','0xE1',N'á','á','%C3 %A1');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E2','0xE2',N'â','â','%C3 %A2');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E3','0xE3',N'ã','ã','%C3 %A3');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E4','0xE4',N'ä','ä','%C3 %A4');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E5','0xE5',N'å','Ã¥','%C3 %A5');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E6','0xE6',N'æ','æ','%C3 %A6');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E7','0xE7',N'ç','ç','%C3 %A7');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E8','0xE8',N'è','è','%C3 %A8');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00E9','0xE9',N'é','é','%C3 %A9');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00EA','0xEA',N'ê','ê','%C3 %AA');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00EB','0xEB',N'ë','ë','%C3 %AB');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00EC','0xEC',N'ì','ì','%C3 %AC');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00ED','0xED',N'í','í','%C3 %AD');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00EE','0xEE',N'î','î','%C3 %AE');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00EF','0xEF',N'ï','ï','%C3 %AF');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F0','0xF0',N'ð','ð','%C3 %B0');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F1','0xF1',N'ñ','ñ','%C3 %B1');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F2','0xF2',N'ò','ò','%C3 %B2');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F3','0xF3',N'ó','ó','%C3 %B3');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F4','0xF4',N'ô','ô','%C3 %B4');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F5','0xF5',N'õ','õ','%C3 %B5');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F6','0xF6',N'ö','ö','%C3 %B6');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F7','0xF7',N'÷','÷','%C3 %B7');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F8','0xF8',N'ø','ø','%C3 %B8');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00F9','0xF9',N'ù','ù','%C3 %B9');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00FA','0xFA',N'ú','ú','%C3 %BA');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00FB','0xFB',N'û','û','%C3 %BB');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00FC','0xFC',N'ü','ü','%C3 %BC');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00FD','0xFD',N'ý','ý','%C3 %BD');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00FE','0xFE',N'þ','þ','%C3 %BE');
INSERT INTO UTF8Encoding (unicode,windows1252,expected,actual,utf8bytes) VALUES ('U+00FF','0xFF',N'ÿ','ÿ','%C3 %BF');

Now I'm trying to create a function or stored procedure to do a search and replace to see if it fixes the issue for those 300 or so records. The function I've got so far takes ages for that string mentioned above.

Here's the code I got so far

DECLARE @inputString VARCHAR(MAX)= 'WilcoxonÃ'
DECLARE @count INT = 4
DECLARE @id INT = 1
DECLARE @expected nvarchar(10), @actual nvarchar(10)
DECLARE @codePos INT, @codeEncoded VARCHAR(7), @startIndex INT
    , @resultString varchar(max)
SELECT @count=COUNT(*) FROM UTF8Encoding

SET @resultString = LTRIM(RTRIM(@inputString))

WHILE @id <=@count
BEGIN

    SELECT @expected = expected, @actual = actual
    FROM UTF8Encoding
    WHERE ID = @id

        SELECT @startIndex = PATINDEX('%' + @actual + '%', @resultString)

        While @startIndex > 8 
        BEGIN
            PRINT @resultString + '|'  + @actual + '|' + NCHAR(@expected)
            SET @resultString = REPLACE(@resultString, @actual, NCHAR(@expected))
            SET @startIndex=PATINDEX('%' + @actual + '%', @resultString)
        END
        SET @id=@id + 46
END

PRINT @resultString

GO

So if any of you have an idea on how to solve this to see if we can fix the encoding that would be great.

Thanks

Best Answer

The reason your function takes ages is because you have empty values for actual in UTF8Encoding. The patindex expression returns 1 when you check for an empty actual so you never exit the inner loop. You can fix that by adding and actual <> '' to the query against UTF8Encoding. Next issue is where you use @expected as parameter to nchar(). The parameter should be an integer so if you remove nchar() your code returns something but I don't think it is what you are looking for. Wilcoxonà is translated to WilcoxonÁƒƒÁ††™.

Another approach you can try is to use the XML capabilities in SQL Server. XML in SQL Server is UTF-16 but it is able to load UTF-8 encoded strings and that can be used.

Concatenate your string with a UTF-8 xml declaration and use the value() function to fetch the value from the constructed XML.

I guess you eventually want to use this on a table so here is an example that uses a table variable.

declare @T table(InputString varchar(max))

insert into @T values
('åäöÅÄÖ'),
('WilcoxonÃ')

select cast('<?xml version="1.0" encoding="UTF-8"?>'+T.InputString as xml).value('text()[1]', 'nvarchar(max)') as Value
from @T as T

Result:

Value
---------------
åäöÅÄÖ
WilcoxonÃ