Sql-server – the SQL Server equivalent for Oracle character set AL32UTF8

azure-sql-managed-instance, collation, datatypes, oracle, sql server

While porting data from Oracle to Azure SQL Database, I am facing an issue where empty data in Oracle is converted to ➞➞➞ in SQL Server.

Pic1: Data as in Oracle DB;

Pic2: Data as in Sql server DB;

Pic3: Data when pasted in Notepad++.

The character set in Oracle is AL32UTF8 and the collation of the target SQL Server database is SQL_Latin1_General_CP1_CI_AS. I have tried changing the datatype to NVARCHAR / NCHAR and changing the collation of the column to SQL_Latin1_General_CP850_CI_AS, but neither helped. What is the reason for this difference, and is there a solution?

Best Answer

NVARCHAR is always UTF-16.

You can try Latin1_General_100_CI_AS_SC_UTF8 with a VARCHAR column, but that doesn't look like an encoding problem. It really shouldn't matter what the source encoding / collation was as long as you correctly indicate the encoding used to transfer the data.
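To illustrate why the encoding declared for the transfer matters more than the source character set, here is a minimal Python sketch (not part of the original answer). It assumes that Windows-1252 is the code page behind SQL_Latin1_General_CP1_CI_AS; the helper name `roundtrip` is hypothetical.

```python
def roundtrip(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# Same encoding on both sides: the text survives intact.
print(roundtrip("café", "utf-8", "utf-8"))    # café

# UTF-8 bytes misread as Windows-1252: the accented letter is garbled.
print(roundtrip("café", "utf-8", "cp1252"))   # cafÃ©

# The same text occupies a different number of bytes in UTF-8 and UTF-16.
print(len("café".encode("utf-8")), len("café".encode("utf-16-le")))  # 5 8
```

If the data looks wrong only when the reader assumes a different encoding than the writer used, the fix belongs in the transfer settings rather than in the column collation.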

Either way, those 3 bytes / characters could be a bug in the import tool/process or the export tool/process. How are you transferring the data? If the image you are showing from Notepad++ is from the file being used to migrate the data and the field is supposed to be empty, either the export messed up, or perhaps the field in Oracle is not truly empty.
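One way to find out what the unexpected value really contains is to dump its code points and bytes. The following is a rough Python sketch, assuming the value has been copied out of the export file or the target table as a string; the function name `describe` is made up for illustration.

```python
import unicodedata


def describe(value: str) -> list[tuple[str, str, str]]:
    """Return (code point, UTF-8 bytes as hex, Unicode name) per character."""
    rows = []
    for ch in value:
        rows.append((
            f"U+{ord(ch):04X}",
            ch.encode("utf-8").hex(" "),
            unicodedata.name(ch, "<unnamed>"),
        ))
    return rows


# Inspect the value that appears where an empty field was expected.
for row in describe("➞➞➞"):
    print(row)
```

Running the same check against the value on the Oracle side (for example on the exported file) would show whether the field was truly empty before the transfer.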

Related Solutions

Oracle XE 11.2 Export / Import Charset

The export utility will use the NLS_LANG environment variable specified for the client session. If all your data can be represented in the Windows-1252 character set, that shouldn't be an issue. If you want to do the export using the AL32UTF8 character set, you'd need to set NLS_LANG accordingly. In Windows, that would be something like

c:\> set NLS_LANG=AMERICAN_AMERICA.AL32UTF8
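When the export is launched from a script rather than an interactive prompt, the variable can be set for the child process only. This is a minimal Python sketch under that assumption; `export_env` is a hypothetical helper, and the actual export command line is left out because it depends on your installation.

```python
import os


def export_env(nls_lang: str = "AMERICAN_AMERICA.AL32UTF8") -> dict[str, str]:
    """Copy the current environment and set NLS_LANG for a child process."""
    env = dict(os.environ)
    env["NLS_LANG"] = nls_lang
    return env


env = export_env()
print(env["NLS_LANG"])
# Pass env=env to subprocess.run(...) when launching the export utility.
```

This leaves the environment of the calling session untouched, so other Oracle clients started from it keep their existing setting.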

Oracle 10.2g Character Set Migration

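Yes, WE8ISO8859P1 is a subset of AL32UTF8 in the sense that every character it contains also exists in AL32UTF8. Characters outside the 7-bit ASCII range are encoded differently, though, so a small bit of conversion might be needed (which CSALTER will deal with and CSSCAN will inform you of when you do a preliminary scan).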
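To see why some conversion is needed even though the character repertoire is a subset, consider this small Python sketch. It assumes that Python's `latin-1` codec (ISO-8859-1) stands in for WE8ISO8859P1 and `utf-8` for AL32UTF8; `utf8_growth` is a hypothetical helper and does not replace a real CSSCAN run.

```python
def utf8_growth(latin1_bytes: bytes) -> tuple[int, int]:
    """Return (byte length in ISO-8859-1, byte length after conversion to UTF-8)."""
    text = latin1_bytes.decode("latin-1")
    return len(latin1_bytes), len(text.encode("utf-8"))


print(utf8_growth(b"cafe"))      # (4, 4): pure ASCII is byte-identical
print(utf8_growth(b"caf\xe9"))   # (4, 5): 0xE9 (é) becomes two bytes in UTF-8
```

The growth in byte length is one reason a preliminary scan is worthwhile: values that fit a byte-limited column in the old character set may no longer fit after conversion.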

A must-read for Oracle 10.2 is here.

Doing a full export/import will be more time-consuming than using csscan/csalter.

Another good, albeit old, read is the Oracle Character Migration Best Practices white paper.
