Do I need to use NVARCHAR rather than VARCHAR to properly store UTF8 characters in Oracle

character-setoracle

I want my Oracle database to properly store ALL possible UTF8 characters. If NLS_CHARSET is set to AL32UTF8 is that all I need to do? Or do I need to also make sure I use NCHAR and NVARCHAR rather than CHAR and VARCHAR to properly store UTf8 characters?

Best Answer

Good question. I'll give a simplified answer.

Oracle supports two character sets simultaneously, by way of different datatypes and parameters. A "normal" database-wide characterset and a "national" characterset.

Now, the "normal" characterset affects the way that VARCHAR2, CHAR and CLOB data is stored. The "national" characterset affects the way that NVARCHAR2, NCHAR and NCLOB data is stored.

These are dictated at database creation time. For example:

CREATE DATABASE .... 
CHARACTER SET AL32UTF8
NATIONAL CHARACTER SET AL16UTF16 
..... BLAH;

You can see these by checking in your DB. For example:

SQL> select * from nls_database_parameters where parameter like '%CHARACTERSET';

PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_CHARACTERSET               WE8MSWIN1252
NLS_NCHAR_CHARACTERSET         AL16UTF16

SQL>

NLS_CHARACTERSET is for CHAR/VARCHAR2, NLS_NCHAR_CHARACTERSET is for NCHAR/NVARCHAR2.

If your normal characterset is UTF%, you're generally safe staying away from the N% datatypes unless you have some strange internationalisation requirements. If so, feel free to edit your question!