Db2 – VARGRAPHIC vs VARCHAR CODEUNITS32

db2db2-luw

I'm in the process of migrating a number of databases from ISO8859-1 to UTF-8. To fit the old string in new tables I figured there are 3 options:

Multiply the number of bytes with a number the corresponds to the distribution of multibyte characters in the domain, and add check constraints for length validation. I.e CHAR(3) might hypothetically become CHAR(5).
Declare column as CHAR(3 CODEUNITS32). As far as I can tell this will occupy 12 bytes
Declare column as GRAPHIC(3). As far as I can tell this will occupy 6 bytes.

I don't really consider 1 an option, that leaves us with 2 and 3. When would 2 be preferable over 3? I think I've seen some info where some characters may not map correctly from CHAR to GRAPHIC, but this seems to work fine for characters from ISO8859-1. Other than that, am I missing something or would GRAPHIC always be preferably over CHAR CODEUNIT32?

Best Answer

The GRAPHIC data type seems to be a remnant from the times when there was no widespread Unicode support. It was there to facilitate storage of double-byte characters. In essence GRAPHIC(3) is equivalent to CHAR(3 CODEUNITS16).

Apart from the length semantics (1 byte vs. 2 byte characters) (VAR)CHAR and (VAR)GRAPHIC are functionally equivalent. Now that you have an option of specifying the length units along with the data type, I really see no reason to use the GRAPHIC types.

Since a UTF-8 character can occupy anywhere between 1 to 4 bytes, using any fixed-number-of-bytes character assumption would cause either risk of truncation or wasted space. To avoid the latter you might consider using VARCHAR instead of CHAR if your application can tolerate that.

Related Solutions

Db2 – eliminate extra spaces in VARCHAR columns

Thanks to @JNK's comment, select cast(trim(id) as varchar(6)) as id, cast(trim(voucher) as varchar(32)) as voucher, auth from my_table does the job.

DB2 – How to Alter Table from CLOB to VARCHAR

You cannot change a LOB type to a non LOB type and vice versa (I assume LUW here):

http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r0000888.html

[...] SET DATA TYPE altered-data-type Start of changeSpecifies the new data type of the column. The new data type must be castable from the existing data type of the column (SQLSTATE 42837). A LOB column cannot be altered to a different data type (SQLSTATE 42837). A non-LOB column cannot be altered to a LOB data type (SQLSTATE 42837).End of change.

One way is to unload the data with export:

db2 "export to t.ixf of ixf select ..., cast(varchar_clob as varchar(100)), ... from t"

then drop all foreign key pointing to this table, drop the table, create the new table, load the data back to the table and finally add the foreign keys that where dropped.

An alternative approach is to drop the foreign keys, rename the table, create the new table and use load from cursor. When done drop the renamed table

Best Answer

Related Solutions

Db2 – eliminate extra spaces in VARCHAR columns

DB2 – How to Alter Table from CLOB to VARCHAR

Related Question