It looks like the problem was caused by a data flow task in which I had two data flows running in parallel. Both flows read from the same source tables but wrote to different destinations. When I broke the task up into two tasks running one after the other, the truncation problem seems to have gone away.
The reason for the truncation is quite simple. Some characters (accented ones, for example) in the WE8ISO8859P1 character set are stored as a single byte, but in AL32UTF8 they end up being stored as multiple bytes. As a result of conversion, a 4000 character string may end up actually requiring more than 4000 bytes.
By way of example, this query shows you that byte 0x80 (the Euro symbol in the closely related WE8MSWIN1252 character set) becomes 2 bytes in AL32UTF8:
SQL> select length(convert(chr(128),'AL32UTF8','WE8ISO8859P1')) from dual
2 /
LENGTH(CONVERT(CHR(128),'AL32UTF8','WE8ISO8859P1'))
---------------------------------------------------
2
SQL>
To list all characters that will be affected by the change, you can use the following query:
with n as (
  select level as c
  from dual
  connect by level <= 255
)
select c as "WE8ISO8859P1 value",
       '"' || chr(c) || '"' as character,
       length(convert(chr(c), 'AL32UTF8', 'WE8ISO8859P1')) as "New length"
from n
where length(convert(chr(c), 'AL32UTF8', 'WE8ISO8859P1')) > 1;
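If you want to estimate which existing rows are actually at risk, a rough check along the same lines is possible. The table and column names below (my_table, id, my_col) are purely illustrative; also note that CONVERT itself returns a VARCHAR2, so treat this as an indicator rather than an exact measurement for values sitting right at the limit:
select id,
       lengthb(my_col) as current_bytes,
       lengthb(convert(my_col, 'AL32UTF8', 'WE8ISO8859P1')) as converted_bytes
from my_table
where lengthb(convert(my_col, 'AL32UTF8', 'WE8ISO8859P1')) > 4000;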
Unfortunately the maximum length of a VARCHAR2 column in Oracle is 4000 bytes (a CHAR column is capped at 2000 bytes). The only option available to you if character set conversion pushes you over this limit is to convert the columns to the CLOB datatype, but be warned - CLOBs are difficult to deal with and can present challenges.
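For what it's worth, the usual way to move such a column to a CLOB is the add/copy/drop/rename pattern. A minimal sketch, again with hypothetical names (my_table, my_col):
alter table my_table add (my_col_clob clob);
update my_table set my_col_clob = my_col;
commit;
alter table my_table drop column my_col;
alter table my_table rename column my_col_clob to my_col;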
Best Answer
Assuming that your database character set is AL32UTF8 (if there is a difference between the two, this is almost certainly your character set) and that all the data consists of characters that exist in the US7ASCII character set ("English" might constitute more than that depending on the definitions we're using), there won't be any noticeable performance difference between a varchar2(n CHAR) and a varchar2(n BYTE). Based on your database character set, Oracle has to do a tiny bit more work when reading a string to figure out that every byte represents a single character rather than being part of a multi-byte character, but checking the length of a string in either characters or bytes is a pretty trivial operation. Unless you're doing something like running a TPC benchmark, where you're already doing a crazy number of things you'd never do in reality to squeeze out tiny fractions of performance, it's not something worth worrying about.
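To make the distinction concrete, here is a small sketch (hypothetical table names) of the two declarations:
create table t_char_semantics (name varchar2(100 CHAR)); -- up to 100 characters, however many bytes they need
create table t_byte_semantics (name varchar2(100 BYTE)); -- up to 100 bytes, which may be fewer than 100 characters in AL32UTF8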
If you were going to go crazy with a TPC-benchmark level of effort to get every last thousandth of a percent, it would probably be slightly more efficient to use a database character set of US7ASCII and declare everything using byte semantics, given that you only have to store English. Practically, though, if your application is so efficient that this sort of thing matters, you've already solved every other performance problem, so you should be focusing much more heavily on supportability and maintainability, which argues for an AL32UTF8 character set and character length semantics.
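If you do standardise on character semantics, you can also set the default so that a plain varchar2(n) is interpreted in characters rather than bytes; a minimal sketch:
alter session set nls_length_semantics = CHAR;
create table t_default_semantics (name varchar2(100)); -- 100 characters under the session setting above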