I am using Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production and I am trying to decide on using INTEGER
or NUMBER
for a whole number field that will hold a LONG
from Java.
For reference:
Long.BYTES = 8
Long.SIZE = 64
Long.MAX_VALUE = 9223372036854775807 / 2^63-1
I was lead to a relatively old blog post from an answer on Stack Overflow that suggests that you should never really use INTEGER
because it is has some signification performance issues compared to using NUMBER
. Even for just ID columns.
INTEGER is always slower then NUMBER. Since integer is a number with
added constraint. It takes additional CPU cycles to enforce the
constraint. I never watched any difference, but there might be a
difference when we load several millions of records on the INTEGER
column. If we need to ensure that the input is whole numbers, then
INTEGER is best option to go. Otherwise, we can stick with NUMBER data
type.
This article also states that "INTEGER is equivalent to NUMBER(38,0)" which I already knew. What I have not been able to determine if it is equivalent with the possible performance concerns that are raised.
- I do want to ensure that all the values are nothing but whole numbers.
- I will be working with millions of rows on a regular basis as well as joining on this column because it will be a foreign key to hundreds of tables.
The Oracle documentation says that
INTEGER
, INT
and SMALLINT
all map to NUMBER(p,0)
So I am planning on using NUMBER(38,0)
like I always have, lots of legacy databases and tables I am importing data from actually store this ID
information in a VARCHAR(50)
and the data is tremendously horrid trash because of people adding prefix/suffix characters to indicate state; like z343234
means disabled. So I am trying to clean this data up in the new systems and not allow the perversions that the previous developers/business analyst made. I am not concerned with saving the corrupt flags and other data, just the numeric part.
I work with so many different database engines I can not keep up with what the latest idioms are.
Best Answer
The BLOG
The logic behind that post is so ludicrous that it deserves a sarcastic answer.
In actuality, the number of clock-cycles needed to perform such check is so insignificant that it is easily hidden within the noise of normal operations of the computer. You probably couldn't measure the performance difference even on a VAX/VMS. (much older than "10 years")
INTEGER vs NUMBER vs PLS_INTEGER
As you have seen
INTEGER
isNUMBER(38,0)
. There should be no measurable performance difference.A difference can come in when you are talking about
PLS_INTEGER
vsNUMBER
. Most of the benchmarks I've seen that prove this to be true are computationally intensive. Since you aren't doing Differential Equations with this value, the "performance gain" wouldn't apply to your case.I recommend you stick with
INTEGER
orNUMBER(38,0)
Data Model Suggestion
The number and the flag should be in two different columns. Feel free to use a
virtual column
(3rd column) to get back the original string. I recommend that you record the original string value in a "comments/notes" like field so that the humans can search for it in case yourvirtual column
logic does match what the human actually typed in.