I did some small tests to try these data types, found some puzzling results.
Summary:
jsonb uses more space for integers than strings:
select pg_column_size('{"a":"1"}'::jsonb) as json_size
18
select pg_column_size('{"a":1}'::jsonb) as json_size
28
Why does the size of jsonb inflate when using integers?
Long version:
I was originally interested in using an array of integers, here are those tests. In each pair of comma separated numbers, first corresponds to json_size
(bytes), second number is array_size
(bytes).
select pg_column_size('[]'::jsonb) as json_size,
pg_column_size(array[]::smallint[]) as array_size
8,16
select pg_column_size('[1]'::jsonb) as json_size,
pg_column_size(array[1]::smallint[]) as array_size
20,26
Check if jsonb actually stores integers as plain text; if so then this next one should use 4 bytes more for jsonb:
select pg_column_size('[12345]'::jsonb) as json_size,
pg_column_size(array[12345]::smallint[]) as array_size
22,26
Strange, it uses 2 bytes more, no idea what that means. Try some more practical and realistic data now:
select pg_column_size('[123,234,345,456,567,678,789,890]'::jsonb) as json_size,
pg_column_size(array[123,234,345,456,567,678,789,890]::smallint[]) as array_size
104,40
What happened to jsonb here? What caused it to inflate? Maybe if I use smaller numbers that have less characters…
select pg_column_size('[1,2,3,4,5,6,7,8]'::jsonb) as json_size,
pg_column_size(array[1,2,3,4,5,6,7,8]::smallint[]) as array_size
104,40
Nope. What about strings?
select pg_column_size('["1","2","3","4","5","6","7","8"]'::jsonb) as json_size,
pg_column_size(array[1,2,3,4,5,6,7,8]::smallint[]) as array_size
48,40
It seems that jsonb likes strings and not integers.
I quickly made it a rule of thumb that I should only use strings in jsonb and use native integers or arrays for numeric data, but I don't know if this rule is misguided or why this anomaly occurs.
Best Answer
It's simple. JSON doesn't have a concept of an integer, only a "number". From the docs
That table shows a JSON "number" is mapped to a numeric. The max precision on numeric is
However, I don't think it actually stores that. It does store a numeric of some sort. The numeric source is quite frankly kind of difficult to follow.
Using an array of smallint is clearly better if it fits your workload. It has less wasted space for packing, and less wasted space for the data itself.
smallint[]
however