PostgreSQL JSONB – Why Does JSONB Use More Space for Storing Integers?

arraydatatypesjsonpostgresql

I did some small tests to try these data types, found some puzzling results.

Summary:

jsonb uses more space for integers than strings:

select pg_column_size('{"a":"1"}'::jsonb) as json_size

18

select pg_column_size('{"a":1}'::jsonb) as json_size

28

Why does the size of jsonb inflate when using integers?


Long version:

I was originally interested in using an array of integers, here are those tests. In each pair of comma separated numbers, first corresponds to json_size (bytes), second number is array_size (bytes).

select pg_column_size('[]'::jsonb) as json_size, 
       pg_column_size(array[]::smallint[]) as array_size

8,16

select pg_column_size('[1]'::jsonb) as json_size, 
       pg_column_size(array[1]::smallint[]) as array_size

20,26

Check if jsonb actually stores integers as plain text; if so then this next one should use 4 bytes more for jsonb:

select pg_column_size('[12345]'::jsonb) as json_size, 
       pg_column_size(array[12345]::smallint[]) as array_size

22,26

Strange, it uses 2 bytes more, no idea what that means. Try some more practical and realistic data now:

select pg_column_size('[123,234,345,456,567,678,789,890]'::jsonb) as json_size, 
       pg_column_size(array[123,234,345,456,567,678,789,890]::smallint[]) as array_size

104,40

What happened to jsonb here? What caused it to inflate? Maybe if I use smaller numbers that have less characters…

select pg_column_size('[1,2,3,4,5,6,7,8]'::jsonb) as json_size, 
       pg_column_size(array[1,2,3,4,5,6,7,8]::smallint[]) as array_size

104,40

Nope. What about strings?

select pg_column_size('["1","2","3","4","5","6","7","8"]'::jsonb) as json_size, 
       pg_column_size(array[1,2,3,4,5,6,7,8]::smallint[]) as array_size

48,40

It seems that jsonb likes strings and not integers.

I quickly made it a rule of thumb that I should only use strings in jsonb and use native integers or arrays for numeric data, but I don't know if this rule is misguided or why this anomaly occurs.

Best Answer

It's simple. JSON doesn't have a concept of an integer, only a "number". From the docs

When converting textual JSON input into jsonb, the primitive types described by RFC 7159 are effectively mapped onto native PostgreSQL types, as shown in Table 8-23.

That table shows a JSON "number" is mapped to a numeric. The max precision on numeric is

up to 131072 digits before the decimal point; up to 16383 digits after the decimal point

However, I don't think it actually stores that. It does store a numeric of some sort. The numeric source is quite frankly kind of difficult to follow.

Using an array of smallint is clearly better if it fits your workload. It has less wasted space for packing, and less wasted space for the data itself. smallint[] however

  • Does not support the same range of numbers.
  • Does not support embedding different types.
  • Does not support any precision at all.
  • Does not exchange well with web-clients.