I think you are seeing a dichotomy that is not really there.
It can be useful to have connection pooling in place even if you expect a 1:1 mapping of clients to back-ends. If your connections are long-lived, you won't benefit from reducing backend setup/teardown overhead, as it's small and amortized across a long period. A pool like PgBouncer may remain useful for other reasons:
- Block until a connection is available rather than return an immediate max_connections exceeded error;
- You can switch the pool's target server if you fail over to a standby, without having to reconfigure applications;
- You can limit the application's database workers to fewer than max_connections, so you can still make reporting and maintenance connections as a non-superuser.
Additionally, if suitable for your application you can use transaction-level pooling to greatly increase the number of clients that can be served by your server.
I would not try to keep a strict 1:1 mapping from Apache workers to PostgreSQL workers, personally. If you've got (say) 16 cores and good I/O on your PostgreSQL box you might want something like 16-20 active PostgreSQL workers for optimal performance. You're almost certain to want more Apache workers than that, since they'll be kept busy by things like:
- Persistent HTTP connections from idle clients;
- Unresponsive or very slow clients;
- Intentional DoS connections;
- Network interruptions between client and server; etc.
If possible, consider a transaction-pooling design with short-lived transactions instead.
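As a rough sketch, a PgBouncer configuration along these lines gives you transaction pooling with a bounded number of backend connections (the database name, addresses, file paths and sizes below are placeholders, not values from the question):

[databases]
; placeholder entry for the application database
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling: a server connection is only held for the duration of a transaction
pool_mode = transaction
; many client (e.g. Apache worker) connections are allowed in ...
max_client_conn = 500
; ... but only about this many backend connections are opened per database/user pair
default_pool_size = 20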
What about:
with tweets(type, id) as (
    select 'add' as type, tweets.id from tweets
)
select row_to_json(row(array_agg(tweets))) from tweets;
This is actually one of the really cool things about PostgreSQL: arrays and complex types. One can aggregate them and then convert. You should get a structure which is effectively tweets[] (i.e. an array of tweet tuples), which should then get converted to JSON properly.
Now, if you need a more complex structure than this, you probably want to define types separately. Something like:
CREATE TYPE tweet_set AS (
    set_id        int,
    tweets        tweet[],
    generated_at  timestamp,
    ....
);
And then come up with a query to generate that, and then convert it to JSON.
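One possible sketch, assuming tweet is (or matches) the row type of the tweets table, with placeholder values for set_id and generated_at:

select row_to_json(ts)
from (
    select 1                as set_id,
           array_agg(tw)    as tweets,
           now()::timestamp as generated_at
    from tweets tw
) ts;

The subquery alias ts is itself a record, so row_to_json can consume it directly.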
One of the basic issues is that row_to_json expects a record type, so you have to pass in a record, not an anyarray. You could also do:
with tweets(type, id) as (
    select 'add' as type, tweets.id from tweets
)
select array_to_json(array_agg(tweets)) from tweets;
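As a quick illustration of the difference (toy values, not from the question): wrapping values in row() loses the column names, while array_to_json keeps the array structure:

select row_to_json(row(1, 'foo'::text));   -- {"f1":1,"f2":"foo"}
select array_to_json(array[1, 2, 3]);      -- [1,2,3]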
Best Answer
Postgres 9.2
The benefit of the new feature is two-fold. Columns of type json verify the validity of their content, so whatever is in the column is guaranteed to be valid JSON and you get an error if you try to write anything else to it. And you have basic functions to create valid JSON on the fly from rows or arrays - which is a very common use case.
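A quick illustration with a hypothetical table:

create table example (id int, doc json);
insert into example values (1, '{"ok": true}');  -- accepted: valid JSON
insert into example values (2, 'not json');      -- rejected with an error: not valid JSON
select row_to_json(e) from example e;            -- e.g. {"id":1,"doc":{"ok": true}}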
I quoted Andrew Dunstan on the pgsql-hackers list on this before, under this related question on SO.
Postgres 9.3
... finally brings a number of functions and operators. Check out the manual page for JSON functions and operators.
Related answer on SO about querying a json field. @Will has put up a blog post; see the comment below.
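A couple of the 9.3 additions in action (values made up for illustration):

select '{"a": {"b": 42}}'::json -> 'a' ->> 'b';        -- '42', extracted as text
select * from json_each('{"x": 1, "y": 2}'::json);     -- one row per key/value pair
select * from json_array_elements('[1, 2, 3]'::json);  -- one row per array element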
Postgres 9.4
Be sure to check out the new jsonb type with a host of new functionality. Above all, the decomposed, binary storage allows for smaller storage on disk and an equality operator for jsonb (unlike json), which makes a number of additional operations possible (like DISTINCT or a UNIQUE index). Yet more functions have been added for both json and jsonb: json_to_record(), json_to_recordset(), etc. More in the release notes.
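For example (again with made-up values):

-- equality works for jsonb; key order and whitespace are irrelevant
select '{"a": 1, "b": "x"}'::jsonb = '{"b": "x", "a": 1}'::jsonb;   -- true
-- turn a json object into a typed record
select * from json_to_record('{"id": 1, "name": "n"}'::json) as r(id int, name text);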