PostgreSQL – Which is more efficient for searches on JSON data in Postgres: GIN or multiple indexed columns

index, json, postgresql, postgresql-9.6

For example, say I have a table with a medium number of rows (~100,000 or so) that has a jsonb column with the following example data in one of the rows:

{"name":"Bob", "favoriteColor":"red", "someOtherObject": {"somethingElse": true}}

Is it more efficient to:

  1. Use a GIN index on the jsonb column, then use Postgres' built-in JSON query functions to pull out data

  2. Create several columns to represent the things that can be searched for, e.g. one column for name, one column for favoriteColor, etc.; build B-tree indexes on those columns and then run select document from table where name = 'Bob'

  3. Some other solution?

    Keep in mind the solution needs to support LIKE queries efficiently, so that it can search for values that start with a given input string.

Best Answer

The advantage of JSON is versatility: you can add any keys without changing the table definition. And maybe convenience, if your application can read and write JSON directly.

Separate columns beat a combined json or jsonb column in every performance aspect, and in several other aspects, too: a more sophisticated type system, the full range of functionality (CHECK, UNIQUE and FOREIGN KEY constraints, default values, etc.), a smaller table, smaller indexes, and faster queries.
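For concreteness, a sketch of what the separate-columns layout might look like. The table and column names (people, name, favorite_color, something_else) are illustrative, and the jsonb column document is kept only so the later examples can still return the full JSON:

CREATE TABLE people (
  person_id      bigserial PRIMARY KEY,
  name           text NOT NULL,
  favorite_color text,
  something_else boolean NOT NULL DEFAULT false,  -- typed, with a default
  document       jsonb                            -- optional: keep the raw JSON as well
);

CREATE INDEX people_name_idx ON people (name);    -- plain B-tree for equality searches

SELECT document FROM people WHERE name = 'Bob';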

For prefix matching on text columns you might use a text_pattern_ops index:
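A minimal sketch under the same assumptions (a separate text column name on the hypothetical people table):

-- B-tree index with text_pattern_ops supports left-anchored LIKE patterns
CREATE INDEX people_name_pattern_idx ON people (name text_pattern_ops);

SELECT document FROM people WHERE name LIKE 'Bo%';  -- can use the index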

Or, more generally, a trigram index supporting any LIKE patterns:
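Again only a sketch; pg_trgm ships with Postgres as a contrib extension:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- trigram GIN index also handles patterns with a leading wildcard
CREATE INDEX people_name_trgm_idx ON people USING gin (name gin_trgm_ops);

SELECT document FROM people WHERE name LIKE '%ob%';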

If you stick with JSON (jsonb in particular), there are also different indexing strategies to weigh. GIN versus B-tree is not the only decision to make: there are partial indexes, expression indexes, and different operator classes (in particular jsonb_path_ops).
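For illustration, still assuming the hypothetical people table with a jsonb column document:

-- GIN index with the more compact jsonb_path_ops operator class; supports containment (@>)
CREATE INDEX people_doc_path_idx ON people USING gin (document jsonb_path_ops);

SELECT document FROM people WHERE document @> '{"favoriteColor": "red"}';

-- expression index on one key; text_pattern_ops makes it usable for prefix LIKE as well
CREATE INDEX people_doc_name_idx ON people ((document ->> 'name') text_pattern_ops);

SELECT document FROM people WHERE document ->> 'name' LIKE 'Bo%';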