Postgresql – How to create an index on an integer json property in postgres

castindexjsonpostgresqlpostgresql-9.3

I can't figure out for the life of me how to create an index on a property of my json column which is an integer.

I tried it this way (and also dozens of others)

CREATE INDEX user_reputation_idx ON users(("user"->>'reputation')::int)

It works just fine in a query (e.g. ORDER BY ("user"->>'reputation')::int)

What am I missing?

UPDATE

I'm getting a simple syntax error, however, I really have no clue why.

ERROR:  syntax error at or near "::"
LINE 1: ... user_reputation_idx ON users (("user"->>'reputation')::int)

The table definition is quite simple. It's just one column user of typejson.

So, looks like this:

CREATE TABLE users
(
  "user" json
)

Best Answer

Try this instead:

CREATE INDEX user_reputation_idx ON users(cast("user"->>'reputation' AS int));

The Postgres syntax shortcut :: for casts is not allowed without additional parentheses in an index definition (see @bma's comment). It works with the standard SQL function, though: cast(expression AS type) This is not related to the json type per se.

Either way, you can still use the syntax shortcut expression::type in expressions that utilize the index.

Related Solutions

Sql-server – Indexing – Uniqueidentifier Foreign Key or Intermediary mapping table

Ok, I am making a lot of assumptions (INT instead of VARCHAR(50) being one of them) with this answer, so feel free to correct me if needed. The problem with option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If joining on the UserID, it is best to index the UserID, so you can utilize seeks for your joins.

For Option A, UserID will be the clustering key (index key for the clustered index) on the Users table. UserID will be a nonclustered index key on Alerts table. This will cost 16 bytes per Alert.

For Option B, UserID will be the clustering key on the Users table. UserId will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This will cost 4 bytes per Alert. And 20 bytes per UserMap.

Looking at the big picture, one relationship, for Option A, costs 16 bytes of storage, and involves 1 join operation. Whereas, one relationship, for Option B, costs 24 bytes of storage, and involves 2 join operations.

Furthermore, there are a possibility of 340,282,366,920,938,000,000,000,000,000,000,000,000 uniqueidentifiers and only 4,294,967,296 INTs. Implementing a uniqueidentifier to INT map for a this type of relationship could cause unexpected results when you start reusing INTs.

The only reason for creating this type map table, is if you plan on creating a Many to Many relationship between Users and Alerts.

Taking all of this into consideration, I would recommend Option A.

I hope this helps,

Matt

Postgresql – Querying JSONB in PostgreSQL

This does not work because it's trying to cast a jsonb value to integer.

select data->'name' as name from persons where cast(data->'age' as int) > 25

This would actually work:

SELECT data->'name' AS name FROM persons WHERE cast(data->>'age' AS int) > 25;

Or shorter:

SELECT data->'name' AS name FROM persons WHERE (data->>'age')::int > 25;

And this:

SELECT data->'name' AS name FROM persons WHERE data->>'name' > 'Jenny';

Seems like confusion with the two operators -> and ->> and operator precedence. The cast :: binds stronger than the json(b) operators.

Figure out type dynamically

This is the more interesting part of your question:

the type of age in the JSON document is number anyway, so why can't PostgreSQL figure out that by itself?

SQL is a strictly typed language, it does not allow the same expression to evaluate to integer in one row and to text in the next. But since you are only interested in the boolean result of the test, you can get around this restriction with a CASE expression that forks depending on the result of jsonb_typeof():

SELECT data->'name'
FROM   persons
WHERE  CASE jsonb_typeof(data->'age')
        WHEN 'number'  THEN (data->>'age')::numeric > '25' -- treated as numeric
        WHEN 'string'  THEN data->>'age' > 'age_level_3'   -- treated as text
        WHEN 'boolean' THEN (data->>'age')::bool           -- use boolean directly (example)
        ELSE FALSE                                         -- remaining: array, object, null
       END;

An untyped string literal to the right of the > operator is coerced to the respective type of the value to the left automatically. If you put a typed value there, the type has to match or you have to cast it explicitly - unless there is adequate implicit cast registered in the system.

If you know that all numeric values are actually integer, you can also:

... (data->>'age')::int > 25 ...

Best Answer

Related Solutions

Sql-server – Indexing – Uniqueidentifier Foreign Key or Intermediary mapping table

Postgresql – Querying JSONB in PostgreSQL

Figure out type dynamically

Related Question