I'm struggling to get data from a JSON column in my PostgreSQL database.
In our users
table, we have a known_ips
column, which is a JSON column, holding a flat array of IP addresses known for the given user, like this:
# select email, known_ips from users limit 3;
email | known_ips
-------------------+-------------------------------
user1@example.com | ["192.168.1.1","192.168.1.2"]
user2@example.com | ["192.168.1.3"]
user3@example.com | ["192.168.1.2"]
(3 rows)
What I'm trying to do is to select all the users that have an IP address from an array of IP addresses (in this example, either 192.168.1.1 or 192.168.1.2) in their known_ips
column. So in this case, user1 and user3 have IP address .1 or .2, so I want to have them returned, but not user2, since I'm not looking for the .3 address.
I've tried the following queries, all to no avail:
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip in("192.168.1.1","192.168.1.2"); ERROR: column "192.168.1.1" does not exist LINE 1: ... json_array_elements(known_ips) as ip where ip in("192.168.1... ^
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip in('192.168.1.1','192.168.1.2') limit 5; ERROR: invalid input syntax for type json LINE 1: ... json_array_elements(known_ips) as ip where ip in('192.168.1... ^ DETAIL: Token "." is invalid. CONTEXT: JSON data, line 1: 192.168....
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip in(192.168.1.1,192.168.1.2) limit 5; ERROR: syntax error at or near ".168" LINE 1: ...array_elements(known_ips) as ip where ip in(192.168.1.1,192... ^
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip in("192.168.1.1","192.168.1.2"); ERROR: column "192.168.1.1" does not exist LINE 1: ... json_array_elements(known_ips) as ip where ip in("192.168.1... ^
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip in(array["192.168.1.1","192.168.1.2"]); ERROR: column "192.168.1.1" does not exist LINE 1: ...array_elements(known_ips) as ip where ip in(array["192.168.1... ^
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip = any(array["192.168.1.1","192.168.1.2"]); ERROR: column "192.168.1.1" does not exist LINE 1: ...ay_elements(known_ips) as ip where ip = any(array["192.168.1... ^
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip in(unnest(array["192.168.1.1","192.168.1.2"])); ERROR: column "192.168.1.1" does not exist LINE 1: ...lements(known_ips) as ip where ip in(unnest(array["192.168.1... ^
-
database=# select email, ip from users, json_array_elements(known_ips) as ip where ip = any(array["192.168.1.1","192.168.1.2"]); ERROR: column "192.168.1.1" does not exist LINE 1: ...ay_elements(known_ips) as ip where ip = any(array["192.168.1... ^
I'm feeling like I'm missing something obvious, but I just don't see it. Why does Postgres think that I'm defining columns instead of strings? How do I properly get the data?
Best Answer
Double quotation marks are name delimiters. They are reserved for delimiting names (column names, table names etc.) that contain non-standard characters or those that you want to explicitly make case-sensitive (because that is their effect in PostgreSQL, which is according to the standard, too).
So, that is why all attempts with
"192.168.1.1"
fail: PostgreSQL indeed interprets those as names (specifically column names in each of those contexts).The one case without quotes fails simply because
192.168.1.1
is an invalid token sequence. Numbers and some other constants can be represented in PostgreSQL without quotation marks but the tokens you specify there cannot be interpreted either as a number or anything else.Finally, the one where you are using single quotation marks around the IPs fails because PostgreSQL is trying to interpret those as JSON literals. Why? Because the
ip
column is of typejson
– that is the type of column values returned byjson_array_elements
.So, in order for that second attempt of yours to succeed, you should first of all represent the IPs as valid JSON string items. That means you need to enclose them in double quotation marks and then in single quotation marks, like this:
However, that will give you this error:
Your options are:
convert
ip
totext
:convert
ip
tojsonb
Either should get you going.
Note, though, that filtering your data like that can give you duplicates in the output. The issue is, the
json_array_elements
function turns the specifiedjson
value into a row set, repeating the source row's columns for each transposed item. So, for your example the FROM clause effectively produces the following row set:Since for user1 each
ip
will match the IN predicate, you will get the corresponding email returned twice.To resolve that, instead of this:
you can do something like this: