Postgresql – Using postgres to select a filtered array

arraypostgresqlquery

I'm looking to select data from a table that contains an ID, and an array of values. But in the query results, i want to filter the array and remove values that overlap with a white list.

For illustrative purposes, the following query shows how the data would be formatted with a whitelist of values in a single column, and then a table i want to select from… that contains a unique ID, and the array of keywords.

with whitelist as (
    select distinct unnest(array['a','b','c','f','h']) as keywords
), raw_with_keyword_array as (
    select * from  (
        values 
            ('1234', array['a','c','e','h','q']),
            ('1235', array['a','d','e','f']) 
    ) as t (id, keyword_array)
)
select...

The expected results i'm looking for is two rows from the raw_with_keyword_array table, but filtered arrays not including the values from the whitelist.

I figure i probably need to select the data from the main table, and then do some kind of recursive sub query to filter out the whitelisted values, and put the results back into a new array?

Any suggestions would be appreciated!

E.g.

1234 {e, q}

1235 {d, e}

Best Answer

Not sure if this is the most efficient way to do it, but this works.

You do have to unnest all elements from keyword_array to be able to filter the unwanted out:

with whitelist (kw) as (
    values ('a'),('b'),('c'),('f'),('h')
), raw_with_keyword_array (id, keyword_array) as (
  values 
      ('1234', array['a','c','e','h','q']),
      ('1235', array['a','d','e','f']) 
)
select id, 
       array(select kw
             from unnest(ra.keyword_array) as t(kw)
             except 
             select kw
             from whitelist)
from raw_with_keyword_array ra;

Instead of except you could also use a where not exists:

....
select id, 
       array(select kw
             from unnest(ra.keyword_array) as t(kw)
             where not exists (select *
                               from whitelist wl
                               where wl.kw = t.kw)
             )
from raw_with_keyword_array ra;

I don't think it makes a big difference though.

Online example