Postgresql – Postgres HAVING ilike any wildcard in array_agg

aggregatelikepattern matchingpostgresqlquery

I need a filter where the input value is similar to any value in an aggregate, using ilike and wildcards. My attempt at a query so far is:

SELECT p.id, p.name, array_agg(vo.name)
FROM product p
LEFT JOIN product_variation pv
ON p.id = pv.product_id
LEFT JOIN variation_option vo
ON pv.variation_option_id = vo.id
GROUP BY p.id
HAVING $1 ilike any(array_agg('%' || vo.name || '%'));

But it doesn't work. It gets even more complicated: ideally, I'd be able to input an array of strings, where any of them are similar to the aggregate. Any tips?

I realize this kind of works, just not as I need it to. The input value should be able to be a partial match for any of the array elements, but it is the other way around. The wild card would need to be in the input value.

Best Answer

... but it is the other way around. The wild card would need to be in the inputted value.

Yes, it's backwards, but ILIKE (internally operator ~~*) nor the alternative regexp operator ~* have commutators. I.e., you can't turn the expression around, can't switch left and right operand. See:

SELECT * FROM pg_operator WHERE oprname  IN ('~~*', '~*'); 

oprcom = 0 means no commutator. Details in the manual.

The right operand of (I)LIKE is a pattern, so switching is impossible. And the ANY construct only goes one way. Dead end. See:

You can still turn it around using a different tool: the aggregate function bool_or():

SELECT p.id, p.name, array_agg(vo.name)
FROM   product           p
JOIN   product_variation pv ON p.id = pv.product_id
JOIN   variation_option  vo ON pv.variation_option_id = vo.id
GROUP  BY p.id
HAVING bool_or(vo.name ILIKE ('%' || $1 || '%'));

Also using plain JOIN instead of LEFT JOIN, which would be pointless in view of the later filter.

ideally, I'd be able to input an array of strings

Now that's also possible for an input array:

...
HAVING bool_or(vo.name ~* ANY($1));

$1 being of type text[] or an array literal like '{foo, bar}'. I chose the regexp operator as we don't need leading and trailing % with it.

Since $1 is a pattern (or an array of patterns), as mentioned above, you may want to escape characters with special meaning (per array element). See:

If your input is selective, meaning only very few products actually qualify, then an alternative formulation with EXISTS might be faster - in combination with a trigram index on vo.name. See:

Index:

CREATE INDEX variation_option_name_gin_trgm_idx ON variation_option USING gin (name gin_trgm_ops);

Query:

SELECT p.id, p.name, array_agg(vo.name)
FROM   product           p
JOIN   product_variation pv ON p.id = pv.product_id
JOIN   variation_option  vo ON pv.variation_option_id = vo.id
WHERE  EXISTS (
   SELECT -- list can be empty
   FROM   variation_option vo1
   WHERE  vo1.id = pv.variation_option_id
   AND    vo1.name ILIKE ('%' || $1 || '%')
-- AND    vo1.name ~* ANY($1)  -- for array input
   )
GROUP  BY p.id;

Related: