PostgreSQL – Group and count array elements using intarray

aggregate array postgresql

I am working on a Postgres 9.4 project with the intarray extension enabled. We have a table that looks like this:

items
--------------------------------------
id    name                  tag_ids
--------------------------------------
1     a car                 {1,4}
2     a room to rent        {1}
3     a boat                {1,2,4,11}
4     a wine                {2}
5     emily                 {3}
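
To reproduce this, a minimal setup sketch could look as follows. The column types and the primary key are assumptions; only the column names and the sample rows come from the table above.

```sql
-- Assumed schema: only the column names and rows are given in the question.
CREATE EXTENSION IF NOT EXISTS intarray;

CREATE TABLE items (
   id      int PRIMARY KEY
 , name    text
 , tag_ids int[]
);

INSERT INTO items (id, name, tag_ids) VALUES
   (1, 'a car',          '{1,4}')
 , (2, 'a room to rent', '{1}')
 , (3, 'a boat',         '{1,2,4,11}')
 , (4, 'a wine',         '{2}')
 , (5, 'emily',          '{3}');
```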

I'd like to group by the individual tag ids if possible, i.e. get a count of items for each element of '{1,2,4,11}':

tag_id  count
1       3
2       2
4       2
11      1

Is this possible? I would think it starts with an overlap check like this:

select * from items where tag_ids && '{1,2,4,11}'

But I need to group by the array elements inside the intersection result. If I group by tag_ids, I only get one group per distinct array value.

How would I do it?

Best Answer

Keep the basic query you already have to cheaply identify rows that have any relevant array elements using an index.
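
A sketch of an index that can support the overlap check, assuming the items table from the question and the intarray extension. As far as I know, with intarray installed the && operator on int[] resolves to the module's own operator, which needs one of the module's operator classes such as gin__int_ops rather than the default GIN class. The index name is arbitrary.

```sql
-- Assumes table items(tag_ids int[]) and CREATE EXTENSION intarray.
-- gin__int_ops is the GIN operator class shipped with intarray.
CREATE INDEX items_tag_ids_gin_idx ON items USING gin (tag_ids gin__int_ops);

-- Inspect the plan. On a table as small as the example the planner
-- will most likely still choose a sequential scan.
EXPLAIN
SELECT *
FROM   items
WHERE  tag_ids && '{1,2,4,11}'::int[];
```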

Then unnest only the intersection (tag_ids & '{1,2,4,11}') in a LATERAL join. Finally, aggregate:

SELECT tag_id, count(*) AS ct
FROM   items i
     , unnest(tag_ids & '{1,2,4,11}'::int[]) tag_id
WHERE  tag_ids && '{1,2,4,11}'::int[]
GROUP  BY tag_id
ORDER  BY count(*) DESC;

The intersection operator & of the intarray module is instrumental here: it reduces each array to the relevant elements before unnest() expands them into rows.
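
To see what gets unnested, this sketch lists the per-row intersection before aggregating. It inlines the sample rows from the question in a CTE so it runs without the table, but it still assumes intarray is installed for the & operator.

```sql
-- Requires the intarray extension for the & operator.
-- Sample rows copied from the question; the name column is left out.
WITH items (id, tag_ids) AS (
   VALUES
     (1, '{1,4}'::int[])
   , (2, '{1}'::int[])
   , (3, '{1,2,4,11}'::int[])
   , (4, '{2}'::int[])
   , (5, '{3}'::int[])
   )
SELECT id
     , tag_ids
     , tag_ids & '{1,2,4,11}'::int[] AS matching_tags  -- what unnest() receives
FROM   items
WHERE  tag_ids && '{1,2,4,11}'::int[]  -- drops row 5, which has no relevant tag
ORDER  BY id;
```

Each element of matching_tags becomes one row in the main query, which is then grouped and counted.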

Without intarray

If you don't have intarray installed, or for any other array type, we need another join instead:

SELECT tag_id, count(*) AS ct
FROM  (
   SELECT *
   FROM   items
   WHERE  tag_ids && '{1,2,4,11}'::int[]
   ) i, unnest(tag_ids) t(tag_id)
JOIN   unnest('{1,2,4,11}'::int[]) x(tag_id) USING (tag_id)
GROUP  BY tag_id
ORDER  BY count(*) DESC;

Subtle difference: the intersection operator folds duplicates to produce distinct elements, while this query does not. This only matters if an array can contain duplicate elements ...
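
If duplicates are possible and should be counted once per row even without intarray, one option is to deduplicate per row in a LATERAL subquery. This is a sketch of a variant, not part of the queries above; the second sample row with a repeated element is hypothetical.

```sql
-- Works without intarray: && here is the built-in array overlap operator.
WITH items (id, tag_ids) AS (
   VALUES
     (1, '{1,4}'::int[])
   , (2, '{1,1,2}'::int[])  -- hypothetical row with a duplicate element
   )
SELECT t.tag_id, count(*) AS ct
FROM   items i
CROSS  JOIN LATERAL (
   SELECT DISTINCT u.tag_id            -- fold duplicates within one row
   FROM   unnest(i.tag_ids) AS u(tag_id)
   WHERE  u.tag_id = ANY ('{1,2,4,11}'::int[])
   ) t
WHERE  i.tag_ids && '{1,2,4,11}'::int[]
GROUP  BY t.tag_id
ORDER  BY ct DESC, t.tag_id;
```

With DISTINCT, tag 1 is counted twice (once per row). Without it, the count would be three.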

The SQL Fiddle demonstrates both queries, with an additional row that mixes matching and non-matching elements to show why the intersection or the additional join is needed to eliminate unwanted elements.
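
The effect of such a mixed row can be sketched like this. Row 6 is hypothetical and only stands in for the additional row mentioned above. The query deliberately unnests the whole array to show the problem.

```sql
-- Runs with or without intarray. Rows 1-5 are from the question,
-- row 6 is a hypothetical addition mixing matching and non-matching tags.
WITH items (id, tag_ids) AS (
   VALUES
     (1, '{1,4}'::int[])
   , (2, '{1}'::int[])
   , (3, '{1,2,4,11}'::int[])
   , (4, '{2}'::int[])
   , (5, '{3}'::int[])
   , (6, '{2,3,5}'::int[])
   )
SELECT tag_id, count(*) AS ct
FROM   items i
     , unnest(i.tag_ids) AS tag_id     -- naive: no intersection, no extra join
WHERE  i.tag_ids && '{1,2,4,11}'::int[]
GROUP  BY tag_id
ORDER  BY tag_id;
```

Row 6 passes the WHERE clause because of tag 2, so the unwanted tags 3 and 5 show up in the result. Unnesting tag_ids & '{1,2,4,11}'::int[] instead, or joining to the unnested search array, removes them.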