PostgreSQL – Using DISTINCT and ORDER BY in json_agg

jsonpostgresql

I'm trying to get a sorted aggregate of unique values in postgresql. SQL Fiddle

The result should be a json array of unique objects, sorted by index field. The value field can be used as unique key in this case if needed.

The structure and data is similar to this:

CREATE TABLE test (
  index INTEGER,
  value INTEGER,
  tag VARCHAR
);

-- In the real query, this table is the result of a join. 
-- That is the explanation for duplicated rows.
INSERT INTO test VALUES
  (1, 1, 'a'),
  (1, 1, 'a'),
  (1, 1, 'a'),
  (2, 1, 'a'),
  (3, 2, 'a')
;

Getting an aggregate of distinct rows works:

SELECT json_agg(DISTINCT test.*) FROM test GROUP BY tag;
[{"index":1,"value":1,"tag":"a"}, {"index":2,"value":1,"tag":"a"}, {"index":3,"value":2,"tag":"a"}]

Order by without distinct also works:

SELECT json_agg(test.* ORDER BY index) FROM test GROUP BY tag;
[{"index":1,"value":1,"tag":"a"}, {"index":1,"value":1,"tag":"a"}, {"index":1,"value":1,"tag":"a"}, {"index":2,"value":1,"tag":"a"}, {"index":3,"value":2,"tag":"a"}]

The problem happens if using both:

SELECT json_agg(DISTINCT test.* ORDER BY index) FROM test GROUP BY tag;
ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list Position: 42

If I try to add the ORDER BY column I get an error, since json_agg expects only one value:

SELECT json_agg(DISTINCT (SELECT test.index, test.*) ORDER BY index) FROM test GROUP BY tag;
ERROR: subquery must return only one column Position: 28

Best Answer

WITH cte AS (SELECT DISTINCT * FROM test)
SELECT json_agg(cte.* ORDER BY index) FROM cte GROUP BY tag;