PostgreSQL – json_object_agg Errors on Null Field Name

aggregatejsonpostgresqlpostgresql-10postgresql-11

PostgreSQL versions: my local install 11.3 and the below fiddle is on 10.0. Both behave the same.

I have a schema of pages, each page has sections and each section can have different kind of contents. When I query for a page, I wish to output all there is about that page in a JSON document.

I'm using CTE's to json_agg() various contents per section. Finally I join the sections into json_object_agg() to map section titles to section contents.

The problem: json_object_agg() throws an error when a page does not have any sections. I've verified guiltiness by using a regular json_agg() without section titles. The exact error:

error: field name must not be null

What I want: In any case not an error. I don't want to do custom error handling on the receiver side. It would be even better if the query can return a JSON Null in place of the json_object_agg() in case of no sections, but that's optional. (Other elegant solutions welcome)

The docs

Probably the documentation is incomplete or I missed something. Just for reference.

On aggregate expressions it says (emphasis mine):

Most aggregate functions ignore null inputs, so that rows in which one
or more of the expression(s) yield null are discarded. This can be
assumed to be true, unless otherwise specified, for all built-in
aggregates.

And in aggregate functions, on json_object_agg() no remark about not handling null:

aggregates name/value pairs as a JSON object

Fiddle with faulty domain argument. Changing the domain to the other options makes it work fine. Also using a non-existing domain works fine and returns 0 rows.

Query

with secs as (
    select p.page_id, p.domain, s.section_id as sid, s.title as title
    from pages p
    left join sections s on p.page_id = s.page_id
    where p.domain = 'bar.com'
),
txt as (
    select
        sid,
        json_agg(
            json_build_object(
                'Pos', pos,
                'Text', content
            )
            order by pos asc
        ) as txts
    from texts
    join secs on sid = section_id
    group by sid
),
img as (
    select
        sid,
        json_agg(
            json_build_object(
                'Pos', pos,
                'Image', image
            )
            order by pos asc
        ) as imgs
    from images
    join secs on sid = section_id
    group by sid
)
select
    json_build_object(
        'ID', s.page_id,
        'Domain', domain,
        'Sections', json_object_agg ( -- Error occurs here
            s.title,
            json_build_object(
                'ID', s.sid,
                'Texts', t.txts,
                'Images', i.imgs
            )
            order by s.sid asc
        )
    )
from secs s
left join txt t on s.sid = t.sid
left join img i on s.sid = i.sid
group by s.page_id, domain;

Schema

create table pages (
    page_id serial primary key,
    domain text unique not null
);

create table sections (
    section_id serial primary key,
    title text not null,
    page_id int references pages
);

create table texts (
    section_id int references sections,
    pos int not null,
    content text not null,
    primary key (section_id, pos)
);

create table images (
    section_id int references sections,
    pos int not null,
    image text not null,
    primary key (section_id, pos)
);

-- spanac.com will have 3 sections with texts and images in each, various amounts
insert into pages (domain) values ('spanac.com');
-- foo.com has 1 empty section
insert into pages (domain) values ('foo.com');
-- bar.com has no sections
insert into pages (domain) values ('bar.com');

-- spanac.com

with s as (
    insert into sections (page_id, title) select page_id, 'first' from pages where domain = 'spanac.com' returning section_id
),
t1 as (
    insert into texts (section_id, pos, content) select section_id, 1, 'spanac one.one' from s
),
t2 as (
    insert into texts (section_id, pos, content) select section_id, 2, 'spanac one.two' from s
),
i1 as (
    insert into images (section_id, pos, image) select section_id, 1, 's11.jpg' from s
)
insert into images (section_id, pos, image) select section_id, 2, 's12.jpg' from s;

with s as (
    insert into sections (page_id, title) select page_id, 'second' from pages where domain = 'spanac.com' returning section_id
),
t1 as (
    insert into texts (section_id, pos, content) select section_id, 1, 'spanac two.one' from s
),
t2 as (
    insert into texts (section_id, pos, content) select section_id, 2, 'spanac two.two' from s
),
i1 as (
    insert into images (section_id, pos, image) select section_id, 1, 's21.jpg' from s
)
insert into images (section_id, pos, image) select section_id, 2, 's22.jpg' from s;

with s as (
    insert into sections (page_id, title) select page_id, 'third' from pages where domain = 'spanac.com' returning section_id
),
t1 as (
    insert into texts (section_id, pos, content) select section_id, 1, 'Spanac three.one' from s
)
insert into images (section_id, pos, image) select section_id, 1, 's31.jpg' from s;

-- foo.com

insert into sections (page_id, title) select page_id, 'empty' from pages where domain = 'foo.com';

Best Answer

It sounds like you've found a bug.

You can report it at the bottom of their home page, and monitor the discussion of the bug on the pgsql-bugs mailing list, I don't expect them to change the way postgresql handles this data, they will probably just correct the documentation.

A possible work-around would be to add WHERE s.title IS NOT NULL to the agregating query

select
    json_build_object(
        'ID', s.page_id,
        'Domain', domain,
        'Sections', json_object_agg (
            s.title,
            json_build_object(
                'ID', s.sid,
                'Texts', t.txts,
                'Images', i.imgs
            )
            order by s.sid asc
        )
    )
from secs s
left join txt t on s.sid = t.sid
left join img i on s.sid = i.sid
WHERE s.title IS NOT NULL -- prevent null title from being agregated.
group by s.page_id, domain;