Postgresql – Filtering data that could have more than one category

postgresql

I have a table like the following:

Annotation (
    document,
    term,
    category
)

Where document and term are some ID, while category is an integer.

The couple documentterm is not unique, i.e. I could have the same couple with a different category.

document_id_1, term_id_1, category_1
document_id_1, term_id_1, category_2
document_id_1, term_id_1, category_3

I would like to design a query such that it return only the couple documentterm for whom there exists a only a row with category = 1.

E.g. in the previous example the couple document_id_1term_id_1 is not returned becouse there exist also other two rows, with different values of category.

Can you give me some hints on how to do that?

Best Answer

If I understood you correctly, one way to achieve this is getting those rows where category = 1, then check for the non-existence of non-category1 rows:

SELECT document, term
FROM Annotation ann
WHERE 
    category = 1
    AND NOT EXISTS (
        SELECT 1
        FROM Annotation a
        WHERE 
            a.document = ann.document
            AND a.term = ann.term
            AND category <> 1
    )
;

(Thanks to DavideChicco.it for showing that this was overly complicated).