PostgreSQL – How to Select Data in the Same Table on Different Rows That Are Empty

countnullpostgresqlselectsqlite

I use this select on my SQLite database:

SELECT attributeX
 FROM table
 GROUP BY attributeX
 HAVING COUNT(attributeY) = 0;

My table:

  attributeX (type integer) |  attributeY (type text)
----------------+-------
  1452 | aaa            
  1452 | bbb
  1452 | 
  1452 | bbb
  1453 |
  1453 |   
  1453 |
  1454 | eee
  1454 | fff
  1455 | iii
      .... 

I get result 1453 only when all items in column attributeY are empty for 1453. So in SQLite I get result 1453.

I have the same database only with myschema in PostgreSQL. I use this SELECT:

SELECT attributeX
 FROM myschema.table
 GROUP BY attributeX
 HAVING COUNT(attributeY) = 0;

and I get:

 attributeX
------------
(0 rows)

How to select in PostgreSQL?

SQLite schema of this table:

CREATE TABLE table (
 attribute1 INTEGER PRIMARY KEY,
 attributeX INTEGER NOT NULL,
 attribute2 TEXT NOT NULL,
 attributeY TEXT
);

CREATE INDEX table_attributeY_idx ON table (attributeY);
CREATE INDEX table_attribute2_idx ON table (attribute2);
CREATE INDEX table_attributeX_idx ON table (attributeX);

PostgreSQL schema of this table:

Table "myschema.table"
        Column  |          Type         |                                    Modifier                             | Storage  | Desc
----------------+-----------------------+-------------------------------------------------------------------------+----------+-------
 attribute1     | integer               | not null implicitly nextval('myschema.table_attribute1_seq'::regclass)  | plain    |
 attributeX     | integer               | not null                                                                | plain    |
 attribute2     | character varying(32) | not null                                                                | extended |
 attributeY     | text                  |                                                                         | extended |
Indexy:
    "table_attribute1_pkey" PRIMARY KEY, btree (attribute1)
    "fki_table_attributeX_fkey" btree (attributeX)
Conditions for foreign key:
    "table_attributeX_fkey" FOREIGN KEY (attributeX) REFERENCES myschema.table2(attributeX2) ON UPDATE RESTRICT ON DELETE CASCADE
Has OID: no

Insert:

INSERT INTO myschema.table (attributeX, attribute2, attributeY)
VALUES (%s, %s,%s)... (python script)

Best Answer

You explain:

I get result 1453 only when all items in column attributeY are empty for 1453.

But that's incorrect. Bold emphasis mine. The aggregate function count returns (per documentation):

number of input rows for which the value of expression is not null

The same is true for SQLite (per documentation):

The count(X) function returns a count of the number of times that X is not NULL in a group.

You obviously have one or more rows with a non-null value in attributeY in your Postgres table - probably an empty string ''.

Test with:

SELECT *
FROM   myschema.table
WHERE  attributeX = 1453
AND    attributeY IS NOT NULL;

Be sure to understand the difference between "empty" ('') and NULL:

Empty strings are character types (typically text, varchar or char) with 0 characters ('') - so basically "nothing", the equivalent of 0 for a numeric data type. NULL is possible for any data type and means "unknown". Some clients have a hard time making the difference clear in their display.

Alternative query

To find attributeX where all attributeY are either empty or NULL use this alternative query:

SELECT attributeX
FROM   myschema.table
GROUP  BY attributeX
HAVING NOT bool_or(attributeY <> '');

The expression attributeY <> '' is only true for non-null, non-empty attributeY. The aggregate function bool_or returns (per documentation):

true if at least one input value is true, otherwise false

Those are the results we exclude (NOT ...) and return the rest. Voilá.

There are many other (less elegant) ways to achieve the same. Like:

... HAVING count(CASE WHEN attributeY <> '' THEN 1 END) = 0;
... HAVING count(attributeY <> '' OR NULL) = 0;
... HAVING count(NULLIF(attributeY, '') = 0;

More: