PostgreSQL – Efficient Ways to Count Number of Rows in a Table

countperformancepostgresqlquery-performance

I am using Postgres with the following query:

select count(*) from image;

The primary key on this table is non-incrementing; it's a unique serial number for the images stored in the table. Our app often attempts to ingest images that have already been recorded in the database, so the primary key/serial number ensures they are only recorded once.

Now we are wondering if we should have gone with an incrementing primary key instead. We have 1,259,369 images in the database and it takes about 7 minutes for the count query to run.

Our app will never delete images from this table – so an incrementing primary key would allow us to check the value of the last ID which would equal the number of rows in the table.

Best Answer

Generally, if you don't need an exact count, there is a much faster way:

SELECT reltuples::bigint AS estimate
FROM   pg_class
WHERE  oid = 'image'::regclass;

As a matter of fact, in a DB with concurrent write access every count is an estimate, because the number may be outdated the instant you get it.

But, like @a_horse commented, there is something off in your DB. Counting a million should not take more than a few seconds in the worst case.

That your app will never delete images from this table makes this even more suspicious, because there shouldn't be many dead rows then. (Or are you updating a lot?) A huge amount of dead tuples could slow you down - and call for VACUUM. Normally, autovacuum takes care of this. Did you enable it? (It's the default in modern Postgres.)

Check for dead tuples:

All the usual advice for performance optimization applies.