Postgresql – Very slow simple JOIN query

performancepostgresqlpostgresql-9.6query-performance

Simple DB structure (for an online forum):

CREATE TABLE users (
    id integer NOT NULL PRIMARY KEY,
    username text
);
CREATE INDEX ON users (username);

CREATE TABLE posts (
    id integer NOT NULL PRIMARY KEY,
    thread_id integer NOT NULL REFERENCES threads (id),
    user_id integer NOT NULL REFERENCES users (id),
    date timestamp without time zone NOT NULL,
    content text
);
CREATE INDEX ON posts (thread_id);
CREATE INDEX ON posts (user_id);

Around 80k entries in users and 2,6 million entries in posts tables. This simple query to get top 100 users by their posts takes 2,4 seconds:

EXPLAIN ANALYZE SELECT u.id, u.username, COUNT(p.id) AS PostCount FROM users u
                    INNER JOIN posts p on p.user_id = u.id
                    WHERE u.username IS NOT NULL
                    GROUP BY u.id
ORDER BY PostCount DESC LIMIT 100;
Limit  (cost=316926.14..316926.39 rows=100 width=20) (actual time=2326.812..2326.830 rows=100 loops=1)
  ->  Sort  (cost=316926.14..317014.83 rows=35476 width=20) (actual time=2326.809..2326.820 rows=100 loops=1)
        Sort Key: (count(p.id)) DESC
        Sort Method: top-N heapsort  Memory: 32kB
        ->  HashAggregate  (cost=315215.51..315570.27 rows=35476 width=20) (actual time=2311.296..2321.739 rows=34608 loops=1)
              Group Key: u.id
              ->  Hash Join  (cost=1176.89..308201.88 rows=1402727 width=16) (actual time=16.538..1784.546 rows=1910831 loops=1)
                    Hash Cond: (p.user_id = u.id)
                    ->  Seq Scan on posts p  (cost=0.00..286185.34 rows=1816634 width=8) (actual time=0.103..1144.681 rows=2173916 loops=1)
                    ->  Hash  (cost=733.44..733.44 rows=35476 width=12) (actual time=15.763..15.763 rows=34609 loops=1)
                          Buckets: 65536  Batches: 1  Memory Usage: 2021kB
                          ->  Seq Scan on users u  (cost=0.00..733.44 rows=35476 width=12) (actual time=0.033..6.521 rows=34609 loops=1)
                                Filter: (username IS NOT NULL)
                                Rows Removed by Filter: 11335

Execution time: 2301.357 ms

With set enable_seqscan = false even worse:

Limit  (cost=1160881.74..1160881.99 rows=100 width=20) (actual time=2758.086..2758.107 rows=100 loops=1)
  ->  Sort  (cost=1160881.74..1160970.43 rows=35476 width=20) (actual time=2758.084..2758.098 rows=100 loops=1)
        Sort Key: (count(p.id)) DESC
        Sort Method: top-N heapsort  Memory: 32kB
        ->  GroupAggregate  (cost=0.79..1159525.87 rows=35476 width=20) (actual time=0.095..2749.859 rows=34608 loops=1)
              Group Key: u.id
              ->  Merge Join  (cost=0.79..1152157.48 rows=1402727 width=16) (actual time=0.036..2537.064 rows=1910831 loops=1)
                    Merge Cond: (u.id = p.user_id)
                    ->  Index Scan using users_pkey on users u  (cost=0.29..2404.83 rows=35476 width=12) (actual time=0.016..41.163 rows=34609 loops=1)
                          Filter: (username IS NOT NULL)
                          Rows Removed by Filter: 11335
                    ->  Index Scan using posts_user_id_index on posts p  (cost=0.43..1131472.19 rows=1816634 width=8) (actual time=0.012..2191.856 rows=2173916 loops=1)
Planning time: 1.281 ms
Execution time: 2758.187 ms

Group by username is missing in Postgres, because it's not required (SQL Server says I have to group by username if I want to select username). Grouping with username adds a little bit of ms to execution time on Postgres or does nothing.

For science, I've installed Microsoft SQL Server to the same server (which runs archlinux, 8 core xeon, 24 gb ram, ssd) and migrated all the data from Postgres – same table structure, same indices, same data. Same query to get top 100 posters runs in 0,3 seconds:

SELECT TOP 100 u.id, u.username, COUNT(p.id) AS PostCount FROM dbo.users u
                    INNER JOIN dbo.posts p on p.user_id = u.id
                    WHERE u.username IS NOT NULL
                    GROUP BY u.id, u.username
ORDER BY PostCount DESC

Yields same results from the same data, but does it 8 times faster. And it's beta version of MS SQL on Linux, I guess running on it's "home" OS – Windows Server – it could be faster still.

Is my PostgreSQL query totally wrong, or is PostgreSQL just slow?

Additional info

Version is almost the newest (9.6.1, currently newest is 9.6.2, ArchLinux just has outdated packages and is very slow to update). Config:

max_connections = 75
shared_buffers = 3584MB       
effective_cache_size = 10752MB
work_mem = 24466kB         
maintenance_work_mem = 896MB   
dynamic_shared_memory_type = posix  
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100

EXPLAIN ANALYZE outputs: https://pastebin.com/HxucRgnk

Tried all the indexes, used even GIN and GIST, fastest way for PostgreSQL (and Googling confirms with many rows) is to use sequential scan.

MS SQL Server 14.0.405.200-1, default conf.

I use this in an API (with plain select without analyze), and calling this API endpoint with chrome it says it takes 2500 ms +-, add 50 ms of HTTP and web server overhead overhead (API and SQL run on the same server) – it's the same. I do not care about 100 ms here or there, what I care about is two whole seconds.

explain analyze SELECT user_id, count(9) FROM posts group by user_id; takes 700 ms. Size of posts table is is 2154 MB.

Best Answer

Another good query variant is:

SELECT p.user_id, p.cnt AS PostCount
FROM users u
INNER JOIN (
    select user_id, count(id) as cnt from posts group by user_id
) as p on p.user_id = u.id
WHERE u.username IS NOT NULL          
ORDER BY PostCount DESC LIMIT 100;

It doesn't exploit CTE and gives correct answer (and CTE example may produce less than 100 rows in theory cause it first limits then joins with users).

I suppose, MSSQL able to perform such transformation in its query optimizer, and PostgreSQL is not able to push down aggregation under join. Or MSSQL just have much faster hash join implementation.