PostgreSQL 9.6 performance problem with Sum() aggregate query

aggregateperformancepostgresqlpostgresql-9.6postgresql-performance

I have the following table:

     id                | bigint                      | not null default nextval('shares_id_seq'::regclass)
     poolid            | text                        | not null
     blockheight       | bigint                      | not null
     networkdifficulty | double precision            | not null
     miner             | text                        | not null
     worker            | text                        |
     ipaddress         | text                        | not null
     created           | timestamp without time zone | not null
     useragent         | text                        |
     payoutinfo        | text                        |
     difficulty        | double precision            | not null default 0

Indexes:
        "shares_pkey" PRIMARY KEY, btree (id)
        "idx_shares_pool_block" btree (poolid, blockheight)
        "idx_shares_pool_created" btree (poolid, created)
        "idx_shares_pool_miner" btree (poolid, miner)
        "idx_shares_pool_miner_diff" btree (poolid, miner, difficulty)

I cannot wrap my head around why the following query takes so long:

explain analyze SELECT SUM(difficulty) FROM shares WHERE poolid = 'xmr1' AND miner = '4BCeEPhodgPMbPWFN1dPwhWXdRX8q4mhhdZdA1dtSMLTLCEYvAj9QXjXAfF7CugEbmfBhgkqHbdgK9b2wKA6nqRZQCgvCDm';
                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4150712.54..4150712.55 rows=1 width=8) (actual time=25490.101..25490.101 rows=1 loops=1)
   ->  Bitmap Heap Scan on shares  (cost=389414.64..4143195.97 rows=3006629 width=8) (actual time=2499.409..24815.011 rows=7445802 loops=1)
         Recheck Cond: ((poolid = 'xmr1'::text) AND (miner = '4BCeEPhodgPMbPWFN1dPwhWXdRX8q4mhhdZdA1dtSMLTLCEYvAj9QXjXAfF7CugEbmfBhgkqHbdgK9b2wKA6nqRZQCgvCDm'::text))
         Rows Removed by Index Recheck: 55232916
         Heap Blocks: exact=149273 lossy=2641988
         ->  Bitmap Index Scan on idx_shares_pool_miner  (cost=0.00..388662.98 rows=3006629 width=0) (actual time=2449.977..2449.977 rows=7445802 loops=1)
               Index Cond: ((poolid = 'xmr1'::text) AND (miner = '4BCeEPhodgPMbPWFN1dPwhWXdRX8q4mhhdZdA1dtSMLTLCEYvAj9QXjXAfF7CugEbmfBhgkqHbdgK9b2wKA6nqRZQCgvCDm'::text))
 Planning time: 0.256 ms
 Execution time: 25490.137 ms
(9 rows)

I specifically created index idx_shares_pool_miner_diff for this scenario, yet it isn't even used. Can't PostgreSQL use the index due to MVCC?

Update:

Running vacuum analyze shares as suggested made PostgreSQL use the aforementioned index:

 Aggregate  (cost=546165.94..546165.95 rows=1 width=8) (actual time=2489.446..2489.447 rows=1 loops=1)
   ->  Index Only Scan using idx_shares_pool_miner_diff on shares  (cost=0.69..537874.79 rows=3316458 width=8) (actual time=0.041..1924.396 rows=7461785 loops=1)
         Index Cond: ((poolid = 'xmr1'::text) AND (miner = '4BCeEPhodgPMbPWFN1dPwhWXdRX8q4mhhdZdA1dtSMLTLCEYvAj9QXjXAfF7CugEbmfBhgkqHbdgK9b2wKA6nqRZQCgvCDm'::text))
         Heap Fetches: 16575
 Planning time: 0.122 ms
 Execution time: 2489.477 ms
(6 rows)

I assume that's as fast as it gets without using parallel workers.

After reading the docs about index-only-scans, it looks like I'm taking a performance hit due to the constant heavy write activity on the shares table (> 200 inserts/sec), causing the many Heap-Fetches. Interestingly the number of Heap-Fetches is apparently only going up, never down.

Best Answer

If your table is INSERT-only, there are ways to get your sums (much) faster.

Assuming there is a column with monotonically increasing values (like id or created in your example), create a MATERIALZED VIEW to pre-compute sums older than a (recent) given threshold. And then just add the sum of recent additions to it:

CREATE MATERIALIZED VIEW shares_summed AS
SELECT poolid, miner, SUM(difficulty) AS sum_diff
FROM   shares
GROUP  BY  poolid, miner
ORDER  BY  poolid, miner;  -- optional, but to optimize some more
WHERE  created <  '2018-01-01 0:0';

If there are many combinations of (poolid, miner), add a UNIQUE index on that. Also required to refresh CONCURRENTLY, see below.

Fetching a row from that table cost next to nothing. Then you only add recent additions to it:

SELECT sum(sum_diff) AS total_sum  -- takes care of possible missing rows
FROM  (
   SELECT sum_diff
   FROM   shares_summed
   WHERE  poolid = 'xmr1'
   AND    miner = '4BCeEPhod...'

   UNION ALL
   SELECT SUM(difficulty)
   FROM   shares
   WHERE  poolid = 'xmr1'
   AND    miner  = '4BCeEPhod...'
   AND    created >= '2018-01-01 0:0'
   ) sub;                                                   

You only have to adapt the threshold value from time to time and REFRESH the MV. Best with the CONCURRENTLY option, to avoid concurrency issues with your many INSERTs. Consider:

Remember the value and adapt queries accordingly. You might store it in another table.

To make it even faster, create a partitioned table and add another partition with pre-computed sums after every month (or whatever). You can easily schedule a monthly cron job to automate this. Then the threshold is always the start of the month or so:

SELECT sum(sum_diff) AS total_sum
FROM  (
   SELECT SUM(difficulty) AS sum_diff
   FROM   shares_summed_master  -- includes all partitions
   WHERE  poolid = 'xmr1'
   AND    miner = '4BCeEPhod...'

   UNION ALL
   SELECT SUM(difficulty)
   FROM   shares
   WHERE  poolid = 'xmr1'
   AND    miner  = '4BCeEPhod...'
   AND    created >= date_trunc('month', now())  -- careful, current time zone affects it
   ) sub;           

Make sure there is no overlap, and pre-computed sums are current.

Table partitioning was largely improved and simplified with Postgres 10. Be sure to run the latest point release, some corner case bugs were fixed in Postgres 10.2. The above link to the manual is for version 9.6. Current version here.