Postgresql – Poor postgres performance when deleting rows with large BLOBs

performancepostgresqlpostgresql-performance

I have a PostgreSQL table like

CREATE TABLE stuff (id BIGSERIAL, data bytea, info TEXT)

the data blob is typically in the 1 .. 10 MB range per row. The info text is just a few characters. The table contains about 35000 rows currently.

Deleting 50 of these rows is taking about 15 .. 20 seconds on three different systems. Tested with Postgres 8.something on Ubuntu 16.04, 9.6.11 on Windows 7, and 10.6 on Ubuntu 18.04. No modifications were done to any configuration files related to performance tweaking, everything is exactly as it is after a fresh install on Ubuntu 16.04, Windows, and 18.04 respectively except for host and port info and such. The servers were not under any load when the tests were performed.

I have noticed that deletion is almost instantaneous if the data blob is either null or very small (just a few bytes). In fact, it seems that the time it takes to delete is proportional to the amount of data deleted.

Deletion is done like

DELETE FROM stuff
WHERE stuff.id IN (
    SELECT stuff_id
    FROM somewhere_else
    ORDER BY somewhere_else.id ASC
    LIMIT 50
)

I first suspected that the subquery causes the problem, but that does not seem to be the case.

Selecting the data like SELECT (id, info) FROM stuff WHERE stuff.id IN (...) is almost instantaneous as well (10..100 ms), SELECT * FROM stuff WHERE stuff.id IN (...) took about 25 seconds.

So the time it takes to actually load all the data is in the same order of magnitude as the time it takes to delete the rows, while leaving out the large blob speeds up the process significantly. Therefore I have ruled out that the subquery is the problem.

Why does deletion of large blobs take that long? I assumed that, since VACUUM exists, deletion should be extremely quick regardless of the amount of data in a row, and the actual freeing resources is performed during VACUUM. Is this not the case?


As requested, I ran some EXPLAIN (ANALYZE, BUFFERS) commands. These commands have been executed in order while the server was not doing anything else. It seems like selecting all data has a positive effect on a subsequent DELETE. However, after a DELETION it reverts back to the performance profile I know – somewhere around 10 .. 20 seconds.


// 1. EXPLAIN (ANALYZE, BUFFERS) SELECT

EXPLAIN (ANALYZE, BUFFERS) SELECT 
FROM stuff
WHERE stuff.id IN 
(
    SELECT somewhere_else.stuff_id 
    FROM somewhere_else
    WHERE somewhere_else.stuff_id IS NOT NULL
    ORDER BY somewhere_else.id ASC
    LIMIT 50
)

(somewhere_else.stuff_id is a foreign key referencing stuff.id with ON DELETE SET NULL)

Nested Loop  (cost=2.94..374.55 rows=50 width=0) (actual time=38.829..38.891 rows=50 loops=1)
  Buffers: shared hit=237 read=9
  ->  HashAggregate  (cost=2.65..3.15 rows=50 width=8) (actual time=38.752..38.757 rows=50 loops=1)
        Group Key: somewhere_else.stuff_id
        Buffers: shared hit=88 read=8
        ->  Limit  (cost=0.29..2.03 rows=50 width=16) (actual time=38.718..38.733 rows=50 loops=1)
              Buffers: shared hit=88 read=8
              ->  Index Scan using somewhere_else_pkey on somewhere_else  (cost=0.29..1077.19 rows=30926 width=16) (actual time=38.716..38.727 rows=50 loops=1)
                    Filter: (stuff_id IS NOT NULL)
                    Rows Removed by Filter: 258
                    Buffers: shared hit=88 read=8
  ->  Index Only Scan using stuff_pkey on stuff  (cost=0.29..7.43 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=50)
        Index Cond: (id = somewhere_else.stuff_id)
        Heap Fetches: 50
        Buffers: shared hit=149 read=1
Planning time: 21.320 ms
Execution time: 38.968 ms

// 2. SELECT *

SELECT * -> 25.0 secs

// 3. SELECT id

SELECT id -> 72 msec

// 4. EXPLAIN (ANALYZE, BUFFERS) DELETE -> (5218.935 ms)

Delete on stuff  (cost=2.94..374.55 rows=50 width=38) (actual time=4975.223..4975.223 rows=0 loops=1)
  Buffers: shared hit=129537 read=26057 dirtied=25804 written=6632
  ->  Nested Loop  (cost=2.94..374.55 rows=50 width=38) (actual time=0.114..0.789 rows=50 loops=1)
        Buffers: shared hit=220
        ->  HashAggregate  (cost=2.65..3.15 rows=50 width=40) (actual time=0.106..0.238 rows=50 loops=1)
              Group Key: "ANY_subquery".stuff_id
              Buffers: shared hit=70
              ->  Subquery Scan on "ANY_subquery"  (cost=0.29..2.53 rows=50 width=40) (actual time=0.075..0.094 rows=50 loops=1)
                    Buffers: shared hit=70
                    ->  Limit  (cost=0.29..2.03 rows=50 width=16) (actual time=0.072..0.083 rows=50 loops=1)
                          Buffers: shared hit=70
                          ->  Index Scan using somewhere_else_pkey on somewhere_else  (cost=0.29..1077.19 rows=30926 width=16) (actual time=0.071..0.079 rows=50 loops=1)
                                Filter: (stuff_id IS NOT NULL)
                                Rows Removed by Filter: 308
                                Buffers: shared hit=70
        ->  Index Scan using stuff_pkey on stuff  (cost=0.29..7.43 rows=1 width=14) (actual time=0.007..0.007 rows=1 loops=50)
              Index Cond: (id = "ANY_subquery".stuff_id)
              Buffers: shared hit=150
Planning time: 0.147 ms
Trigger for constraint ob1_fkey: time=7.689 calls=50
Trigger for constraint ob1_fkey: time=2.805 calls=50
Trigger for constraint ob2_fkey: time=46.560 calls=50
Trigger for constraint ob2_fkey: time=0.870 calls=50
Trigger for constraint ob2_fkey: time=127.108 calls=50
Trigger for constraint ob3_fkey: time=8.882 calls=50
Trigger for constraint ob3_fkey: time=3.165 calls=50
Trigger for constraint ob4_fkey: time=4.039 calls=50
Trigger for constraint ob4_fkey: time=1.821 calls=50
Trigger for constraint ob5_fkey: time=3.283 calls=50
Trigger for constraint ob6_fkey: time=0.925 calls=50
Trigger for constraint ob7_fkey: time=24.129 calls=50
Trigger for constraint ob8_fkey: time=0.775 calls=50
Trigger for constraint ob8_fkey: time=0.736 calls=50
Trigger for constraint ob9_fkey: time=7.769 calls=50
Trigger for constraint ob9_fkey: time=2.903 calls=50
Execution time: 5218.935 ms

// 5. DELETE (6.3 secs)

DELETE -> Query returned successfully: 50 rows affected, 6.3 secs execution time.

// 6. DELETE (16.1 secs)

DELETE -> Query returned successfully: 50 rows affected, 16.1 secs execution time.

// 7. EXPLAIN (ANALYZE, BUFFERS) DELETE -> (16194.277 ms)

Delete on stuff  (cost=2.95..374.55 rows=50 width=38) (actual time=15974.063..15974.063 rows=0 loops=1)
  Buffers: shared hit=202551 read=40860 dirtied=40405 written=18909
  ->  Nested Loop  (cost=2.95..374.55 rows=50 width=38) (actual time=171.462..172.284 rows=50 loops=1)
        Buffers: shared hit=299 read=5
        ->  HashAggregate  (cost=2.66..3.16 rows=50 width=40) (actual time=37.040..37.205 rows=50 loops=1)
              Group Key: "ANY_subquery".stuff_id
              Buffers: shared hit=150 read=3
              ->  Subquery Scan on "ANY_subquery"  (cost=0.29..2.53 rows=50 width=40) (actual time=0.148..37.026 rows=50 loops=1)
                    Buffers: shared hit=150 read=3
                    ->  Limit  (cost=0.29..2.03 rows=50 width=16) (actual time=0.144..37.013 rows=50 loops=1)
                          Buffers: shared hit=150 read=3
                          ->  Index Scan using somewhere_else_pkey on somewhere_else  (cost=0.29..1083.99 rows=31046 width=16) (actual time=0.143..37.008 rows=50 loops=1)
                                Filter: (stuff_id IS NOT NULL)
                                Rows Removed by Filter: 508
                                Buffers: shared hit=150 read=3
        ->  Index Scan using stuff_pkey on stuff  (cost=0.29..7.43 rows=1 width=14) (actual time=2.697..2.697 rows=1 loops=50)
              Index Cond: (id = "ANY_subquery".stuff_id)
              Buffers: shared hit=149 read=2
Planning time: 0.157 ms
Trigger for constraint ob1_fkey: time=6.858 calls=50
Trigger for constraint ob1_fkey: time=2.708 calls=50
Trigger for constraint ob2_fkey: time=40.692 calls=50
Trigger for constraint ob2_fkey: time=0.784 calls=50
Trigger for constraint ob2_fkey: time=137.614 calls=50
Trigger for constraint ob3_fkey: time=8.445 calls=50
Trigger for constraint ob3_fkey: time=2.811 calls=50
Trigger for constraint ob4_fkey: time=3.084 calls=50
Trigger for constraint ob4_fkey: time=1.554 calls=50
Trigger for constraint ob5_fkey: time=2.817 calls=50
Trigger for constraint ob6_fkey: time=0.811 calls=50
Trigger for constraint ob7_fkey: time=1.009 calls=50
Trigger for constraint ob8_fkey: time=0.682 calls=50
Trigger for constraint ob8_fkey: time=0.671 calls=50
Trigger for constraint ob9_fkey: time=6.880 calls=50
Trigger for constraint ob9_fkey: time=2.555 calls=50
Execution time: 16194.277 ms

// 8. EXPLAIN (ANALYZE, BUFFERS) DELETE -> (12959.783 ms)

Delete on stuff  (cost=2.95..374.55 rows=50 width=38) (actual time=12741.876..12741.876 rows=0 loops=1)
  Buffers: shared hit=202521 read=40876 dirtied=40412 written=19353
  ->  Nested Loop  (cost=2.95..374.55 rows=50 width=38) (actual time=0.247..1.043 rows=50 loops=1)
        Buffers: shared hit=288 read=2 dirtied=4
        ->  HashAggregate  (cost=2.66..3.16 rows=50 width=40) (actual time=0.230..0.396 rows=50 loops=1)
              Group Key: "ANY_subquery".stuff_id
              Buffers: shared hit=138 read=2 dirtied=3
              ->  Subquery Scan on "ANY_subquery"  (cost=0.29..2.53 rows=50 width=40) (actual time=0.197..0.217 rows=50 loops=1)
                    Buffers: shared hit=138 read=2 dirtied=3
                    ->  Limit  (cost=0.29..2.03 rows=50 width=16) (actual time=0.192..0.205 rows=50 loops=1)
                          Buffers: shared hit=138 read=2 dirtied=3
                          ->  Index Scan using something_else_pkey on something_else  (cost=0.29..1083.99 rows=31046 width=16) (actual time=0.191..0.200 rows=50 loops=1)
                                Filter: (stuff_id IS NOT NULL)
                                Rows Removed by Filter: 558
                                Buffers: shared hit=138 read=2 dirtied=3
        ->  Index Scan using stuff_pkey on stuff  (cost=0.29..7.43 rows=1 width=14) (actual time=0.008..0.008 rows=1 loops=50)
              Index Cond: (id = "ANY_subquery".stuff_id)
              Buffers: shared hit=150 dirtied=1
Planning time: 0.161 ms
Trigger for constraint ob1_fkey: time=6.897 calls=50
Trigger for constraint ob1_fkey: time=2.567 calls=50
Trigger for constraint ob2_fkey: time=41.091 calls=50
Trigger for constraint ob2_fkey: time=0.783 calls=50
Trigger for constraint ob2_fkey: time=123.893 calls=50
Trigger for constraint ob3_fkey: time=7.853 calls=50
Trigger for constraint ob3_fkey: time=2.814 calls=50
Trigger for constraint ob4_fkey: time=3.616 calls=50
Trigger for constraint ob4_fkey: time=1.639 calls=50
Trigger for constraint ob5_fkey: time=2.889 calls=50
Trigger for constraint ob6_fkey: time=0.818 calls=50
Trigger for constraint ob7_fkey: time=11.826 calls=50
Trigger for constraint ob8_fkey: time=0.691 calls=50
Trigger for constraint ob8_fkey: time=0.669 calls=50
Trigger for constraint ob9_fkey: time=7.008 calls=50
Trigger for constraint ob9_fkey: time=2.614 calls=50
Execution time: 12959.783 ms

// 9. EXPLAIN (ANALYZE, BUFFERS) DELETE -> (20756.127 ms)

Delete on stuff  (cost=2.95..374.55 rows=50 width=38) (actual time=20577.753..20577.753 rows=0 loops=1)
  Buffers: shared hit=203238 read=40998 dirtied=40538 written=27112
  ->  Nested Loop  (cost=2.95..374.55 rows=50 width=38) (actual time=0.224..1.037 rows=50 loops=1)
        Buffers: shared hit=298 read=2
        ->  HashAggregate  (cost=2.66..3.16 rows=50 width=40) (actual time=0.210..0.382 rows=50 loops=1)
              Group Key: "ANY_subquery".stuff_id
              Buffers: shared hit=148 read=2
              ->  Subquery Scan on "ANY_subquery"  (cost=0.29..2.53 rows=50 width=40) (actual time=0.179..0.198 rows=50 loops=1)
                    Buffers: shared hit=148 read=2
                    ->  Limit  (cost=0.29..2.03 rows=50 width=16) (actual time=0.175..0.186 rows=50 loops=1)
                          Buffers: shared hit=148 read=2
                          ->  Index Scan using something_else_pkey on something_else  (cost=0.29..1083.99 rows=31046 width=16) (actual time=0.174..0.181 rows=50 loops=1)
                                Filter: (stuff_id IS NOT NULL)
                                Rows Removed by Filter: 608
                                Buffers: shared hit=148 read=2
        ->  Index Scan using stuff_pkey on stuff  (cost=0.29..7.43 rows=1 width=14) (actual time=0.009..0.009 rows=1 loops=50)
              Index Cond: (id = "ANY_subquery".stuff_id)
              Buffers: shared hit=150
Planning time: 0.157 ms
Trigger for constraint ob1_fkey: time=6.780 calls=50
Trigger for constraint ob1_fkey: time=2.557 calls=50
Trigger for constraint ob2_fkey: time=40.069 calls=50
Trigger for constraint ob2_fkey: time=0.788 calls=50
Trigger for constraint ob2_fkey: time=96.935 calls=50
Trigger for constraint ob3_fkey: time=7.845 calls=50
Trigger for constraint ob3_fkey: time=2.857 calls=50
Trigger for constraint ob4_fkey: time=3.181 calls=50
Trigger for constraint ob4_fkey: time=1.561 calls=50
Trigger for constraint ob5_fkey: time=2.812 calls=50
Trigger for constraint ob6_fkey: time=0.806 calls=50
Trigger for constraint ob7_fkey: time=0.976 calls=50
Trigger for constraint ob8_fkey: time=0.687 calls=50
Trigger for constraint ob8_fkey: time=0.681 calls=50
Trigger for constraint ob9_fkey: time=6.988 calls=50
Trigger for constraint ob9_fkey: time=2.612 calls=50
Execution time: 20756.127 ms

// 10. EXPLAIN (ANALYZE, BUFFERS) DELETE -> (20129.538 ms)

Delete on stuff  (cost=2.95..374.55 rows=50 width=38) (actual time=19294.223..19294.223 rows=0 loops=1)
  Buffers: shared hit=203374 read=41019 dirtied=40558 written=26606
  ->  Nested Loop  (cost=2.95..374.55 rows=50 width=38) (actual time=68.324..69.147 rows=50 loops=1)
        Buffers: shared hit=380 read=4
        ->  HashAggregate  (cost=2.66..3.16 rows=50 width=40) (actual time=0.236..0.401 rows=50 loops=1)
              Group Key: "ANY_subquery".stuff_id
              Buffers: shared hit=231 read=3
              ->  Subquery Scan on "ANY_subquery"  (cost=0.29..2.53 rows=50 width=40) (actual time=0.204..0.223 rows=50 loops=1)
                    Buffers: shared hit=231 read=3
                    ->  Limit  (cost=0.29..2.03 rows=50 width=16) (actual time=0.200..0.211 rows=50 loops=1)
                          Buffers: shared hit=231 read=3
                          ->  Index Scan using something_else_pkey on something_else  (cost=0.29..1083.99 rows=31046 width=16) (actual time=0.199..0.206 rows=50 loops=1)
                                Filter: (stuff_id IS NOT NULL)
                                Rows Removed by Filter: 658
                                Buffers: shared hit=231 read=3
        ->  Index Scan using stuff_pkey on stuff  (cost=0.29..7.43 rows=1 width=14) (actual time=1.370..1.370 rows=1 loops=50)
              Index Cond: (id = "ANY_subquery".stuff_id)
              Buffers: shared hit=149 read=1
Planning time: 0.157 ms
Trigger for constraint ob1_fkey: time=6.715 calls=50
Trigger for constraint ob1_fkey: time=2.524 calls=50
Trigger for constraint ob2_fkey: time=40.881 calls=50
Trigger for constraint ob2_fkey: time=0.810 calls=50
Trigger for constraint ob2_fkey: time=752.259 calls=50
Trigger for constraint ob3_fkey: time=7.940 calls=50
Trigger for constraint ob3_fkey: time=2.897 calls=50
Trigger for constraint ob4_fkey: time=3.566 calls=50
Trigger for constraint ob4_fkey: time=1.672 calls=50
Trigger for constraint ob5_fkey: time=2.901 calls=50
Trigger for constraint ob6_fkey: time=0.841 calls=50
Trigger for constraint ob7_fkey: time=1.034 calls=50
Trigger for constraint ob8_fkey: time=0.702 calls=50
Trigger for constraint ob8_fkey: time=0.681 calls=50
Trigger for constraint ob9_fkey: time=6.994 calls=50
Trigger for constraint ob9_fkey: time=2.646 calls=50
Execution time: 20129.538 ms

In case that is helpful, here are some typical data lengths:

SELECT id, length(data) ...


id       length
12738    6471376
12757    6471376
12734    6471376
12770    5813688
12744    6471376
12775    5813688
12733    6471376
12779    5813688
12765    5813688
12769    5813688
12768    5813688
12760    6471376
12753    6471376
12762    6471376
12759    6471376
12751    6471376
12750    6471376
12771    5813688
12749    6471376
12747    6471376
12752    6471376
12739    6471376
12754    6471376
12732    6471376
12773    5813688
12731    6471376
12746    6471376
12743    6471376
12777    5813688
12741    6471376
12737    6471376
12778    5813688
12761    6471376
12740    6471376
12755    6471376
12763    5813688
12772    5813688
12742    6471376
12766    5813688
12776    5813688
12745    6471376
12780    5813688
12767    5813688
12764    5813688
12758    6471376
12736    6471376
12748    6471376
12756    6471376
12735    6471376
12774    5813688

Best Answer

The vacuum itself operates on each table in isolation. Propagation of the deletion marker from the main table down to the TOAST table happens at deletion time, not at VACUUM time.

Perhaps it theoretically could be done some other way, but this is how it works.