As you may have noticed, being a fully managed database-as-a-service product, AWS Relational Database Service (RDS) restricts the execution of userland commands.
I was staging a production database and ended up with my pg_largeobject
table bloated to 40% of the total capacity of the underlying storage volume.
How can I run vacuumlo
(beautifully explained in another DBA.SE question here) on an RDS instance running PostgreSQL?
Best Answer
You can find here (in the GitHub mirror of the official PostgreSQL repo) the sources of
vacuumlo
for the PostgreSQL version I was using at that very moment. You can imagine the rest: I just mimicked what the program does, which is also simply described here.
1. Temporary table construction.
Prepare the temporary table of large object references (OIDs).
1.1. Temporary lob table as copy of full lob table.
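A minimal sketch of this step, assuming a PostgreSQL version recent enough to ship pg_largeobject_metadata (older vacuumlo versions read SELECT DISTINCT loid FROM pg_largeobject instead); the table name vacuum_l is the one vacuumlo itself uses:

    -- Candidate list: one row per existing large object OID.
    -- Step 1.2 will narrow it down to orphans only.
    CREATE TABLE vacuum_l AS
    SELECT oid AS lo FROM pg_largeobject_metadata;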
1.2. Limit temporary lob table.
Run the query that returns all of your database columns typed OID:
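(The following is adapted from vacuumlo's source; it lists every non-system column of type oid or lo, i.e. every place that may legitimately reference a large object.)

    -- All user columns typed oid or lo: each may hold references
    -- to large objects that are still in use.
    SELECT s.nspname, c.relname, a.attname
    FROM pg_class c, pg_attribute a, pg_namespace s, pg_type t
    WHERE a.attnum > 0
      AND NOT a.attisdropped
      AND a.attrelid = c.oid
      AND a.atttypid = t.oid
      AND c.relnamespace = s.oid
      AND t.typname IN ('oid', 'lo')
      AND c.relkind IN ('r', 'm')
      AND s.nspname !~ '^pg_';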
Next you have to execute this query for each row fetched by the earlier one (note ${VARIABLE} is something you should substitute yourself according to your mileage), in order to remove from the temporary table all OIDs of objects actually in use:
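(A sketch mirroring the DELETE vacuumlo issues per column; ${SCHEMA}, ${TABLE} and ${COLUMN} stand for the three fields of each row returned by the previous query.)

    -- Any OID referenced by a live row is in use: keep it out of
    -- the unlink list.
    DELETE FROM vacuum_l
    WHERE lo IN (SELECT "${COLUMN}" FROM "${SCHEMA}"."${TABLE}");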
In my case it was only two tables totalling five columns, and both tables were actually empty, so naturally the five DELETE queries did not do a thing. If you have a bigger OID-enabled subschema, you might need to automate this somehow.

2. Large objects unlinking.
Finally the program declares a cursor that iterates over the remaining lo cells of the temporary table, purging them with a lo_unlink function call.

2.A. Do not do it this way.
I should have automated that with a PL/pgSQL stored procedure, but since I suck at that kind of task, I just issued this one-liner:
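(A sketch of the kind of one-liner I mean, not the literal command; the ${...} connection parameters are yours to fill in. This one just checks how many orphan OIDs are pending:)

    # Count the orphan OIDs still sitting in the temporary table.
    psql -h "${RDS_ENDPOINT}" -U "${DB_USER}" -d "${DB_NAME}" -At \
      -c "SELECT count(*) FROM vacuum_l;"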
Then this other one, which iterates by selecting the first orphan OID in the temporary table of orphan OIDs, unlinking it, and removing it from the table:
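(Again a sketch under the same assumptions; note it pays one round trip, and one transaction, per object.)

    # Unlink orphans one at a time: fetch an OID, lo_unlink() it,
    # then drop its row from the temporary table. Painfully slow but simple.
    while true; do
      OID=$(psql -h "${RDS_ENDPOINT}" -U "${DB_USER}" -d "${DB_NAME}" -At \
        -c "SELECT lo FROM vacuum_l LIMIT 1")
      [ -z "${OID}" ] && break
      psql -h "${RDS_ENDPOINT}" -U "${DB_USER}" -d "${DB_NAME}" \
        -c "SELECT lo_unlink(${OID}); DELETE FROM vacuum_l WHERE lo = ${OID};"
    done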
I knew it was suboptimized as hell, but I let it slowly remove those orphan records. When you are not a DBA at all, such one-liners can be easier to forge than writing meaningful, idiomatic PL/pgSQL.
But this was too slow to leave it like this.
2.B. Do it better than 2.A (though still not a silver bullet).
You will be able to speed up large object unlinking with something simple such as:
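(A minimal PL/pgSQL sketch; the name unlink_orphan_los and the default batch size are my own choices, and it assumes the vacuum_l table from step 1.)

    -- Unlink up to batch_size orphan large objects per call, dropping their
    -- rows from vacuum_l; returns how many objects were actually unlinked.
    CREATE OR REPLACE FUNCTION unlink_orphan_los(batch_size integer DEFAULT 100)
    RETURNS integer AS $$
    DECLARE
      n integer := 0;
      r record;
    BEGIN
      FOR r IN SELECT lo FROM vacuum_l LIMIT batch_size LOOP
        PERFORM lo_unlink(r.lo);
        DELETE FROM vacuum_l WHERE lo = r.lo;
        n := n + 1;
      END LOOP;
      RETURN n;
    END;
    $$ LANGUAGE plpgsql;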
NOTE it is not required to unlink large objects 100 at a time, nor any particular number x of them at a time, but unlinking 100 at a time is the safest baseline applicable to the default memory configuration of every AWS instance size. If you use an excessively large number, you risk the function failing because of insufficiently assigned memory; how far you can push it will depend on the amount of objects to unlink and their size, on the instance type and size, and on the degree of further manual configuration applied.

Such a function is somewhat easy to forge for a non-DBA person; you then call it with something like:
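(A sketch of the call, iterated from bash; unlink_orphan_los is the hypothetical function sketched above.)

    # One transaction per call, 100 unlinks (and 100 locks) per transaction.
    for i in $(seq 1 "${WHATEVER}"); do
      psql -h "${RDS_ENDPOINT}" -U "${DB_USER}" -d "${DB_NAME}" \
        -c "SELECT unlink_orphan_los(100);"
    done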
Where ${WHATEVER} is a constant you construct depending on the size of the temporary large objects table and on the number of locks per transaction your configuration allows. Since I am using RDS defaults but iterating from bash, I guess I do not even have to find out the largest number of lo_unlinks the RDBMS allows with the current max_locks_per_transaction.
3. VACUUM large object tables.

Hinted by this thread in the postgres mailing list, I understood I should VACUUM pg_largeobject after unlinking the objects. I am not sure which of these are the minimum required set, or what the proper time for their execution is, but some of them might be of help, and none should cause any damage: I ran
    VACUUM ANALYZE VERBOSE pg_largeobject_metadata;
    VACUUM ANALYZE VERBOSE pg_largeobject;
    VACUUM FULL ANALYZE VERBOSE pg_largeobject_metadata;
    VACUUM FULL ANALYZE VERBOSE pg_largeobject;
several times while the micro instance was unlinking the objects (it took a really long time, alas): first when about 1/4 of the objects had been unlinked, and a little storage was given back to the OS; second when about 1/3 had been unlinked, and another little bit of storage was given back; and third when about 3/5 had been unlinked, at which point the instance underwent a massive storage giveback to the OS. That massive storage giveback was what I was looking for. After running the query for the largest tables found at the official home page, with less than 3/4 of the total objects unlinked, the objects table had shrunk to less than 3 GiB, far from the initial, and even more bloated, 20 GiB.
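(For reference, a simplified variant of that largest-tables query; the one on the official page is more elaborate. This one deliberately keeps system relations visible so pg_largeobject shows up.)

    -- Top 10 relations by total on-disk size, system tables included,
    -- so pg_largeobject's shrinkage is visible.
    SELECT n.nspname AS schema,
           c.relname AS relation,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind IN ('r', 'm')
    ORDER BY pg_total_relation_size(c.oid) DESC
    LIMIT 10;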
NOTE automating fast VACUUM ANALYZE iterations on a table has the same long-term effect (freeing the excess disk storage the table is using) as executing a single VACUUM FULL, but without acquiring an exclusive lock.
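(A sketch of such automation; the iteration count and sleep interval are arbitrary assumptions. Each VACUUM goes in its own psql call because VACUUM cannot run inside a multi-statement implicit transaction.)

    # Periodic plain VACUUM ANALYZE while the unlink loop runs elsewhere;
    # unlike VACUUM FULL, this takes no ACCESS EXCLUSIVE lock.
    for i in $(seq 1 48); do
      psql -h "${RDS_ENDPOINT}" -U "${DB_USER}" -d "${DB_NAME}" \
        -c "VACUUM ANALYZE pg_largeobject_metadata;"
      psql -h "${RDS_ENDPOINT}" -U "${DB_USER}" -d "${DB_NAME}" \
        -c "VACUUM ANALYZE pg_largeobject;"
      sleep 300
    done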