Context: Our web app is a collaboration platform where you can share files, appointments, surveys, and so on. An instance of this web app is divided into communities; each one is independent of the others, but they all share the same PostgreSQL database.
Target: What I'm trying to do is write a shell script that makes regular backups of each community separately. This means I need to select only part of the data inside the various PostgreSQL tables and export/dump that community's data.
Obstacle:
My big problem is with the pg_largeobject table, used to store the BLOBs of files. This table uses OIDs as its primary key (something I don't really like, by the way). The files table (containing file metadata) has a reference to the pg_largeobject OID.
Solutions considered:
- Select all the OIDs from the files table WHERE community_id = something, and afterwards export the data stream from pg_largeobject using [lo_export][1], passing each of the OIDs selected in the previous step:

  ```c
  int lo_export(PGconn *conn, Oid lobjId, const char *filename);
  ```

  The problem with this approach is that when I try to restore the pg_largeobject records, I need to keep the same OIDs or the relation will be broken. I can use the function lo_import_with_oid, which lets me specify the desired OID, but in most cases the OID will already be occupied by the original data, so the import will generate a new OID that will not satisfy the relation with the rest of the database.
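The export half of this first approach can be sketched with PostgreSQL's server-side lo_export function (a sketch only: the files(id, community_id, file_oid) columns and the backup path are assumptions, and the path must be writable by the database server process, since lo_export runs server-side):

```sql
-- Export every large object belonging to one community,
-- naming each dump file after its OID.
SELECT lo_export(f.file_oid, '/var/backups/community_42/' || f.file_oid)
FROM files AS f
WHERE f.community_id = 42;
```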
- I still don't know exactly how I could achieve this, but let me try to explain:
  - During the export I could select and save all the primary keys of the files records of the given community (let's call them filesPKs).
  - I would then perform a [lo_export][1] of pg_largeobject.
  - During the lo_import I could ignore the fact that new OIDs are being generated, but I would somehow save the list of these newly inserted OIDs.
  - As a final step I could update all the rows in the files table (using the filesPKs reference) so that they point to the newly imported pg_largeobject records.

  This second procedure is in my opinion very difficult, because I still don't see how I could update the files rows so that they point to the correct large objects.
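One way to sidestep the remapping difficulty in this second approach is to let the server-side lo_import do the import and the relinking in a single statement, since it returns the freshly assigned OID (a sketch, assuming the same hypothetical files(id, file_oid) columns; :old_oid and :file_pk stand for the values saved at export time):

```sql
-- lo_import returns the new OID, so the import and the update of the
-- files row happen together and the new OID never needs tracking.
UPDATE files
SET file_oid = lo_import('/var/backups/community_42/' || :old_oid)
WHERE id = :file_pk;   -- one of the saved filesPKs
```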
- Another option, which fancifully would be the best one, would be to modify the pg_largeobject table to add a column with a (static) reference to the files table. But after a test it seems to be impossible:

  ```
  testdb=# alter table pg_largeobject add column prova text;
  ERROR:  permission denied: "pg_largeobject" is a system catalog
  ```
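Since system catalogs cannot be altered, the same bookkeeping could instead live in an ordinary table filled at export time, playing the role the extra pg_largeobject column would have played (a sketch; the table and column names are made up):

```sql
-- Ordinary table recording which files row each exported OID belonged to.
CREATE TABLE IF NOT EXISTS lo_backup_map (
    file_id integer PRIMARY KEY,  -- PK of the files row
    old_oid oid NOT NULL          -- OID at export time
);

INSERT INTO lo_backup_map (file_id, old_oid)
SELECT f.id, f.file_oid
FROM files AS f
WHERE f.community_id = 42;
```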
Questions:

- Has anyone else faced this issue?
- Am I just missing something, or taking the wrong approach?
- Is there another way to export/dump pg_largeobject records, exporting all the fields and not only the data stream?
Best Answer
I managed to resolve this issue as follows (second considered solution, revisited):
All in a while loop, for every file: