I have a large PostgreSQL database, over 500GB in size, which is too large. Is there any way to compress the database down to a more manageable size? I attempted this with SquashFS, and the database compressed down to 177GB, but PostgreSQL requires write access to its data directory and SquashFS filesystems are read-only. Do more experienced database users have any suggestions for accomplishing this goal?
The database holds GIS data for the planet and will be used locally on a deployed system. Currently it sits on a 1TB SSD, but I am trying to avoid slapping in an additional hard drive simply to accommodate a large database. The database performs as desired with no issues; I would simply like to compress it down to a more manageable size rather than place it on a separate drive.
Best Answer
File system
A very popular way of doing this is at the file system level. Btrfs and ZFS both work underneath the database and support transparent compression. Either can be set up on a loopback device, so you can provide a compressed tablespace without adding another partition. There are caveats, though: if that tablespace fails, it may take your cluster down with it.
ZFS
ZFS is the big one here; it's what I would go for. It offers transparent per-dataset compression (LZ4, or Zstandard in newer OpenZFS releases).
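As a concrete illustration, here is a minimal sketch of the loopback-style approach with ZFS: a pool built on a plain backing file, a compressed dataset, and a tablespace on top. This assumes OpenZFS is installed and you have root; the pool name, paths, sizes, and the example table name are all placeholders.

```shell
# 1. Create a sparse backing file on the existing filesystem.
truncate -s 600G /srv/pgzpool.img

# 2. Build a pool on that file (ZFS supports file-backed vdevs)
#    and create a dataset with transparent LZ4 compression.
zpool create pgpool /srv/pgzpool.img
zfs create -o compression=lz4 -o mountpoint=/srv/pg_compressed pgpool/tablespace

# 3. Hand the mountpoint to PostgreSQL as a tablespace.
chown postgres:postgres /srv/pg_compressed
psql -U postgres -c "CREATE TABLESPACE compressed LOCATION '/srv/pg_compressed';"
# Move a table onto it (planet_osm_polygon is a hypothetical example):
psql -U postgres -c "ALTER TABLE planet_osm_polygon SET TABLESPACE compressed;"
```

Tuning the dataset's recordsize toward PostgreSQL's 8kB block size is often recommended, but test with your own workload before committing.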
Btrfs
Btrfs is a strong contender, but it has been in active development for a very long time, and the reluctance of major distros to adopt it as a default has many people questioning whether it's ready for prime time.
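The same loopback idea sketched for Btrfs, using a loop mount with zstd compression. Again assumes root; paths and sizes are placeholders.

```shell
# Backing file on the existing filesystem, formatted as Btrfs.
truncate -s 600G /srv/pgbtrfs.img
mkfs.btrfs /srv/pgbtrfs.img

# Mount via a loop device; compress=zstd enables transparent
# zstd compression for newly written data.
mkdir -p /srv/pg_compressed
mount -o loop,compress=zstd /srv/pgbtrfs.img /srv/pg_compressed

# Expose it to PostgreSQL as a tablespace.
chown postgres:postgres /srv/pg_compressed
psql -U postgres -c "CREATE TABLESPACE compressed LOCATION '/srv/pg_compressed';"
```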
PostgreSQL
GIS Methods (PostGIS)
For polygons, one method is to simplify geometries by dropping vertices with ST_Simplify. For points, one method is spatial clustering.
Both of these result in a loss of information. PostGIS, like most of the features of the database, doesn't have a transparent "magic compression" option.
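To illustrate both lossy approaches, here is a hedged sketch; the planet_osm_* table names, geometry column, simplification tolerance, and cluster count are assumptions standing in for whatever your actual schema uses:

```sql
-- Polygons: drop vertices within a tolerance of 10 (units follow the SRID).
-- Writing to a new table lets you compare sizes before dropping the original.
CREATE TABLE polygons_simplified AS
SELECT osm_id, ST_Simplify(way, 10) AS way
FROM planet_osm_polygon;

-- Points: collapse dense point sets into clusters, keeping one centroid each.
-- ST_ClusterKMeans is a window function; 1000 is an arbitrary cluster count.
CREATE TABLE points_clustered AS
SELECT cid, ST_Centroid(ST_Collect(way)) AS way
FROM (
  SELECT way, ST_ClusterKMeans(way, 1000) OVER () AS cid
  FROM planet_osm_point
) sub
GROUP BY cid;
```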
cstore_fdw
There is also cstore_fdw, a columnar store that offers compression. It has a different performance profile than regular heap tables, so your mileage may vary.
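A minimal sketch of setting it up, assuming the extension is already built and installed; the column list is a placeholder for your own schema, and you would load data with COPY or INSERT ... SELECT:

```sql
-- cstore_fdw stores rows column-by-column and compresses them (pglz).
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- Hypothetical table layout; foreign tables on this server live in
-- compressed columnar files rather than the normal heap.
CREATE FOREIGN TABLE planet_osm_cstore (
    osm_id  bigint,
    way     geometry
) SERVER cstore_server
  OPTIONS (compression 'pglz');
```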