I have a fairly large PostgreSQL database (with the TimescaleDB extension). It currently consumes about 500 GB on an SSD. Most of the data is time series, and in most cases data older than a few months isn't really interesting.
My idea was to move that older data to a cheap SATA hard drive instead of buying more expensive SSDs. Is that a good idea, and is there an established good practice for implementing it?
My naive implementation would be:
Keep two databases (or create a tablespace on the cheap HDD). Every few hours, copy data from the "fast" (SSD) database to the "slow" (HDD) database. Every few days, delete the copied data from the fast database. Is this a good idea? I am happy to hear feedback and better suggestions.
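Roughly what I have in mind for the tablespace variant (readings and ts are placeholder names for one of my time series tables and its timestamp column, and here the copy and the delete are combined into one statement):

```sql
-- Tablespace on the cheap HDD, plus an archive table stored there
CREATE TABLESPACE slow_hdd LOCATION '/mnt/hdd/pg_tablespace';
CREATE TABLE readings_archive (LIKE readings INCLUDING ALL) TABLESPACE slow_hdd;

-- Run every few hours: move rows older than three months to the archive table
-- (DELETE ... RETURNING feeds the INSERT, so copy and delete stay consistent)
WITH moved AS (
    DELETE FROM readings
    WHERE ts < now() - interval '3 months'
    RETURNING *
)
INSERT INTO readings_archive
SELECT * FROM moved;
```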
Best Answer
Here is a better architecture:
- Create a new tablespace on the slow drive.
- Set the storage parameters seq_page_cost and random_page_cost higher on that new tablespace, so that the PostgreSQL optimizer knows that the disks are slower.
- Partition the big time series tables by time ranges (use the same boundaries for all affected tables), so that you end up with a couple of dozen partitions for each.
- Move the old partitions to the slow tablespace (see the sketch after this list).
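A minimal sketch of those steps, assuming a hypothetical table readings with a timestamptz column ts (adjust names, columns and partition boundaries to your schema):

```sql
-- Tablespace on the slow drive, with cost parameters raised so the planner
-- knows reads from it are more expensive than from the default (SSD) storage
CREATE TABLESPACE slow_hdd LOCATION '/mnt/hdd/pg_tablespace';
ALTER TABLESPACE slow_hdd SET (seq_page_cost = 2.0, random_page_cost = 10.0);

-- Range-partition the time series table by time
CREATE TABLE readings (
    ts     timestamptz      NOT NULL,
    device bigint           NOT NULL,
    value  double precision
) PARTITION BY RANGE (ts);

CREATE TABLE readings_2023_h1 PARTITION OF readings
    FOR VALUES FROM ('2023-01-01') TO ('2023-07-01');
CREATE TABLE readings_2023_h2 PARTITION OF readings
    FOR VALUES FROM ('2023-07-01') TO ('2024-01-01');

-- Move an old partition to the slow tablespace; its indexes can be moved
-- the same way with ALTER INDEX ... SET TABLESPACE
ALTER TABLE readings_2023_h1 SET TABLESPACE slow_hdd;
```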
Then you still have all the data accessible.
Use PostgreSQL v11 or better for partitioning.
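To illustrate the point about accessibility: all the data stays queryable through the parent table, and queries restricted to recent time ranges are pruned to the partitions that still live on the SSD. Continuing the hypothetical readings example from above:

```sql
-- Partition pruning: with a constant predicate on the partition key, the
-- planner scans only the partitions covering that range, so a query over
-- recent data never touches the HDD partitions.
EXPLAIN
SELECT avg(value)
FROM readings
WHERE ts >= '2023-12-01';
```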