PostgreSQL – Storing Historical Data in Database vs On Disk

postgresql

I've been pondering on alternative solutions to speed up data retrieval from my DB.
I currently have a relatively large database, about 1 billion rows and growing. We expect the DB to grow linearly with about 1.5 billion rows (300GB) per years. My DB is PostGreSQL on FreeBSD.

The database contains history sensor data which simply consists of a timestamp, a numeric value and 1 foreign key to a sensor. The data will never be updated, only written and is not accessed very often.

With this amount of data is it's not possible to keep everything in memory, but I still want fast data retrieval if possible. We're currently lookup at almost 6ms per table row lookup time because it's so far spread out over the harddisk that every lookup has maximum seek time.

I've been doing some small scale tests with storing data on disk and this seem a LOT faster. We're looking at 1.8s for a day from the DB versus 0.2ms retrieving it from disk (including value conversions). This is almost 10.000x faster.


Which leads to my question: Is there any particular reason not to store the history data as plain files on disk if I don't care about foreign key consistency?

Best Answer

I think you've answered your own question.

If there's no need for foreign key consistency and you are only ever reading back this data then the difference in search times makes a compelling argument for storing the history data as plain text rather than in a database.