Postgresql – How is it technically possible for PostgreSQL to keep small tables “in RAM”

postgresqlwrite-ahead-logging

I used to ask about performance and people would tell me that PostgreSQL keeps small-enough tables entirely in RAM automatically, as to not stress the HDD and thus enable super-fast reading/writing of "small tables", that is, with very few rows which more than easily fit in the available RAM (or what's assigned to PG).

This made sense to me, and made me very happy as I can efficiently communicate between my parallel-running scripts without stressing my poor HDD/SSD and clogging up all the resources.

But then it struck me: How is this possible? How can PG possibly keep a table all in RAM, regardless of how small it is, without losing data integrity?

If it doesn't write to the permanent storage all the time, how can it possibly recover if the power is suddenly cut, or there's a software crash, or some other disaster happens while it's running? It doesn't add up to me the more I think about it.

The only answer I can think of is: No, it doesn't actually do this. I don't see any possible way that PG could avoid losing data if it never (or rarely) writes the data to the permanent storage.

If this is possible, but only after enabling some kind of flag for the table, or setting, I'd like to know about that flag/setting. I'd like to enable this for tables which only contain "internal communication" data of the kind which has no value besides as a "common data storage" for actively running scripts.

Best Answer

This is possible due to Write-Ahead Logging (WAL):

Briefly, WAL's central concept is that changes to data files (where tables and indexes reside) must be written only after those changes have been logged, that is, after log records describing the changes have been flushed to permanent storage. If we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any changes that have not been applied to the data pages can be redone from the log records.

It's a central component of Postgres' architecture and turned on by default. You can configure a couple of parameters like instructed in the manual chapter WAL Configuration.

Also consider basics on Reliability in the manual.