PostgreSQL Performance – ‘Could Not Write to Hash-Join Temporary File: No Space Left on Device’

performancepostgresqlpostgresql-9.1postgresql-performancetemporary-tables

I was getting this error a while ago and I was curious about two things –

1- How do the internals of postgres work that cause this message to occur in the first place? What is the hash-join in reference to the creation of the temporary file that the db creates? I know it's related to creating a temp file that is out of scope of the memory allocation for the database but I don't know why it's called a hash-join.

2- How can I predict and prevent errors like this (related to performance and memory) occurring in the future?

Best Answer

1- How do the internals of postgres work that cause this message to occur in the first place?

When joining two very large data sets, one strategy PostgreSQL can use is to scan one table and hash its join columns. Then scan the other table and compare the hashes of the join columns to hashes calculated for the second table.

See https://en.wikipedia.org/wiki/Hash_join

It works really well when you're joining a small table against a much larger one, especially where no indexes suit the join conditions.

PostgreSQL can use a hash join strategy for bigger tables where it writes the hash table partitions out to disk first. This is presumably what you got.

What is the hash-join in reference to the creation of the temporary file that the db creates?

Hash tables. I don't really understand this question.

2- How can I predict and prevent errors like this (related to performance and memory) occurring in the future?

Have more spare disk space for temporary storage. It doesn't have to be in the main tablespace; see the temp_tablespaces setting.

As far as I know PostgreSQL doesn't have the ability to estimate hash table sizes and decide the hash table will be too big for the available disk space.