PostgreSQL Index – Does Adding Concurrently Take Up Disk Space Gradually or All at Once?

disk-spaceindexpostgresqlpostgresql-9.6

I'm adding a unique index (concurrently, over the course of hours or maybe even days, on a column of 5 billion bigints, all of them NULL) on a Postgres 9.6 server that's pretty low on disk space. I can see that it's increased disk utilization from 94% to 95%, and I'm not sure if it's set aside all the space it wants, or if it's going to write more to disk later and take down the server over the weekend.
Does Postgres set aside disk space for indexes in advance, use more disk space gradually over the entire course of indexing, or write intermittently?

Best Answer

It allocates the space slowly.

First it sorts all the data it needs into temp files, allocating as it goes. Then it builds the index, allocating that as it goes as well. Then it removes the temp files. In the case of a concurrent index, it then sorts a portion of the data again into temp files so that it can detect any concurrently written data it missed the first time, again allocating as it goes. Finally it removes those temp files again, and the index is done. If you have another partition/drive available, you can make the temp files go into separate temp_tablespaces on that drive rather than sharing space with the main files.

Also, you could create a partial index like create index CONCURRENTLY ON foobar (x) where x is not null;. Since this column is all NULL, this would have no entries, and it would take almost no space at all--either to build it, or in the final product.