I'm adding a unique index (concurrently, over the course of hours or maybe even days, on a column of 5 billion bigints, all of them NULL) on a Postgres 9.6 server that's pretty low on disk space. I can see that it's increased disk utilization from 94% to 95%, and I'm not sure if it's set aside all the space it wants, or if it's going to write more to disk later and take down the server over the weekend.
Does Postgres set aside disk space for indexes in advance, use more disk space gradually over the entire course of indexing, or write intermittently?
PostgreSQL Index – Does Adding Concurrently Take Up Disk Space Gradually or All at Once?
disk-spaceindexpostgresqlpostgresql-9.6
Related Question
- PostgreSQL 9.5 BRIN Index – Troubleshooting Slow Performance
- Postgresql – postgres_fdw and postgres version upgrade
- PostgreSQL – Setting Location of Temp Files for Backend (9.3.10)
- PostgreSQL – How to Add an Index on a New Column
- PostgreSQL Indexing – Creating Composite Index on Bigints
- How Does PostgreSQL Store Many Varchars on Disk?
Best Answer
It allocates the space slowly.
First it sorts all the data it needs into temp files, allocating as it goes. Then it builds the index, allocating that as it goes as well. Then it removes the temp files. In the case of a concurrent index, it then sorts a portion of the data again into temp files so that it can detect any concurrently written data it missed the first time, again allocating as it goes. Finally it removes those temp files again, and the index is done. If you have another partition/drive available, you can make the temp files go into separate temp_tablespaces on that drive rather than sharing space with the main files.
Also, you could create a partial index like
create index CONCURRENTLY ON foobar (x) where x is not null;
. Since this column is all NULL, this would have no entries, and it would take almost no space at all--either to build it, or in the final product.