PostgreSQL Index Fillfactor – Understanding Index Fillfactor in PostgreSQL

fill-factorindexpostgresqlpostgresql-9.4

I'm trying to understand the index FILLFACTOR parameter on a simple example. I have a table that is generally updated only once in ten days, but the update is quite heavy (requires 5-6 hours).

Now, I tend to drop the index on a column of the table, update the data and then create the index again with FILLFACTOR = 100.

Is that a good practice? I'm not sure about that because I don't understannd the FILLFACTOR concept in general. By default it's 90% and they say that this is needed to perform updates and insertions.

But I tried to create an index with FILLFACTOR = 100 and then update some row and add new one and it executed fine. But I have autovacuum turned off. Running ANALYZE, however required only 3secs.

So, how does the free space left by fillfactor work?

Best Answer

Free space within the table can be used to speed up certain kinds of updates on tables, do inserts faster by avoiding the need to extend the table, and make index growth more efficient.

If your table only changes infrequently and you don't mind reindexing it (or dropping & re-creating the index) each time, that's totally fine. A fillfactor of 100 will save a little bit of I/O when scanning the index, and the spare space is no use to you unless you do inserts/updates on the indexed table.

Related Solutions

Minimum Row Count for Filtered Index in SQL Server 2008

For example, if you have 100,000 rows in a table that gets, say, 75% read operations, would it be wise to add a filtered index that only covers 500 rows? Or 100 rows? What if the filter covers 85,000 of those rows?

That would depend on the queries against the table. If query patterns are such that they target a subset of the 500 or 100 rows that your filtered index covers, perfect. On the flip side, it's unlikely that the optimiser is going to choose a filtered index that includes 85% unless it is a covering index for a particular query.

I haven't tested this but I would expect filtered index utilisation to exhibit the same tipping point behaviour as non-filtered indexes.

postgresql – How to Reclaim Space from a Partially Built Index

Normally we'd expect that when postgres was restarted, the crash recovery process would have removed files related to a rollback'ed index from the data directory.

Let's assume that it didn't work, or at least that it has to be checked manually.

The list of files that should be in the datadir can be established with a query like this:

select pg_relation_filenode(oid)
   from pg_class
  where relkind in ('i','r','t','S','m')
    and reltablespace=0
  order by 1;

reltablespace=0 is for the default tablespace. If the problematic index was created in a non-default tablespace, this 0 must be replaced by its OID in pg_tablespace.

i,r,t,S,m in relkind correspond respectively to indexes, tables, toast space, sequences, materialized views. All these objects have their data in files whose names match pg_relation_filenode(oid).

On disk, the data files are below $PGDATA/base/oid/ where oid is the oid of the database obtained by select oid,datname from pg_database. If we're not talking about the default tablespace, base is replaced by PG_version_somelabel instead.

List and sort the files matching relfilenodes in that directory:

ls | grep -E '^[0-9]+$' | sort -n > /tmp/list-of-relations.txt

(that actually keeps only the first segment for relations that are larger than 1Gb. If there are lingering segments not attached to anything they should be considered separately)

and diff that file with the result of the query above.

If there are lingering data files that do not correspond to any object that the db knows about, they should appear in that diff.

Best Answer

Related Solutions

Minimum Row Count for Filtered Index in SQL Server 2008

postgresql – How to Reclaim Space from a Partially Built Index

Related Question