How to Determine Optimal Value for pg_stat_ramdisk_size on Amazon RDS

amazon-rdspostgresql

Amazon announced they support a RAM disk for the stats_temp_directory in PostgreSQL on RDS.

This sounds like a good idea so I want to enable it. Now I'm wondering how to determine the optimal size for the RAM disk. Is there a way to show the current size of the directory on disk? The file system of PostgreSQL on RDS is not accessible, isn't it?
Also, what happens if the RAM disk is configured too small?

Best Answer

The pg_stat_ramdisk_size controls the location and size of the stats_temp_directory on RAM. It's easy to determine the size on bare metal or out of the RDS byt issuing a du -h over the directory specified on the variable. There is no way yet to access this folder size at RDS, will try update here if any feature is added for this.

The used space on this folder, will depend strictly on the settings you have for track_* variables (See 27.2.1. Statistics Collection Configuration ). It is not necessary a huge amount of space, although is very write intensive as it holds the counters per global and database basis (counters over database per table/index/functions). So this also means that is more related to the amount of databases and tables+indexes (as pgstat_write_statsfiles at postmaster/pgstat.c iterates, writing the PgStat_StatDBEntry) than the amount of data.

You don't want to have it smaller(as a crash will happen), so you may want to start with a value over the 256MB.

Update:

Per configuration, it won't let you set pg_stat_ramdisk_size higher than 1GB.

Update 2:

PGSTAT_FILE_FORMAT_ID 1 byte

PgStat_StatTabEntry = 164 bytes

PgStat_StatFuncEntry = 28 bytes

closingChar = 'E'

describers is a char (T or F in this case)

First of all, as it'll explained later, not all the tables, indexes and functions are written on the db statsfile. Basically, a basic formula will be _SizeOfDBStatFile = PGSTAT_FILE_FORMAT_ID + describers + (tableCount * PgStat_StatTabEntry) + (funcCount * PgStat_StatFuncEntry) + closingChar_.

A select count(*) * 164 "size in bytes" from pg_class where relkind IN ('r','i','S') will give you the estimate for the tables if all of them were flushed on the file. Also, you need to do the same within pg_proc, but instead the factor will be 28 bytes.

This file is one per database.

PgStat_StatDBEntry = 180 bytes

PgStat_GlobalStats = 92 bytes

PgStat_ArchiverStats = 114 bytes

describer is a char (D)

The global statfile is smaller, and contains only the global stats and the counters across databases. Should be something close to _PGSTAT_FILE_FORMAT_ID +describer + PgStat_GlobalStats + PgStat_ArchiverStats + (PgStat_StatDBEntry + describer) * numDatabases_.

So, as you can see, the limitation imposed by AWS in regarding is way above the amount of data held on this directory in most of the databases that can run inside RDS expectations.

Reference