I will have a huge PostgreSQL 9.3 database with many tables with more than 100M entries per table. This database will be basically be read-only (once I fill all the necessary tables and build the indexes no more write operations on the DB) and single-user access (run and benchmark multiple queries from localhost), since the DB will be used only for research purposes. Queries will always use JOIN on integer DB fields.
I will probably buy a SSD (256-512GB) for this purpose. I have not used an SSD for a DB before, so is there anything I should be afraid of? Can I put the entire DB on the SSD, or just the indexes? Is there any particular advice / tutorial required for tuning PostgreSQL for SSDs? Note, that I have a good workstation with an i7 and 32Gb of RAM, so perhaps you can offer some advice there too.
Best Answer
Not having backups. Like any storage device, it can die. Keep backups.
If the data load is going to take ages, I'd back up the read-only db once I'd done the data load, by stopping it and copying it. That way if something went wrong it'd be easier to re-create later.
If it fits, store the whole DB.
If it doesn't, put a tablespace on the SSD and use it to store the indexes and as many of the heavily queried tables as will fit.
Most of the benefits of SSDs are for OLTP write loads. The main advantage for read-only loads is fast seeks, and slardiere has covered that.
You might want to set
effective_io_concurrency = 5
or something to reflect the fact that SSDs can do fast, heavily pipelined random reads ... but it only affects bitmap index scans, and in practicerandom_page_cost
already incorporates that.For a read-only load it doesn't make a ton of difference.
For the initial data load, see:
Set a big
maintenance_work_mem
for the data load. I'd use at least8GB
.Set a big
work_mem
for the querying work. Appropriate size depends a bit on query complexity. Start with500MB
and go up from there.Bump up your
checkpoint_segments
(massively) for the initial data load.Remember to disable VM overcommit! (see the PostgreSQL manual: http://www.postgresql.org/docs/current/static/kernel-resources.html)