I have two identical servers (in terms of hardware), both standard installations of Windows Server 2008 R2 with minimal software installed (basically my code and required stuff like the JVM).
On one server I am running SQL Server 2005; on the other, PostgreSQL 9.1. The difference in performance between these two servers is staggering, and it's so bad on PostgreSQL that I'm regretting my initial "let's use PostgreSQL instead of paying for the SQL Server license" speech to my boss. We're talking differences of 30 seconds vs. 15 minutes for the same command, and it's not just this one command — it's any query or command I throw at it. They both have pretty much the same data (records were inserted in a different order), and both databases have the exact same structure, indexes, etc.
But I'm hoping it's just a matter of performance tuning. The thing is, SQL Server is using pretty much all 32 GB of RAM on its server, whereas PostgreSQL is using next to nothing — definitely less than a gig, though I haven't actually measured it in fine detail.
How do I get PostgreSQL to use 20+ GB of RAM? These servers were built specifically for this database work, so any RAM not in use by the database and supporting processes is wasted, in my opinion.
Best Answer
There are many tweakable constants, initialised via `postgresql.conf`. The most important ones are:

- `max_connections`: the number of concurrent sessions
- `work_mem`: the maximal amount of memory to be used for intermediate results such as hash tables, and for sorting
- `shared_buffers`: the amount of memory dedicated to 'pinned' buffer space
- `effective_cache_size`: the amount of memory assumed to be used by the OS's LRU buffers
- `random_page_cost`: an estimate for the relative cost of disk seeks

`max_connections` should not be set higher than needed: connections cost resources even when idle, and in most cases a connection spends more time waiting inside the database than outside it, so a low setting costs little (at the price of some concurrency). A nice rule-of-thumb formula is "number of spindles + number of processors + X".

`work_mem` is tricky: it can be applied to every subquery, so a query with 5 hash joins might cost 5 * `work_mem`. And for worst-case scenarios you should also think of multiple sessions consuming this amount (again a reason to keep `max_connections` low).

`shared_buffers` is (IMHO) overrated. Normally it is advised to set it to about 1/4 to 1/2 of all available "free" memory, but I tend to keep it low and instead set `effective_cache_size` to all available "free" memory.

`random_page_cost` is the cost of a seek+read on the disk, relative to `seq_page_cost`, which is 1. The default (4) for `random_page_cost` is set too high for modern machines and network storage; normally it can be lowered to between 2 and 1.x. For SSD disks you could even set it to 1.0, since seeking is almost free on SSDs.
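To make the above concrete, here is a sketch of what these settings might look like in `postgresql.conf` for a dedicated 32 GB server. The specific numbers are illustrative assumptions following the advice above (low `shared_buffers`, large `effective_cache_size`), not recommendations — tune them against your own workload:

```ini
# postgresql.conf — hypothetical starting point for a dedicated 32 GB machine
max_connections = 50            # keep as low as the application allows
shared_buffers = 2GB            # deliberately modest, per the advice above
work_mem = 64MB                 # per sort/hash operation, per session — multiply out before raising
effective_cache_size = 24GB     # tells the planner the OS file cache is large
random_page_cost = 1.5          # below the default 4 for fast or SSD storage
```

Note that `shared_buffers` and `max_connections` only take effect after a server restart, while `work_mem` can also be raised for a single session (e.g. `SET work_mem = '256MB';` before one big reporting query) without touching the global setting.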