AWS EC2 – How to Utilize Instance Store (SSD) in a Database Environment

amazon ec2, postgresql, Ubuntu

I am trying to set up a database environment with Ubuntu and PostgreSQL (I didn't want to use RDS because I'm an Ubuntu person).

I looked at the description for the R3 instance type:

Use Cases

High performance databases, data mining & analysis, in-memory
databases, distributed web scale in-memory caches, applications
performing real-time processing of unstructured big data, Hadoop/Spark
clusters, and other enterprise applications.

R3

R3 instances are optimized for memory-intensive applications and offer
lower price per GiB of RAM.

Features:

High Frequency Intel Xeon E5-2670 v2 (Ivy Bridge) Processors
SSD Storage
Support for Enhanced Networking

So, I went for the SSD Storage, thinking I could somehow use it for fast database operations.

Now, when I begin to launch an Ubuntu instance, e.g.:

Ubuntu Server 16.04 LTS (HVM), SSD Volume Type – ami-da05a4a0

I realize that I cannot use the SSD storage for the root device hosting PostgreSQL and all the system software. Moreover, the instance store is said to be ephemeral, and its data are lost if the AWS EC2 instance is stopped. The only use case I was able to find is to use the instance store as Linux swap space.

I'm very new to AWS EC2. I'd just like to ask how the SSD instance store should be utilized in a database server (other than as swap).

So far, the use cases I can think of (swap, mounting as /tmp) seem to be a waste of the money spent on reserving the SSD storage.

Best Answer

The ephemeral disk is excellent for exactly the two use cases you mentioned, swap and /tmp. Don't underestimate the performance benefits here: the ephemeral disks are not only SSDs, they're also physically inside the host machine, unlike EBS, which is network-attached, so you get high bandwidth and low latency to the storage device.
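As a rough illustration of the swap and /tmp setup, here is a minimal Python sketch that only builds the shell commands involved and does not run them. The function name and the device paths /dev/xvdb and /dev/xvdc are assumptions for the example; check `lsblk` on your own instance, and note that with a single instance-store volume you would have to partition it or pick one of the two uses.

```python
def ephemeral_setup_commands(swap_device, tmp_device):
    """Return the commands that put swap and /tmp on instance-store devices.

    Both device paths are assumptions supplied by the caller; nothing here
    touches the system, the commands are only returned as lists.
    """
    return [
        ["mkswap", swap_device],         # write a swap signature to the device
        ["swapon", swap_device],         # start swapping on it
        ["mkfs.ext4", tmp_device],       # create a fresh filesystem for /tmp
        ["mount", tmp_device, "/tmp"],   # mount it over /tmp
        ["chmod", "1777", "/tmp"],       # restore the sticky, world-writable mode
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for command in ephemeral_setup_commands("/dev/xvdb", "/dev/xvdc"):
        print(" ".join(command))
```

Because the instance store is wiped when the instance is stopped, something like this would have to run again as root on each start, before anything writes to /tmp.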

Otherwise, their applications are somewhat limited for "normal" database setups, since anything stored on them is lost when the instance is stopped. But where the machine itself is also more ephemeral by nature (QA/dev environments, scalable read replicas), they can offer excellent performance.

If you're the experimenting/creative type, you might toy with a RAID-1 array split between EBS and ephemeral (which could theoretically improve your read throughput without jeopardizing durability).
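For the curious, one way that mirror could be put together is with mdadm, sketched below in Python as a command builder (nothing is executed). The function name and the device paths are hypothetical. The `--write-mostly` flag is my own suggestion rather than part of the idea above: it marks the devices listed after it so that md avoids reading from them when it can, which here would steer reads toward the local SSD.

```python
def raid1_create_command(ephemeral_dev, ebs_dev, md_dev="/dev/md0"):
    """Build an mdadm command for a two-way RAID-1 mirror.

    The EBS device is listed after --write-mostly, so md would prefer the
    ephemeral device for reads while still writing to both members.
    All device paths are assumptions supplied by the caller.
    """
    return [
        "mdadm", "--create", md_dev,
        "--level=1", "--raid-devices=2",
        ephemeral_dev,
        "--write-mostly", ebs_dev,
    ]


if __name__ == "__main__":
    # Hypothetical device names: /dev/xvdb is ephemeral, /dev/xvdf is EBS.
    print(" ".join(raid1_create_command("/dev/xvdb", "/dev/xvdf")))
```

Keep in mind that after a stop/start the ephemeral half would come back empty, so it would presumably have to be re-added to the array and resynced from the EBS copy. I haven't benchmarked this, so treat it as an experiment.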

"seems to be a waste of the money spent on reserving the SSD storage."

Ephemeral disks don't cost extra -- they are included in the price of the instance, so the cost for a given instance type is the same whether you activate them or not.

Incidentally -- always enable them when you launch an instance that includes them, because you can't activate them later.
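If you launch through the API rather than the console, enabling them means listing the instance-store volumes in the block device mapping. Below is a minimal sketch using boto3's `run_instances`; the helper names, the /dev/sdb starting device, and the volume count are assumptions for the example, and the launch function is shown but never called.

```python
def instance_store_mappings(volume_count, first_letter="b"):
    """Map ephemeral0, ephemeral1, ... to /dev/sdb, /dev/sdc, ...

    The starting device letter is an assumption; pick names that do not
    collide with your root or EBS devices.
    """
    return [
        {
            "DeviceName": "/dev/sd" + chr(ord(first_letter) + i),
            "VirtualName": "ephemeral" + str(i),
        }
        for i in range(volume_count)
    ]


def launch_with_instance_store(image_id, instance_type, volume_count):
    """Launch one instance with its instance-store volumes mapped."""
    import boto3  # imported here so the mapping helper works without boto3

    ec2 = boto3.client("ec2")
    return ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        BlockDeviceMappings=instance_store_mappings(volume_count),
    )
```

For example, `instance_store_mappings(2)` yields entries for /dev/sdb and /dev/sdc mapped to ephemeral0 and ephemeral1. How many volumes a given instance size actually has is something to look up in the instance type table first.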