Linux – Does RAID1 increase performance with Linux mdadm

Tags: linux, mdadm, performance, raid, raid-1

I have a cheap 2-bay NAS with a 2TB HDD. To be robust against disk failure, I'm thinking of buying a second 2TB HDD and putting it in RAID1 with Linux mdadm. The file system is ext4.
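For concreteness, the setup I have in mind would be roughly along these lines (the device names are just placeholders for my actual disks):

# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
# mkfs.ext4 /dev/md0
# mount /dev/md0 /mnt/data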

Will this increase or decrease the performance of the NAS? What about just read or write performance?

There seem to be lots of opinions about this online but no consensus.

Thanks.

Edit:

So far I've got three different answers: "a fair bit faster", "you won't notice" and "will decrease the performance if anything". (I am interested primarily in read performance.) Wikipedia says "the read performance can go up roughly as a linear multiple of the number of copies". Which one is it?

Edit 2:

I've found mounting evidence in support of RAID1 increasing read performance, including the MD manpage:

Changes are written to all devices in parallel. Data is read from any one device. The driver attempts to distribute read requests across all devices to maximise performance.

I also discovered MD's RAID10 with --layout=f2, which provides the redundancy of RAID1 with the read performance of RAID0, and can be used with just two drives. Write performance is reduced, however, as a sequential write involves both drives seeking back and forth between distant parts of the disk. See man md for details.
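For reference, a two-drive far-layout array like that would be created with something along these lines (again, placeholder device names):

# mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=2 /dev/sda /dev/sdb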

Best Answer

Yes, the Linux implementation of RAID1 speeds up disk read operations by a factor of two as long as two separate disk read operations are performed at the same time. That means reading one 10 GB file won't be any faster on RAID1 than on a single disk, but reading two distinct 10 GB files will be faster.

To demonstrate it, just read some data with dd. Before each measurement, clear the disk read cache with sync && echo 3 > /proc/sys/vm/drop_caches; otherwise the reads will be served from the page cache and look unrealistically fast.

Single file:

# COUNT=1000; dd if=/dev/md127 of=/dev/null bs=10M count=$COUNT &
(...)
10485760000 bytes (10 GB) copied, 65,9659 s, 159 MB/s

Two files:

# COUNT=1000; dd if=/dev/md127 of=/dev/null bs=10M count=$COUNT & dd if=/dev/md127 of=/dev/null bs=10M count=$COUNT skip=$COUNT &
(...)
10485760000 bytes (10 GB) copied, 64,9794 s, 161 MB/s
10485760000 bytes (10 GB) copied, 68,6484 s, 153 MB/s

Reading 10 GB of data took about 66 seconds, whereas reading 10 GB + 10 GB = 20 GB of data took 68.7 seconds in total, which means concurrent disk reads benefit greatly from RAID1 on Linux. The skip=$COUNT part is very important: it makes the second process read its 10 GB starting at a 10 GB offset, so the two reads don't overlap.
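If you want to watch the reads actually being spread across both members while the two dd processes run, something like this should show both underlying disks busy (substitute the member device names of your own array):

# iostat -x 1 sda sdb

With a single dd you should typically see most of the read traffic land on only one of the two disks at any given moment.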

Jared's answer and ssh's comments referring to http://www.unicom.com/node/459 are wrong. The benchmark there appears to show that disk reads don't benefit from RAID1, but the test was performed with the bonnie++ benchmarking tool, which doesn't perform two separate reads at the same time. The author explicitly states that bonnie++ is not suitable for benchmarking RAID arrays (refer to its readme).
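If you prefer a purpose-built tool over raw dd, fio can issue the same two non-overlapping sequential reads in one invocation; a rough equivalent of the test above would be something like this (the md device name and sizes are assumptions matching the dd example):

# fio --name=raid1-read --filename=/dev/md127 --rw=read --bs=10M --size=10G --numjobs=2 --offset_increment=10G --direct=1 --group_reporting

With numjobs=2 and offset_increment=10G the two jobs read disjoint 10 GB regions in parallel, which is exactly the kind of workload that lets RAID1 use both disks at once; --direct=1 bypasses the page cache so you don't need to drop caches by hand.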
