I have tested this with ZFS, and write performance is about half what it should be, because ZFS stripes reads and writes across all vdevs (thereby splitting I/O between several partitions on the same disk). Write speed is therefore limited by the disk holding the most partitions. Read speed seems to be equal to the full disk bandwidth. Note that a pair of ZFS partitions on two disks has roughly double the read speed of either single disk, because ZFS can read from both disks in parallel.
Using MD LINEAR arrays or LVM to create the two halves gives twice the write performance of the ZFS proposal above, but has the disadvantage that LVM and MD have no idea where the data actually lives. In the event of a disk failure or upgrade, one side of the array must be entirely destroyed and resynced/resilvered, followed by the other side (i.e. the resync/resilver has to copy 2× the size of the array).
It therefore seems that the optimal solution is to create a single ZFS mirror vdev across two LVM or MD LINEAR devices which combine the disks into equal-sized "halves". This has roughly twice the read bandwidth of any one disk, and write bandwidth equal to that of a single disk.
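As a sketch, assuming four disks where sda+sdb form one half and sdc+sdd the other (hypothetical device names — substitute your own), the MD LINEAR + ZFS mirror setup might look like:

```shell
# Concatenate each pair of disks into one linear (non-striped) MD device
mdadm --create /dev/md0 --level=linear --raid-devices=2 /dev/sda /dev/sdb
mdadm --create /dev/md1 --level=linear --raid-devices=2 /dev/sdc /dev/sdd

# Mirror the two equal-sized halves with ZFS
zpool create tank mirror /dev/md0 /dev/md1
```

The two halves don't have to be exactly equal; ZFS will simply use the size of the smaller one.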
Using BTRFS raid1 instead of ZFS also works, but has half the read bandwidth: ZFS distributes its reads to double the bandwidth, while BTRFS apparently does not (according to my tests). BTRFS has the advantage that partitions can be shrunk, which ZFS cannot do (so if you have lots of empty space after a failure, with BTRFS it's possible to rebuild a smaller redundant array by shrinking the filesystem and then rearranging the disks).
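For instance, the shrink-and-rearrange step might look like this (hypothetical mount point and device name):

```shell
# Shrink the BTRFS filesystem to free up space at the end
btrfs filesystem resize -500g /mnt/array

# Remove a device; BTRFS migrates its data onto the remaining disks
btrfs device delete /dev/sdd /mnt/array
```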
This is tedious to do by hand but easy with some good scripts.
What happens if I have to change the underlying hardware behind the
zfs pool? Like the mobo / processor, what happens if that dies on me
in a year or two; can I port my zfs pool somehow?
A ZFS pool is not hardware dependent. Just make sure your HBA (Host Bus Adapter) isn't doing something like encrypting your data at the hardware level. ZFS works best with an HBA like an LSI 9211-8i or an IBM M1015 cross-flashed to use the 9211-8i firmware, not a full-blown "hardware" RAID card.
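Moving a pool to new hardware is typically just an export/import, e.g. (the pool name tank is a placeholder):

```shell
# On the old machine: cleanly detach the pool
zpool export tank

# On the new machine: scan the attached disks and bring the pool back up
zpool import tank
```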
I've got quite the set of different sized drives, and i'm trying to
get the most storage space out of it with redundancy. What is the best
setup for this config, and how much space will I be losing by using
these different size drives. I am not creating this for any speed
requirements, I just want a file server for multiple HTPCs. My
currently available drives for this are:
- 1x 500GB 'Hybrid' Drive
- 1x 1TB Drive
- 1x 3TB Drive
- 1x 4TB Drive (will be added to the pool later; currently holding all the data from the drives listed above)
If I were you I would sell the smaller drives and put the money towards larger drives of all the same size. It will make your life a lot easier. Also, you cannot just add drives to a ZFS pool; there are constraints on how a pool can grow.
Will adding the 4TB drive to the pool later be a problem of any kind?
Possibly. I am in a similar position. At some time in the future I will have to increase my storage capacity. At that time I plan on purchasing a second HBA and a new array of larger drives. I will then transfer all the data from my existing drives to my new drives then sell my existing drives. There may be other (cheaper) ways around this, but doing it this way:
- Keeps all of my drives the same size
- Only has the additional cost of an extra HBA, which isn't a bad thing to have lying around anyhow
- Does not require me to replace my drives one at a time, re-silvering after each replacement.
Any recommendations on a Linux OS to run this all on, and should I use
a separate drive for the OS? I'm familiar with Ubuntu, RHEL, and
OpenSUSE / SLES.
Don't use Linux; it does not have native ZFS support. Linux support of ZFS comes from ZFS on Linux and zfs-fuse. The current state of ZFS is in flux as Oracle tries their best to ruin it. ZFS will likely branch at version 28 in the very near future, so don't make your ZFS pool with any version greater than 28 unless you are 100% certain you want to stick with an Oracle solution. Currently FreeBSD and its spinoffs support ZFS version 28.
Since you are a self-proclaimed ZFS noob I would recommend FreeNAS. I have been using it for a while now and I'm pretty happy with it. It will definitely allow the most straightforward setup for you.
Additional Thoughts:
Make sure you choose the correct level of parity for your particular use case. Specifically, make sure you plan around UREs (unrecoverable read errors). Basically, you don't want to use RAID 5 (RAID-Z1) if you are using drives larger than 2TB. There are some other factors to consider that may prompt you to increase your level of parity data as well. Here is a good article on the subject.
Update:
It has been 1.5 years since I posted this answer, and in that time I have been giving ZFS on Linux (Ubuntu Server specifically) another chance. It has come a long way since I first tried it and I'm pretty happy so far. My reason for switching was the installation restrictions on FreeNAS and the jailing system. I wanted to use my server for more than just a NAS server and FreeNAS makes that hard. The jailing system is good and very secure, but I didn't really need that level of security in my home and I didn't want to deal with logging into a jail every time I wanted to unzip a file. I think FreeNAS is still a good choice if you are just getting started with ZFS (because of the web interface) or if you just want a NAS appliance (i.e. no other server functionality needed).
Best Answer
Yes, this is possible. If you read a little on ZFS, you’ll find that it’s basically a pool of so-called “vdevs”. The simplest vdev is a plain physical drive. It can also be a mirror consisting of two or more physical drives. This is what you want.
You’d go for this structure:
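A sketch of the layout, assuming two 6 TB and two 3 TB drives (placeholder sizes chosen to match the 9 TB usable figure; substitute your own drives):

```
zpool
├── mirror vdev
│   ├── 6 TB drive (d1)
│   └── 6 TB drive (d2)
└── mirror vdev
    ├── 3 TB drive (d3)
    └── 3 TB drive (d4)
```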
To create this zpool, use the following command:
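For example, assuming the pool is named tank and the drives appear as d1–d4 (placeholder device names; d1/d2 being the larger pair and d3/d4 the smaller):

```shell
# Create a pool of two mirror vdevs: d1+d2 and d3+d4
zpool create tank mirror d1 d2 mirror d3 d4
```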
This will result in a usable capacity of 9 TB. It can tolerate one drive failure per mirror vdev. (Unless you add more mirrors, of course.)
If you want to add vdevs later, use this command:
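For example, assuming a pool named tank and a new pair of drives d5/d6 (placeholder names):

```shell
# Grow the pool by attaching another mirror vdev
zpool add tank mirror d5 d6
```

Note that vdevs can be added to a pool but not removed, so double-check the command before running it.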
To extend the pool size, first enable the autoexpand option. Then replace one of d3/d4 with a larger drive and wait for it to rebuild. After that, replace the other. The pool should automatically expand to the available drive size.

It might be desirable to turn off autoexpand after the job is done. Alternatively, you can leave autoexpand alone and use the following commands after you replace both drives:
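A sketch of the full sequence, assuming a pool named tank, the existing drives d3/d4 from the layout above, and larger replacements called d3new/d4new (all placeholder names):

```shell
# Let the pool grow once every device in a vdev has been enlarged
zpool set autoexpand=on tank

# Replace each half of the mirror with a larger drive, waiting for the
# resilver to finish in between (check progress with `zpool status`)
zpool replace tank d3 d3new
zpool replace tank d4 d4new

# If autoexpand was left off, tell ZFS to claim the new space manually
zpool online -e tank d3new
zpool online -e tank d4new
```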