Short:
Longer:
ZFS uses a highly specialized internal structure which does not map well onto the structure of ext4. (Read: perhaps somebody could build an artificial ZFS structure on top of ext4, but I doubt such a conversion would be quick, reliable, or easy to use.)
SnapRAID, however, does not require you to convert anything. Just use it with (nearly) any existing filesystem to create redundancy for the files there, so that you can check and recover files in case of drive failure or (silent) corruption.
Pros/Cons:
SnapRAID is inefficient if it must create redundancy for many small files,
as each file incurs a certain overhead (padding) in the parity.
SnapRAID does not offer compression itself.
On a media server you usually do not need compression, as media is usually
compressed already (MP4, JPG, PDF).
If you happen to use a filesystem which supports compression, you can still use it,
but only at the device level, not across the complete pool (like ZFS does).
SnapRAID does not offer deduplication on a block level.
On a media server the `snapraid dup` feature usually is enough,
as media files normally do not share many duplicate blocks.
(Exception: `youtube-dl`. If you download a video twice at the same quality,
it often differs in only a few bytes. Not always, but quite often.
Just keep the YouTube video ID in the filename to identify two similar files.)
ZFS dedup needs a lot of memory. Plan for 1 GiB RAM per 1 TiB of data, better more!
If there is not enough RAM you need to add a very fast SSD cache device.
ZFS needs to look up 1 random sector per dedup block written, so with "only"
40 kIOP/s on the SSD, you limit the effective write speed to roughly 100 MB/s.
(Usually ZFS is able to utilize the parallel bandwidth of all devices,
so you can easily reach 1 GB/s and more write speed on consumer hardware these
days, but not if you enable dedup and do not have enormous amounts of RAM.)
Note that I never had trouble where SnapRAID was needed to recover data. So I cannot swear that SnapRAID is really able to recover data.
Edit: There has since been enough trouble on my side, and SnapRAID always worked as expected. For example, some time ago a drive went dead and I was able to recover the data. AFAICS the recovery was complete (from the latest sync taken). But such a recovery process can take very long (weeks), and it looks to me that it is not as straightforward as with ZFS, especially if the recovery process must be interrupted and later restarted (with SnapRAID you should know exactly what you are doing).
On ZFS you must plan ahead. You need to know and plan every aspect of the
whole lifecycle of your ZFS drives in advance, before you start with ZFS.
If you can do this, there is no better solution than ZFS, trust me!
But if you forget about something which then happens, unplanned, in the future,
you are doomed. Then you need to restart from scratch with ZFS:
create a second, fresh and independent ZFS pool and transfer all data there.
ZFS supports you in doing so, but you cannot avoid duplicating the data
temporarily, just as you must when you first introduce ZFS.
Administering ZFS is a breeze. The only thing you need to do regularly is
`zpool scrub`. That's all. Then `zpool status` tells you how to fix your trouble.
(More than 10 years of ZFS now, on Linux. Simply put: ZFS is a lifesaver.)
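A minimal sketch of that routine, assuming a pool named `tank` (the pool name is an assumption; substitute your own):

```shell
#!/bin/sh
# Hypothetical maintenance routine for a pool named "tank".
# Run it e.g. monthly from cron; scrub reads back and verifies all data.
zpool scrub tank

# Later, check progress and health; a healthy pool reports "state: ONLINE"
# and "errors: No known data errors".
zpool status tank
```

A cron entry like `0 3 1 * *` (3 a.m. on the first of each month) is a common cadence for home setups.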
OTOH on SnapRAID, you do not need any planning. Just go for it.
And change your structure as-you-go, when the need arises.
So you do not need to copy your data to start with SnapRAID.
Just add a parity drive, configure and here you go.
But SnapRAID is far more difficult to administer in case you are in trouble.
You must learn how to use `snapraid sync`, `snapraid scrub`, `snapraid check`
and `snapraid fix`.
`snapraid status` is a help most times, but often you are left puzzling over
what might be the correct way to fix something, as there is no obvious
single best way (SnapRAID is like a Swiss army knife, but you need to know
yourself how to handle it properly).
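As an illustration of how those commands fit together (the disk name below is hypothetical; it must match a `disk` entry in your snapraid.conf):

```shell
# Routine use: update the parity after files have been added or changed.
snapraid sync

# Periodically re-read a portion of the array and verify it against
# the stored checksums.
snapraid scrub

# After a problem: verify files without modifying anything.
snapraid check

# Recover files, e.g. everything on the failed disk named "d1"
# in snapraid.conf (the name "d1" is an assumption).
snapraid fix -d d1
```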
Note that, on Linux, you have two different choices for ZFS:
- ZFSonLinux, which is a kernel extension.
Newer kernels, such as the one you will see on Ubuntu 20.04, will probably be incompatible.
- ZFS-FUSE, which usually is a bit slower, but is independent of the kernel.
- Both have pros and cons; this is beyond the scope of this answer.
- If ZFS is not available (perhaps you need to repair something),
all your data is inaccessible.
- If a device fails, then depending on the redundancy used,
either all data is fully accessible, or all data is lost completely.
SnapRAID is GPLv3 and runs entirely in userspace as an add-on.
- If SnapRAID is not available, all your data is still kept intact and accessible.
- If a device fails, all data on the other devices is still intact and accessible.
Generally a media server has the property of keeping old data for a long time while growing continually. This is exactly what SnapRAID was designed for.
SnapRAID allows you to add new drives or even new parities later on.
You can mix different filesystems across the drives.
SnapRAID just adds the redundancy.
SnapRAID is not meant as a backup.
For media archives you quite often do not need a backup at all.
ZFS RAIDZ is not meant as a backup either.
However, `zfs send` in combination with `zfs snapshot` offers a
very easy to use 24/7 on-the-fly backup and restore feature.
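A minimal sketch of that pattern, assuming a dataset `tank/media` and a backup pool `backup` (both names are assumptions):

```shell
# Take a point-in-time snapshot; this is nearly instantaneous.
zfs snapshot tank/media@2019-06-01

# Stream the snapshot to another pool (or to a remote host via ssh).
zfs send tank/media@2019-06-01 | zfs recv backup/media

# Later, send only the difference between two snapshots (incremental).
zfs send -i tank/media@2019-06-01 tank/media@2019-07-01 | zfs recv backup/media
```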
ZFS is meant for filesystems where it is crucial that they never have
downtime. Nearly everything can be fixed on the fly, without any downtime.
Downtime only happens in case the redundancy/self-healing of ZFS is no longer
able to repair the damage. But even then, ZFS is more than helpful,
and lists all your lost data for you. Usually.
OTOH SnapRAID can recover data, but this is done in an offline fashion.
So until it is recovered, the data is not available.
SnapRAID is also helpful for finding out which data is lost, but this is more
difficult than with ZFS.
Best practice recommendation with SnapRAID (ZFS is beyond this answer):
- Stay with LVM!
- Make each drive a full PV, with no partition table.
If you want to encrypt the drive, put the PV inside the LUKS container.
- Put each such PV into its own VG.
This way a failing drive does not harm other VGs.
You can aggregate several smaller drives into a VG of similar size.
- Create a bunch of similar-size LVs on all those VGs.
- Create a bunch of parity drives which are bigger than the data drives.
- Leave enough room (100 GB) in the VGs for creating snapshots
and small adjustments of the filesystems.
- At the end of each PV there should be some free room (see "enough room" above).
This is for modern filesystems which create superblock copies at the end.
If you fill the PV completely, such an FS (ZFS) might detect the whole drive
or the PV instead of the LV.
- Create the FS of your choice for your data drives on those LVs.
- Use ZFS for the parity drives on the LVs. No compression/dedup is needed here.
Note that this is not perfect, as ZFS usually creates its filesystems under `/`.
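The per-drive layout above might look roughly like this (device names, sizes, and VG/LV names are all hypothetical):

```shell
# One data drive: whole-disk PV, its own VG, one LV with spare room left.
pvcreate /dev/sdb
vgcreate vg_d1 /dev/sdb
# Leave roughly 100 GB free in the VG for snapshots and adjustments.
lvcreate -L 3.9T -n lv_data vg_d1
mkfs.ext4 /dev/vg_d1/lv_data

# One parity drive, slightly bigger than the data drives:
pvcreate /dev/sdc
vgcreate vg_p1 /dev/sdc
lvcreate -L 4.2T -n lv_parity vg_p1
```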
How to configure and administer SnapRAID is beyond the scope of this answer.
Why ZFS for parity:
Well, when a data drive goes bad (unreadable sectors) you can copy off the readable files. Unreadable files are found easily this way, and you can then recover them.
However, SnapRAID parity is just one big file. If you copy this file, you want
to be sure it has no silent corruption. ZFS ensures this independently of
SnapRAID. In case there is corruption, ZFS tells you so, such that you know
you must check the parity file.
Checking the complete file in case of just a few defective sectors takes ages, as all data on all drives must be read in completely.
Why not BTRFS?
- ZFS has run completely stable, silent and flawless for years already.
- OTOH it's 2019 and BTRFS still has trainloads of serious problems.
- For example, BTRFS reacts unpredictably and unstably if you fill it completely,
and you can fill it completely. In contrast, no such problems are known with ZFS.
- It's likely that you sometimes hit a full parity drive if you are not careful
(SnapRAID is a bit inefficient if it has to put many small files into parity).
Unfortunately, at this point you basically have two good options:
- Destroy and recreate the pool with the intended configuration, then restore your data from a restoration copy
- Get two more drives (minimum same size as each respective one you already have) and expand your pool to two mirrored pairs instead of two single disks
The latter can be performed in-place, and has the bonus of providing you with additional storage space, but requires you to purchase more hardware (which you said in the question that you don't want to do). The former cannot be done in-place, but provides you with a good opportunity to test your restoration strategy (you do have a restoration strategy, I presume?).
As you have found out, it's not possible to remove a JBOD component from a ZFS pool. By `add`ing rather than `attach`ing the new drive, you created a JBOD situation with multiple disks.
If you do go with expanding the pool, I strongly suggest considering expanding into raidz2 instead of two mirrored pairs. You get (essentially) the same usable storage capacity, but the ability to survive the failure of any two of the drives, as opposed to only one per pair. You can create a raidz2 vdev with two sparse files and then delete those files before replacing them with the drives you are migrating data from. That way you can migrate from your current situation of a 2-disk JBOD to a 4-disk RAIDZ2 while adding only two more disks.
Best Answer
You can add devices to a pool after it has been created, however not really in the way you seem to envision.
With ZFS, the only redundant configuration that you can add devices to is the mirror. It is currently not possible to grow a raidzN vdev with additional devices after it has been created. Adding devices to a mirror increases the redundancy but not the available storage capacity.
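For illustration, the difference between growing a mirror and adding a top-level vdev (pool and device names here are hypothetical):

```shell
# Increases redundancy, not capacity: the new disk sdc becomes another
# copy in the mirror vdev that already contains sdb.
zpool attach tank sdb sdc

# Increases capacity, NOT redundancy: creates a new top-level
# (JBOD-style) vdev that the whole pool then depends on.
zpool add tank sdd
```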
It is possible to work around this to some degree, by creating a raidzN vdev of the desired configuration using sparse files for the redundancy devices, then deleting the sparse files before populating the vdev with data. Once you have drives available, you would `zpool replace` the (now non-existent) sparse files with those. The problem with using this approach as more than a migration path toward a more ideal solution is that the pool will constantly show as DEGRADED, which means you have to look much more closely to recognize any actual degradation of the storage; hence, I don't really recommend it as a permanent solution.
Naively adding devices to a ZFS pool actually comes with a serious risk of decreasing the pool's resilience to failure, because all top-level vdevs must be functional in order for the pool to be functional. These top-level vdevs can have redundant configurations, but do not need to; it is perfectly possible to run ZFS in a JBOD-style configuration, in which case a single device failure is highly likely to bring down your pool. (A bad idea if you can avoid it, but it still gives you many ZFS capabilities even in a single-drive setup.) Basically, a redundant ZFS pool is made up of a JBOD combination of one or more redundant vdevs; a non-redundant ZFS pool is made up of a JBOD combination of one or more JBOD vdevs.
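A sketch of the sparse-file trick (file paths, sizes, pool and device names are all assumptions; the sparse files must be at least as large as the real drives that will later replace them):

```shell
# Create two sparse files to stand in for the drives you don't have yet.
truncate -s 4T /tmp/fake0 /tmp/fake1

# Build the raidz2 vdev from two real disks and the two sparse files.
zpool create tank raidz2 sda sdb /tmp/fake0 /tmp/fake1

# Take the fake devices offline and delete them; the pool stays usable
# but shows as DEGRADED.
zpool offline tank /tmp/fake0
zpool offline tank /tmp/fake1
rm /tmp/fake0 /tmp/fake1

# Later, once the real drives are free, replace the fakes with them:
zpool replace tank /tmp/fake0 sdc
zpool replace tank /tmp/fake1 sdd
```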
Adding top-level vdevs also doesn't cause ZFS to balance the data onto the new devices; it eventually happens for data that gets rewritten (because of the file system's copy-on-write nature and its favoring of vdevs with more free space), but it doesn't happen for data that just sits there and is read but never rewritten. You can make it happen by rewriting the data (for example through use of `zfs send | zfs recv`, assuming deduplication is not turned on for the pool), but it does require you to take specific action.
Based on the numbers in your post, you have:
- 2 × 4 TB drives
- 4 × 2 TB drives
- approximately 8 TB of data
Since you say that you want a redundant configuration, given these constraints (particularly the set of drives available) I'd probably suggest grouping the drives as mirror pairs. That would give you a pool layout like this:
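Expressed as a `zpool create` invocation, such a layout might look like this (device names are hypothetical: sda through sdd standing for the 2 TB drives, sde and sdf for the 4 TB drives):

```shell
# Three two-way mirrors: mirror-0 and mirror-1 from the 2 TB drives
# (2 TB usable each), mirror-2 from the 4 TB drives (4 TB usable).
zpool create tank \
    mirror sda sdb \
    mirror sdc sdd \
    mirror sde sdf
```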
This setup will have a user-accessible storage capacity of approximately 8 TB, give or take metadata overhead (you have two mirrors providing 2 TB each, plus one mirror providing 4 TB, for a total of 8 TB). You can add more mirror pairs later to increase the pool capacity, or replace a pair of 2 TB drives with 4 TB drives (though be aware that resilvering in case of a drive failure in a mirror pair puts severe stress on the remaining drive(s), in the case of two-way mirrors greatly increasing the risk of complete failure of the mirror). The downside of this configuration is that the pool will be practically full right from the beginning, and the general suggestion is to keep ZFS pools below about 75% full. If your data is mostly only ever read, you can get away with less margin, but performance will suffer greatly particularly on writes. If your dataset is write-heavy, you definitely want some margin for the block allocator to work with. So this configuration will "work", for some definition of the word, but will be suboptimal.
Since you can freely add additional mirror devices to a vdev, with some planning it should be possible to do this in such a way that you don't lose any of your data.
You could in principle replace mirror-0 and mirror-1 above with a single raidz1 vdev eventually made up of the four 2 TB HDDs (giving you 6 TB usable storage capacity rather than 4 TB, and the ability to survive any one 2 TB HDD failure before your data is at risk), but that means committing three of those drives initially to ZFS. Given your usage figures it sounds like this might be possible with some shuffling data around. I wouldn't recommend mixing vdevs of different redundancy levels, though, and I think the tools even force you in that case to say effectively "yes, I really know what I'm doing".
Mixing different sized drives in a pool (and especially in a single vdev, except as a migration path to larger-capacity drives) is not really recommended; in both mirror and raidzN vdev configurations, the smallest constituent drive in the vdev determines the vdev capacity. Mixing vdevs of different capacity is doable but will lead to an unbalanced storage setup; however, if most of your data is rarely read, and when read is read sequentially, the latter should not present a major problem.
The best configuration would probably be to get an additional three 4 TB drives, then create a pool made up of a single raidz2 vdev made up of those five 4 TB drives, and effectively retire the 2 TB drives. Five 4 TB drives in raidz2 will give you 12 TB of storage capacity (leaving a good bit of room to grow) and raidz2 gives you the ability to survive the failure of any two of those drives, leaving the mirror setup in the dust in terms of resilience to disk problems. With some planning and data shuffling, it should be easy to migrate to such a setup with no data loss. Five drive raidz2 is also near optimal in terms of storage overhead according to tests performed by one user and published on the ZFS On Linux discussion list back in late April, showing a usable storage capacity at 96.4% of optimal when using 1 TB devices, beaten only by a six-drives-per-vdev configuration which gave 97.3% in the same test.
I do realize that five 4 TB drives might not be practical in a home setting, but keep in mind that ZFS is an enterprise file system, and many of its limitations (particularly in this case, the limitations on growing redundant vdevs after creation) reflect that.
And always remember, no type of RAID is backup. You need both to be reasonably secure against data loss.