What are potential consequences of going nuts with sparse-file-based vdevs for an LVM-to-ZFS migration

Tags: data, lvm, software-raid, storage, zfs

I've been playing around with ZFS, having finally accepted that it's mature enough that I shouldn't get burned.

Now I want to migrate the home NAS box, which is currently an LVM JBOD setup that's quite full, though I do have a fair amount of unallocated space on one drive.

I've been experimenting with ZFS using zpools created on sparse files. Replacing the file-based vdevs with physical partitions later seems like a very flexible way forward.

I'm wondering what could happen if I took this idea to the extreme: create sparse-file vdevs the same size as the existing hardware, inside this already very full filesystem, and move the contents of the LVM into the ZFS array.
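For anyone wanting to try this, a minimal sketch of the sparse-file experiment might look like the following (paths and sizes are illustrative, and the `zpool create` step needs root plus ZFS installed, so it is left commented out):

```shell
# Create sparse backing files with the same apparent size as the real drives;
# truncate allocates no blocks up front, so they cost almost nothing initially.
mkdir -p /tmp/zfs-demo
truncate -s 1G /tmp/zfs-demo/vdev0.img
truncate -s 1G /tmp/zfs-demo/vdev1.img

# Confirm sparseness: apparent size in bytes vs 512-byte blocks actually used.
stat -c '%s %b' /tmp/zfs-demo/vdev0.img

# Building the pool on the files needs root and ZFS installed:
#   zpool create demo /tmp/zfs-demo/vdev0.img /tmp/zfs-demo/vdev1.img
```

The files only consume real space as ZFS writes to them, which is exactly why they are tempting to put on an already-full filesystem — and also why running out of underlying space mid-write is the failure mode to fear.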

In theory, as long as the free space on the LVM filesystem stays large enough to accommodate the growing ZFS pool as files are removed from it, the process should keep going. Eventually the LVM filesystem would be empty apart from the ZFS backing files themselves, which could then be replaced with physical partitions.
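That final swap-over could be done one vdev at a time with `zpool replace` (pool, file, and device names here are hypothetical; each replacement triggers a resilver, and the target device must be at least as large as the file it replaces):

```shell
# Needs root; swap each file-backed vdev for a real partition, one at a time.
zpool replace tank /mnt/lvm/vdev0.img /dev/sda2
zpool status tank    # wait for the resilver to finish before the next swap
zpool replace tank /mnt/lvm/vdev1.img /dev/sdb2
```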

I expect that there would be a serious performance impact due to fragmentation at the very least.

This sounds totally nuts on the surface: why copy files into another container on the same filesystem? But I believe this data set would benefit from switching on deduplication and compression, so I could potentially end up with a lot more free space. The main goal, though, is gaining the flexibility of having the data in a ZFS pool whose 'drives' can be upgraded, i.e. replaced with physical hardware.
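Enabling those properties is a one-liner each ('tank' is a hypothetical pool name; needs root). One caveat worth knowing before counting on the space savings: both properties only affect data written after they are set, and deduplication keeps its table in RAM/ARC, which can get expensive on a small NAS box.

```shell
zfs set compression=lz4 tank   # cheap, almost always a win
zfs set dedup=on tank          # costly: the dedup table must fit in RAM/ARC
zfs get compression,dedup tank
```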

This crazy question was inspired by a blog post on converting the RAID level of an array using a loopback device as an interim hard-drive replacement.

Best Answer

Putting your zpool in files on an existing filesystem means you're relying on that filesystem to provide consistency (which sounds dangerous at best), and also that ZFS can't take good advantage of caching.

I'm not sure how well ZFS would handle the transfer from files to physical devices. The filesystem itself probably wouldn't have any real complaints, but you might run into things like it refusing to move vdevs onto smaller devices (from what I've read, a number of people have been bitten by this after setting autoexpand=on, so you might want to be careful with that property and its cousin autoreplace).

Alternatively, you'd be running ZFS on top of LVM, which is probably possible but doesn't let ZFS handle the devices intelligently, since it would only see one huge device. Remember that ZFS is not just a filesystem; it's a volume manager as well, so it properly replaces both the regular filesystem and LVM. Many of its features, including metadata placement on multiple disks and multiple copies of data for redundancy within a zpool, work best when ZFS has a good idea of the physical storage layout.
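If you do go the file-backed route, it may be worth pinning those two pool properties down explicitly before any replacements, so nothing grows or swaps behind your back ('tank' is a hypothetical pool name; needs root):

```shell
zpool set autoexpand=off tank
zpool set autoreplace=off tank
zpool get autoexpand,autoreplace tank
```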

I've been considering migrating to ZFS as well, and the best option I've come up with involves one more hard disk. Install another hard disk that is at least as large as the smallest physical drive currently in the array, create a ZFS pool and filesystem on it (configured as JBOD, but with only one device), and move as much onto it as you can. (Since I'm not running LVM, I'd move everything from the smallest drive onto the ZFS filesystem.) Shrink the LVM array by removing one physical disk, expand the zpool onto that now-free disk, move some more files, and rinse and repeat until done. With clever use of symlinks, or good handling of exports, you may even be able to keep the process transparent for anyone using files on that NAS box of yours in the meantime.
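One iteration of that shuffle might look like this on the LVM side (all names are illustrative, everything needs root, and it assumes the filesystem on the logical volume has already been shrunk enough that its extents fit on the remaining physical volumes):

```shell
# 1. Empty one LVM physical volume and drop it from the volume group.
pvmove /dev/sdc1          # migrate its extents onto the remaining PVs
vgreduce myvg /dev/sdc1
pvremove /dev/sdc1

# 2. Give the freed disk to the pool. Note this stripes: there is no
#    redundancy until you later attach mirrors or rebuild as raidz.
zpool add tank /dev/sdc1

# 3. Move another batch of files from the LVM filesystem into the pool,
#    then repeat with the next disk that can be emptied.
```

The `zpool add` step is one-way — you can't easily remove a top-level vdev from a pool later — so it's worth deciding on the final pool layout before starting the loop.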
