RAID-like filesystem for a heterogeneous set of hard disks

filesystems redundancy storage

Over the last few years I have accumulated a fairly heterogeneous set of hard disks of different sizes and speeds for storing my private data. I'm planning to put them in a self-built Linux filer to reduce the hassle of replicating local data manually, reduce duplicated files, and make better use of the available resources. I also expect my storage needs to grow in the coming years, so it should be possible to add disks dynamically and to remove single drives in order to replace them with newer, larger ones.

If I'm not mistaken, the most common option for building a filer would be a (software) RAID to increase reliability, plus an additional backup scheme on external or removable drives to prevent accidental loss of important data.

Since RAID5/RAID6 effectively needs drives of equal size (a larger drive is only used up to the size of the smallest one), it doesn't satisfy my need for a more dynamic disk addition/removal scheme. So I'm looking for a FLOSS file system or block device abstraction layer that provides:

  • dynamic addition and removal of hard disks
  • replication/redundancy similar to RAID5, maybe adjustable per file or directory
  • no additional (especially dedicated) machines required (being able to add a second machine later would be nice, but is not necessary)
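To make the equal-size limitation concrete, here is a minimal sketch of the arithmetic for a classic RAID5 array, assuming every member is truncated to the size of the smallest disk and one member's worth of space goes to parity. The function names and the disk sizes are made up for illustration.

```python
def raid5_usable_gb(disk_sizes_gb):
    """Usable capacity of a classic RAID5 array built from the given disks.

    Assumption: each member contributes only as much space as the
    smallest disk, and one member's worth of space holds parity.
    """
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID5 needs at least three disks")
    smallest = min(disk_sizes_gb)
    return (len(disk_sizes_gb) - 1) * smallest


def raid5_unused_gb(disk_sizes_gb):
    """Space on the larger disks that the array cannot use at all."""
    smallest = min(disk_sizes_gb)
    return sum(size - smallest for size in disk_sizes_gb)


# Hypothetical mixed set of disks, sizes in GB.
disks = [500, 1000, 1500, 2000]
print(raid5_usable_gb(disks))  # 1500
print(raid5_unused_gb(disks))  # 3000
```

With this made-up set, 5000 GB of raw disk yields only 1500 GB of usable space, which is why a plain RAID5 over mismatched drives is unattractive here.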

I have looked a bit into distributed file systems like XtreemFS, but haven't yet found one that satisfies all of these points and works well on a single machine. Do you have an idea what could be a solution?

Best Answer

To put it simply: believe it or not, hardware solutions are easier to rebuild and more reliable than software solutions, and existing solutions are often less costly to implement.

Building your own storage system is generally not a good idea unless you are very well versed in the art of storage management. In general, the old dictum "If you have to ask, it's not for you" applies.

Bottom line: I would recommend buying new hard disks (these days they are cheap compared with the value of my data; if your data isn't worth buying a set of hard disks for, Houston, we've had a problem) and setting up a brand new RAID5/6 array on them, together with some cold replacement drives.

More importantly, note that backup and redundancy are two separate things in storage management. Redundancy keeps the service ONLINE when one single piece of equipment fails; backup is for when the whole thing fails. For example, if the power supply suddenly fails and tries to pump some 100V AC onto the 12V rail (not that it is likely, but...), never mind RAID5, never mind RAID6: all would be gone. You will need backups to handle those situations. For backups, follow these simple rules:

  1. Take backups offline
  2. Take backups offsite
  3. Do backups often
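As a rough illustration of rules 1 and 3, here is a minimal sketch that writes a timestamped archive of a directory to a separate destination, for example a removable drive that is unplugged and stored elsewhere afterwards. The function name `make_backup` and the paths in the usage comment are hypothetical; this is not a complete backup scheme, only one way the "often" part could be scripted.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dir, dest_dir):
    """Write a timestamped .tar.gz of source_dir into dest_dir.

    Returns the path of the archive that was created.
    """
    source = Path(source_dir).resolve()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # One archive per run, so earlier backups are never overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(str(archive), "w:gz") as tar:
        # Store paths relative to the source directory's own name.
        tar.add(str(source), arcname=source.name)
    return archive


# Example call with hypothetical paths (removable drive mounted at /mnt/backup):
# make_backup("/home/me/data", "/mnt/backup")
```

Running something like this from a scheduler covers the "often" rule; taking the drive offline and offsite remains a manual step.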

Good luck!
