First check the disks by running a SMART self-test on each one:
for i in a b c d; do
smartctl -s on -t long /dev/sd$i
done
It might take a few hours to finish, but you can check each drive's test status every few minutes, e.g.
smartctl -l selftest /dev/sda
If the status of a disk reports "not completed because of read errors", then that disk should be considered unsafe for md1 reassembly. After the self-tests finish, you can start trying to reassemble your array. Optionally, if you want to be extra cautious, move the disks to another machine before continuing (just in case of bad RAM/controller/etc.).
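To avoid checking each drive by hand, a quick loop over all four could look like the sketch below (it assumes the same sda-sdd device names as above; adjust for your system):

```shell
# Print the most recent self-test log entry for each suspect drive.
# Device names sda-sdd are an assumption; adjust to match your machine.
for i in a b c d; do
    echo "=== /dev/sd$i ==="
    smartctl -l selftest "/dev/sd$i" | grep -m1 'Extended offline'
done
```

A result of "Completed without error" is what you want to see; anything mentioning read failure means the drive is suspect.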
Recently, I had a case exactly like this one. One drive failed, I re-added it to the array, but during the rebuild 3 of the 4 drives failed altogether. The contents of /proc/mdstat were the same as yours (maybe not in the same order):
md1 : inactive sdc2[2](S) sdd2[4](S) sdb2[1](S) sda2[0](S)
But I was lucky and reassembled the array with this
mdadm --assemble /dev/md1 --scan --force
By looking at the --examine output you provided, I can tell the following scenario happened: sdd2 failed, you removed it and re-added it, so it became a spare drive trying to rebuild. But while it was rebuilding, sda2 failed and then sdb2 failed. So the event counter is highest on sdc2 and sdd2, which were the last active drives in the array (although sdd never got the chance to rebuild, so it is the most outdated of all). Because of the differences in the event counters, --force will be necessary. So you could also try this:
mdadm --assemble /dev/md1 /dev/sd[abc]2 --force
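Before running the forced assembly, it's worth comparing the event counters yourself, so you can confirm which drives are most in sync. A sketch (partition names taken from the layout above):

```shell
# Show the event counter and last update time recorded in each
# member's superblock; the drives with the highest event counts
# hold the most recent state of the array.
for d in /dev/sd[abcd]2; do
    echo "=== $d ==="
    mdadm --examine "$d" | grep -E 'Events|Update Time'
done
```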
To conclude, I think that if the above command fails, you should try to recreate the array like this:
mdadm --create /dev/md1 --assume-clean -l5 -n4 -c64 /dev/sd[abc]2 missing
If you do the --create, the missing part is important. Don't try to add a fourth drive to the array, because then reconstruction will begin and you will lose your data. Creating the array with a missing drive will not change its contents, and you'll have the chance to get a copy elsewhere (RAID 5 doesn't work the same way as RAID 1).
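Once the array is up in its degraded state, a cautious next step is to mount it read-only and copy the data off before attempting anything else. A sketch (the mount point and backup destination are example paths, not from your setup):

```shell
# Mount the degraded array read-only so nothing on it can change,
# then copy the data to a safe location before any further repairs.
mkdir -p /mnt/md1
mount -o ro /dev/md1 /mnt/md1
rsync -a /mnt/md1/ /backup/md1-rescue/   # destination is an example path
```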
If that fails to bring the array up, try the solution (a Perl script) described here: Recreating an array
If you finally manage to bring the array up, the filesystem will be unclean and probably corrupted. If one disk fails during a rebuild, the array is expected to stop and freeze, not doing any writes to the other disks. In this case two disks failed; maybe the system was performing write requests that it wasn't able to complete, so there is a small chance you lost some data, but also a chance that you will never notice it :-)
edit: some clarification added.
The point of RAID with redundancy is that it will keep going as long as it can, but obviously it will detect errors that put it into a degraded mode, such as a failing disk. You can show the current status of an array with mdadm -D:
# mdadm -D /dev/md0
<snip>
0 8 5 0 active sync /dev/sda5
1 8 23 1 active sync /dev/sdb7
Furthermore, the return status of mdadm -D is nonzero if there is any problem, such as a failed component (1 indicates an error that the RAID mode compensates for, and 2 indicates a complete failure).
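That makes mdadm -D convenient in a cron job or monitoring script. A minimal sketch, assuming an array at /dev/md0:

```shell
#!/bin/sh
# Check the array and report according to mdadm's exit status:
# 0 = healthy, 1 = degraded but compensated, other = serious failure.
mdadm -D /dev/md0 >/dev/null 2>&1
status=$?
case $status in
    0) echo "md0: healthy" ;;
    1) echo "md0: degraded (RAID is compensating)" >&2 ;;
    *) echo "md0: serious problem (exit status $status)" >&2 ;;
esac
```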
You can also get a quick summary of all RAID device statuses by looking at /proc/mdstat. You can get information about a RAID device in /sys/class/block/md*/md/* as well; see Documentation/md.txt in the kernel documentation. Some /sys entries are writable too; for example, you can trigger a full check of md0 with echo check >/sys/class/block/md0/md/sync_action.
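After such a check finishes, you can see whether any inconsistencies were found by reading mismatch_cnt from the same /sys directory. A sketch (md0 assumed, as above):

```shell
# Start a consistency check, wait for it to finish, then report
# how many mismatched sectors were found (0 means all consistent).
echo check > /sys/class/block/md0/md/sync_action
while [ "$(cat /sys/class/block/md0/md/sync_action)" != "idle" ]; do
    sleep 60
done
cat /sys/class/block/md0/md/mismatch_cnt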
In addition to these spot checks, mdadm can notify you as soon as something bad happens. Make sure that you have MAILADDR root in /etc/mdadm.conf (some distributions, e.g. Debian, set this up automatically). Then you will receive an email notification as soon as an error (a degraded array) occurs.
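mdadm's monitor mode can also run as a standalone daemon that polls the arrays and sends these alerts itself; many distributions start it for you, but it can be launched by hand like this (delay and recipient are example values):

```shell
# Run mdadm in monitor mode as a daemon: poll all arrays every
# 300 seconds and mail alerts to root when something goes wrong.
mdadm --monitor --scan --daemonise --delay=300 --mail=root
```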
Make sure that you do receive mail sent to root on the local machine (some modern distributions omit this, because they consider that all email goes through external providers; but receiving local mail is necessary for any serious system administrator). Test this by sending root a mail: echo hello | mail -s test root@localhost. Usually, a proper email setup requires two things:
- Run an MTA on your local machine. The MTA must be set up at least to allow local mail delivery. All distributions come with suitable MTAs; pick any (but not nullmailer if you want the email to be delivered locally).
- Redirect mail going to system accounts (at least root) to an address that you read regularly. This can be your account on the local machine, or an external email address. With most MTAs, the address can be configured in /etc/aliases; you should have a line like
root: djsmiley2k
for local delivery, or
root: djsmiley2k@mail-provider.example.com
for remote delivery. If you choose remote delivery, make sure that your MTA is configured for that. Depending on your MTA, you may need to run the newaliases command after editing /etc/aliases.
I'm sorry, but you've just hit the very common problem known as the "write hole". In short, you do not have any chance to recover your array. More information on Wikipedia: http://en.wikipedia.org/wiki/RAID_5_write_hole
Expensive RAID controllers are equipped with batteries to avoid this problem.
I hope you have a backup; that's your last chance.