Have a look at this question; I assume it is similar to your problem.
Recreating and even syncing a RAID-1 should not destroy data. Obviously the MD device now starts at a different offset, so where `mount` looks for a filesystem superblock there is ordinary data instead. This can have happened in at least two ways:
- You (or rather: the default setting) have created the new array with a different superblock format (see `--metadata` in `man mdadm`). Thus the superblock is now in a different position (or has a different size). Do you happen to know what the old metadata format was?
- The offset has changed even with the same format, due to a different default data offset. See `mdadm --examine /dev/sdb1` (add the output to your question).
You should look for a superblock in the first area of the disks (/dev/sdb1). Maybe this can be done with parted or similar tools. You may have to delete the respective partitions for that, though (no problem, as you can easily back up and restore the partition table).
Or you create loop devices / DM devices with increasing offsets (not necessarily over the whole disk, a few MiB are enough) and try `dumpe2fs -h` on them. If you want to do this but don't know how, then I can provide some shell code for that.
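Something along these lines (a sketch; the device name and the 8 MiB scan range are assumptions):

```bash
# Sketch: probe for an ext2/3/4 superblock at increasing offsets by creating
# read-only loop devices and asking dumpe2fs whether it sees a superblock.
DEV=/dev/sdb1
for off_kib in $(seq 0 4 8192); do      # first 8 MiB in 4 KiB steps
    loop=$(losetup --show -f -r -o $((off_kib * 1024)) "$DEV")
    if dumpe2fs -h "$loop" >/dev/null 2>&1; then
        echo "possible superblock at offset ${off_kib} KiB"
    fi
    losetup -d "$loop"
done
```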
The worst case would be that the new MD superblock has overwritten the file system superblock. In that case you may search for superblock copies (see the output of `mke2fs`). A `mke2fs` run on a dummy device of the same size may tell you the positions of the superblock copies.
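For example (a sketch; the 4 KiB block size is an assumption):

```bash
# Sketch: create a sparse dummy file of the same size as the old filesystem
# and let mke2fs print where the superblock copies would be.
truncate -s 1000202174464 /tmp/dummy.img
mke2fs -n -F -b 4096 /tmp/dummy.img   # -n: dry run, -F: operate on a plain file
# Look for the "Superblock backups stored on blocks:" line in the output.
```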
Edit 1:
Now I have read and understood your `dumpe2fs` output. Your old RAID-1 had its superblock at the end (0.9 or 1.0). Now you probably have 1.2, so a part of your file system has been overwritten. I cannot assess how big the damage may be; this is a case for `e2fsck`. But first you should reset the RAID to its old metadata type. It would help to know the old version.
You can reduce the risk by putting DM devices over the complete `/dev/sdb1` and `/dev/sdc1`, creating snapshots of them (with `dmsetup` directly), and creating the new array over the snapshots. That way the relevant parts of your disks are not written to. From the `dumpe2fs` output we know that the MD device must be 1000202174464 bytes in size; this should be checked immediately after a test creation.
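A sketch of the snapshot setup (the COW file size and paths are assumptions):

```bash
# Sketch: wrap /dev/sdb1 in a device-mapper snapshot so that a test
# "mdadm --create" writes into a copy-on-write file instead of the disk.
SIZE=$(blockdev --getsz /dev/sdb1)    # device size in 512-byte sectors
truncate -s 1G /tmp/cow-sdb1.img      # COW store (assumed big enough)
COW=$(losetup --show -f /tmp/cow-sdb1.img)
dmsetup create sdb1-snap --table "0 $SIZE snapshot /dev/sdb1 $COW N 8"
# Repeat for /dev/sdc1, then run mdadm --create on the /dev/mapper/*-snap devices.
```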
I hate to be the bearer of bad news, but...
Q: I'm new to mdadm, did I do everything correctly?
A: No. In fact, you did just about everything in the most destructive way possible. You used `--create` to destroy the array metadata, instead of using `--assemble`, which probably would have allowed you to read the data (at least, to the extent the disk is capable of doing so). In doing so, you have lost critical metadata (in particular, the disk order, data offset, and chunk size).
In addition, `--create` may have scribbled array metadata on top of critical filesystem structures.
Finally, in your step (3), I see that mdadm is complaining of RAID1 on both disks—I'm hoping that's from you trying (2) on both disks, individually. I sincerely hope you didn't let RAID1 start trying to sync the disks (say, had you added both to the same RAID1 array).
What to do now
It seems like you've finally created images of the drives. You ought to have done this first, at least before trying anything beyond a basic `--assemble`. But anyway:
If the image of the bad drive missed most/all sectors, determine if professional data recovery is worthwhile. Files (and filesystem metadata) are split across drives in RAID0, so you really need both to recover. Professional recovery will probably be able to read the drive.
If the image is mostly OK, except for a few sectors, continue.
Make a copy of the image files. Only work on the copies of the image files. I cannot emphasize this enough: you will likely be destroying these copies several times, and you need to be able to start over. And you don't want to have to image the disks again, especially since one is failing!
To answer one of your other questions:
Q: Why I cannot use the partition images to create an array?
A: To assemble (or create) an array from image files, you need to use a loopback device. You attach an image to a loopback device using `losetup`. Read the manpage, but it'll be something along the lines of `losetup --show -f /path/to/COPY-of-image`. Now, you use `mdadm` on the loop devices (e.g., `/dev/loop0`).
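A minimal sketch (the image paths are placeholders):

```bash
# Attach copies of both images to loop devices; losetup prints the device name.
LOOP_A=$(losetup --show -f /path/to/COPY-of-image-disk1)
LOOP_B=$(losetup --show -f /path/to/COPY-of-image-disk2)
echo "working on $LOOP_A and $LOOP_B"
# These loop devices now stand in for the disks in all mdadm commands below.
```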
Determine the original array layout
You need to find out all the mdadm options that were originally used to create the array (since you destroyed that metadata with `--create` earlier). You then get to run `--create` on the two loopback devices with those options, exactly. You need to figure out the metadata version (`-e`), the RAID level (`-l`, appears to be 0), the chunk size (`-c`), the number of devices (`-n`, should be 2), and the exact order of the devices.
The easiest way to get these is to get two new disks, put them in the NAS, and have the NAS create a new array on them, preferably with the same NAS firmware version as originally used. In other words, repeat the initial setup. Then pull the disks out and use `mdadm -E` on one of the members. Here is an example from a RAID10 array, so slightly different; I've omitted a bunch of lines to highlight the ones you need:
```
     Version : 1.0               # -e
  Raid Level : raid10            # -l
Raid Devices : 4                 # -n
  Chunk Size : 512K              # -c
 Device Role : Active device 0   # gets you the device order
 Array State : AAAA ('A' == active, '.' == missing)
```
NOTE: I'm going to assume you're using ext2/3/4 here; if not, use the appropriate utilities for the filesystem the NAS actually used.
Attempt a create (on the loopback devices) with those options. See if `e2fsck -n` even recognizes it. If not, stop the array and create it again with the devices in the other order. Try `e2fsck -n` again.
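A sketch of one create-and-check round, using the loop devices from above (every option value here is an example; substitute what `mdadm -E` reported):

```bash
# Test-create the array on the loop devices, then probe it without writing.
mdadm --create /dev/md0 -e 1.0 -l 0 -c 512 -n 2 "$LOOP_A" "$LOOP_B"
e2fsck -n /dev/md0                  # -n: read-only check, answer "no" to all
# Wrong order? Stop and try the devices swapped:
mdadm --stop /dev/md0
mdadm --create /dev/md0 -e 1.0 -l 0 -c 512 -n 2 "$LOOP_B" "$LOOP_A"
e2fsck -n /dev/md0
```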
If neither works, you should go back to the order you think is right and try a backup superblock. The `e2fsck` manpage tells you what number to use; you almost certainly have a 4K block size. If none of the backup superblocks work, stop the array and try the other disk order. If that doesn't work, you probably have the wrong `--create` options; start over with a new copy of the images and try some different options. I'd try different metadata versions first.
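For example (assuming a 4K block size, where the first backup superblock is typically at block 32768):

```bash
# Point e2fsck at a backup superblock; -B gives the block size.
e2fsck -n -b 32768 -B 4096 /dev/md0
```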
Once you get `e2fsck` to run, see how badly damaged the filesystem is. If it's completely trashed, that may mean you have the wrong chunk size (stop and re-create the array to try some more).
Copy the data off.
I suggest letting e2fsck try to fix the filesystem. This does risk destroying the filesystem, but, well, that's why you're working on copies! Then you can mount it, and copy the data off. Keep in mind that some of the data is likely corrupted, and that corruption may be hidden (e.g., a page of a document could have been replaced with NULLs).
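A sketch of that final sequence (the mount point and destination are placeholders):

```bash
e2fsck -y /dev/md0                   # -y: accept every proposed fix
mkdir -p /mnt/recovered
mount -o ro /dev/md0 /mnt/recovered  # mount read-only, just in case
rsync -a /mnt/recovered/ /safe/place/
```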
I can't get the original parameters from the NAS
Then you're in trouble. Your other options are to keep guessing until something finally works, or to learn enough about the on-disk formats to figure it out with a hex editor. There may be a utility or two out there to help with this; I don't know.
Alternatively, hire a data recovery firm.
Best Answer
You may try using PhotoRec with your corrupt image file. It can recover a lot of file types, not just photos as the name may imply.
I have used PhotoRec successfully even when I could no longer list the partitions from an image of a broken HDD.
http://www.cgsecurity.org/wiki/PhotoRec
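For what it's worth, PhotoRec can be pointed straight at an image file (a sketch; the path is a placeholder):

```bash
# PhotoRec ships with the TestDisk package; it scans for known file
# signatures and ignores the (broken) partition table and filesystem.
photorec /path/to/COPY-of-image
```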