MDADM – how to reassemble RAID-5 (reporting device or resource busy)

mdadm

I am rather new to the Linux scene, and don't have nearly enough experience to consider myself somebody who can be trusted with the system 😛

Anyhow, long story short: I decided to use Linux RAID 5 as I consider it more stable than getting it to run on Windows.
The RAID recently failed to mount, and I am rather sure it encountered an issue while trying to rebuild.

Trying to assemble the array now, mdadm keeps reporting "device or resource busy", and yet it's not mounted or busy with anything to my knowledge. Google reported that dmraid is a possible culprit, but trying to remove it shows it is not installed.

The system is a 12-drive RAID-5, but it seems two of the drives do not have the correct superblock data.

I have included output from most of the common commands below


cat /proc/mdstat

erwin@erwin-ubuntu:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdd1[10](S) sde1[2](S) sdf1[11](S) sdg1[6](S) sdm1[4](S) sdl1[9](S) sdk1[5](S) sdj1[7](S) sdi1[13](S) sdc1[8](S) sdb1[0](S) sda1[3](S)
     11721120064 blocks

unused devices: <none>

detail mdadm


erwin@erwin-ubuntu:~$ sudo mdadm --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
erwin@erwin-ubuntu:~$

mdadm examine

Strange part noted – I am not sure why, but system drive usually was sda – now all of a sudden it is sdh – and nope, I didn't move any physical wiring?


erwin@erwin-ubuntu:~$ sudo mdadm --examine /dev/sd*1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1bcd - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       97        3      active sync   /dev/sdg1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdb1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1bd7 - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8      113        0      active sync   /dev/sdh1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1bf7 - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     8       8      129        8      active sync   /dev/sdi1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdd1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1c0b - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this    10       8      145       10      active sync   /dev/sdj1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sde1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 08:05:07 2011
          State : clean
 Active Devices : 11
Working Devices : 12
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 3597cbb - correct
         Events : 74284

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8      161        2      active sync   /dev/sdk1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8       17       12      spare   /dev/sdb1
/dev/sdf1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1c2d - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this    11       8      177       11      active sync   /dev/sdl1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdg1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1c33 - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     6       8      193        6      active sync   /dev/sdm1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
mdadm: No md superblock detected on /dev/sdh1.
/dev/sdi1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1b8b - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this    13       8       17       13      spare   /dev/sdb1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdj1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1b95 - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     7       8       33        7      active sync   /dev/sdc1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdk1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1ba1 - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       49        5      active sync   /dev/sdd1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdl1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1bb9 - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     9       8       65        9      active sync   /dev/sde1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1
/dev/sdm1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7964c122:1ec1e9ff:efb010e8:fc8e0ce0 (local to host erwin-ubuntu)
  Creation Time : Sun Oct 10 11:54:54 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 10744359296 (10246.62 GiB 11002.22 GB)
   Raid Devices : 12
  Total Devices : 12
Preferred Minor : 0

    Update Time : Mon Dec  5 19:24:00 2011
          State : clean
 Active Devices : 10
Working Devices : 11
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 35a1bbf - correct
         Events : 74295

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       81        4      active sync   /dev/sdf1

   0     0       8      113        0      active sync   /dev/sdh1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       97        3      active sync   /dev/sdg1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       49        5      active sync   /dev/sdd1
   6     6       8      193        6      active sync   /dev/sdm1
   7     7       8       33        7      active sync   /dev/sdc1
   8     8       8      129        8      active sync   /dev/sdi1
   9     9       8       65        9      active sync   /dev/sde1
  10    10       8      145       10      active sync   /dev/sdj1
  11    11       8      177       11      active sync   /dev/sdl1
  12    12       8      161       12      faulty   /dev/sdk1

mdadm --assemble --scan --verbose
(Capture truncated to save characters. As noted in the edit, resource busy was solved by stopping the array first; yes, as simple as that.)


erwin@erwin-ubuntu:~$ sudo mdadm --assemble --scan --verbose
mdadm: looking for devices for /dev/md0
mdadm: cannot open device /dev/sdm1: Device or resource busy
mdadm: /dev/sdm1 has wrong uuid.

My feeling is that I need to probably zero the superblock on the two faulty drives (since the one drive is shown as a spare, and the other well – disk number just doesn't match?) – then it needs to be reassembled, but I'm not sure what to do about the resource busy.

I don't want to take unnecessary, possibly data-damaging steps, so any advice will be greatly appreciated.

Update 1

derobert suggested to stop the array and then to reassemble it
😀
Yay, resource busy has been fixed, but it still seems that two drives are not co-operating. I am guessing a manual assemble/recreate is in order?

Any ideas welcome for the next step?

Latest output from mdadm assemble listed below:

erwin@erwin-ubuntu:~$ sudo mdadm --assemble --scan --verbose
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/sdm
mdadm: /dev/sdm has wrong uuid.
mdadm: no RAID superblock on /dev/sdl
mdadm: /dev/sdl has wrong uuid.
mdadm: no RAID superblock on /dev/sdk
mdadm: /dev/sdk has wrong uuid.
mdadm: no RAID superblock on /dev/sdj
mdadm: /dev/sdj has wrong uuid.
mdadm: no RAID superblock on /dev/sdi
mdadm: /dev/sdi has wrong uuid.
mdadm: cannot open device /dev/sdh6: Device or resource busy
mdadm: /dev/sdh6 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh5
mdadm: /dev/sdh5 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh4
mdadm: /dev/sdh4 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh3
mdadm: /dev/sdh3 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh2
mdadm: /dev/sdh2 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh1
mdadm: /dev/sdh1 has wrong uuid.
mdadm: cannot open device /dev/sdh: Device or resource busy
mdadm: /dev/sdh has wrong uuid.
mdadm: no RAID superblock on /dev/sdg
mdadm: /dev/sdg has wrong uuid.
mdadm: no RAID superblock on /dev/sdf
mdadm: /dev/sdf has wrong uuid.
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has wrong uuid.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sdm1 is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdl1 is identified as a member of /dev/md0, slot 9.
mdadm: /dev/sdk1 is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdj1 is identified as a member of /dev/md0, slot 7.
mdadm: /dev/sdi1 is identified as a member of /dev/md0, slot 13.
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 6.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 11.
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 10.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 8.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 3.
mdadm: no uptodate device for slot 1 of /dev/md0
mdadm: added /dev/sde1 to /dev/md0 as 2
mdadm: added /dev/sda1 to /dev/md0 as 3
mdadm: added /dev/sdm1 to /dev/md0 as 4
mdadm: added /dev/sdk1 to /dev/md0 as 5
mdadm: added /dev/sdg1 to /dev/md0 as 6
mdadm: added /dev/sdj1 to /dev/md0 as 7
mdadm: added /dev/sdc1 to /dev/md0 as 8
mdadm: added /dev/sdl1 to /dev/md0 as 9
mdadm: added /dev/sdd1 to /dev/md0 as 10
mdadm: added /dev/sdf1 to /dev/md0 as 11
mdadm: added /dev/sdi1 to /dev/md0 as 13
mdadm: added /dev/sdb1 to /dev/md0 as 0
mdadm: /dev/md0 assembled from 10 drives and 1 spare - not enough to start the array.

Best Answer

First off, drive re-lettering just happens sometimes, depending on how your machine is set up. Drive letters aren't expected to be stable over reboots since, ummm, a while. So it isn't a huge concern that your drive moved on you.

Assuming dmraid and device-mapper aren't using your devices:

Well, mdadm --stop /dev/md0 might take care of your busy messages; I think that's why it's complaining. Then you can try your assemble line again. If it doesn't work, --stop again, followed by assemble with --run (without --run, --assemble --scan won't start a degraded array). Then you can remove and re-add your failed disk to let it attempt a rebuild.
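The stop-then-assemble sequence above can be written down as a dry-run plan. This is only a sketch: the function name `stop_then_assemble_plan` is made up for illustration, it assumes the array is /dev/md0 as in the question, and it only prints the commands rather than running them.

```python
import shlex


def stop_then_assemble_plan(md_device="/dev/md0"):
    """Return the command sequence as argv lists. Nothing is executed."""
    return [
        # Release the member devices held by the inactive array.
        ["mdadm", "--stop", md_device],
        # First attempt: a normal scan-based assemble.
        ["mdadm", "--assemble", "--scan", "--verbose"],
        # Fallback, only if the first attempt does not start the array:
        # stop again, then allow a degraded start with --run.
        ["mdadm", "--stop", md_device],
        ["mdadm", "--assemble", "--scan", "--run", "--verbose"],
    ]


if __name__ == "__main__":
    for argv in stop_then_assemble_plan():
        print("sudo " + shlex.join(argv))
```

Printing the plan first makes it easy to review each step before typing anything against a degraded array.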

/dev/sde is outdated (look at the events counter). The others look OK at first glance, so I think you actually have a pretty good chance of no difficulties.
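Comparing the Events counters by eye across twelve superblocks is error-prone, so here is a rough sketch that pulls them out of saved `mdadm --examine` text. The function names and the shortened `SAMPLE` string are illustrative assumptions; the two counter values are the ones from the question's output.

```python
import re


def events_by_device(examine_text):
    """Map each device header (e.g. '/dev/sda1') to its Events counter."""
    events = {}
    current = None
    for line in examine_text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = header.group(1)
            continue
        counter = re.match(r"^\s*Events\s*:\s*(\d+)\s*$", line)
        if counter and current is not None:
            events[current] = int(counter.group(1))
    return events


def outdated_members(events):
    """Devices whose counter is below the highest counter seen."""
    if not events:
        return []
    newest = max(events.values())
    return sorted(dev for dev, count in events.items() if count < newest)


# Shortened stand-in for the full --examine output in the question.
SAMPLE = """\
/dev/sda1:
         Events : 74295
/dev/sde1:
         Events : 74284
/dev/sdf1:
         Events : 74295
"""

if __name__ == "__main__":
    print(outdated_members(events_by_device(SAMPLE)))  # ['/dev/sde1']
```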

You shouldn't zero any superblocks yet. Way too high risk of data loss. If --run doesn't work, I think you're going to want to find someone locally (or who can ssh in) who knows what he/she is doing to attempt to fix.

In response to Update 1

That "not enough to start the array" is never a good message to get from mdadm. What it means is that mdadm has found 10 drives out of your 12-drive RAID5 array, and as I hope you're aware RAID5 can only survive one failure, not two.
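To make the arithmetic explicit, here is a small sketch of the member-count rule. The helper names are invented for illustration; the numbers (12 raid devices, 10 current members, a Used Dev Size of 976759936 KiB) come from the output in the question.

```python
# How many members each level can lose and still run.
TOLERATED_FAILURES = {"raid5": 1, "raid6": 2}


def can_start(level, raid_devices, in_sync_members):
    """True if enough up-to-date members are present to run the array."""
    return in_sync_members >= raid_devices - TOLERATED_FAILURES[level]


def raid5_capacity_kib(raid_devices, used_dev_size_kib):
    """RAID 5 spends one member's worth of space on parity."""
    return (raid_devices - 1) * used_dev_size_kib


if __name__ == "__main__":
    print(can_start("raid5", 12, 10))          # False: two members short
    print(can_start("raid5", 12, 11))          # True: degraded but runnable
    print(raid5_capacity_kib(12, 976759936))   # 10744359296
```

The last figure matches the "Array Size" line in the superblocks, which is a quick sanity check that the array really is 12 data-plus-parity members and the spare is extra.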

Well, let's try and piece together what happened. First, over reboot, there was a drive letter change, which is annoying for us trying to figure it out, but mdraid doesn't care about that. Reading through your mdadm output, here is the remap that happened (sorted by the raid disk #):

00 sdh1 -> sdb1
02 sdk1 -> sde1 [OUTDATED]
03 sdg1 -> sda1
04 sdf1 -> sdm1
05 sdd1 -> sdk1
06 sdm1 -> sdg1
07 sdc1 -> sdj1
08 sdi1 -> sdc1
09 sde1 -> sdl1
10 sdj1 -> sdd1
11 sdl1 -> sdf1
13 sdb1 -> sdi1 [SPARE]
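A table like the one above can be derived mechanically: each `/dev/...:` header in the `--examine` output is the current name, and the "this" row in that superblock records the slot and the name the device had when the superblock was last written. The sketch below assumes the 0.90-style layout shown in the question; the function name and the trimmed `SAMPLE` are illustrative only.

```python
import re

# "this  <Number> <Major> <Minor> <RaidDevice> <State...> /dev/<old name>"
THIS_ROW = re.compile(
    r"^\s*this\s+\d+\s+\d+\s+\d+\s+(\d+)\s+(.*?)\s+(/dev/\S+)\s*$"
)


def remap_from_examine(examine_text):
    """Return {slot: (old_name, current_name, state)}."""
    remap = {}
    current = None
    for line in examine_text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = header.group(1)
            continue
        row = THIS_ROW.match(line)
        if row and current is not None:
            slot = int(row.group(1))
            remap[slot] = (row.group(3), current, row.group(2))
    return remap


# Three superblocks from the question, trimmed to the relevant rows.
SAMPLE = """\
/dev/sda1:
this     3       8       97        3      active sync   /dev/sdg1
/dev/sde1:
this     2       8      161        2      active sync   /dev/sdk1
/dev/sdi1:
this    13       8       17       13      spare   /dev/sdb1
"""

if __name__ == "__main__":
    for slot, (old, new, state) in sorted(remap_from_examine(SAMPLE).items()):
        print(f"{slot:02d} {old} -> {new} ({state})")
```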

#02 has a lower 'events' counter than the others. That means it left the array at some point.

It'd be nice if you know some of the history of this array—e.g., is "12-drive RAID5, 1 hot spare" correct?

I'm not quite sure what sequence of failures led up to this, though. It appears that at some point, device #1 failed, and a rebuild onto device #12 started.

But I can't make out exactly what happened next. Maybe you have logs—or an administrator to ask. Here is what I can't explain:

Somehow, #12 became #13. Somehow, #2 became #12.

So, that rebuild onto #12 should have finished and then #12 would be #1. Maybe it didn't—maybe it failed to rebuild for some reason. Then maybe #2 failed—or maybe #2 failed, is why the rebuild didn't finish, and someone tried removing and re-adding #2? That might make it #12. Then maybe removed and re-added the spare, making it #13.

Ok, but of course, at this point, you'd had a two-disk failure. Ok. That makes sense.

If this is what has happened, you've suffered a two-disk failure. That means you've lost data. What you do next, depends on how important that data is (considering also how good your backups are).

If the data is very valuable (and you don't have good backups), contact data recovery specialists. Otherwise:

If the data is valuable enough, you should use dd to image all the disks involved (to save money, you can store several images as files on larger disks, for example 2 or 3 TB externals). Then make a copy of the images, and work on recovering from that copy (you can use loop devices to do this).

Obtain more spares. Probably, you have one dead disk. You have at least a few questionable disks—smartctl may be able to tell you more.

Next, add --force to your --assemble line. This will make mdadm use the outdated disk anyway, which means some sectors will hold outdated data and some won't. Add one of those new disks as a spare and let the rebuild finish. Hopefully you don't hit any bad blocks (which would cause the rebuild to fail, and I believe the only answer is to make the disk map them out). Next, fsck -f the filesystem on the array. There will probably be errors. Once they're fixed, mount it and see what shape your data is in.

Recommendations

In the future, do not build 12-disk RAID5s. The probability of a two-disk failure is too high. Use RAID6 or RAID10 instead. Also, make sure to routinely scrub your arrays for bad blocks (echo check > /sys/block/md0/md/sync_action).
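For a scheduled scrub, the same sysfs write can be scripted. This is a minimal sketch, assuming the array is md0 and that the md attributes live under /sys/block/md0/md/ (sync_action and mismatch_cnt); the function names and the `sysfs_root` parameter are made up here so the code can be pointed at a test directory. Against the real /sys it needs root.

```python
from pathlib import Path


def start_check(md_name="md0", sysfs_root="/sys/block"):
    """Ask the md driver to start a consistency check of the array."""
    action = Path(sysfs_root) / md_name / "md" / "sync_action"
    action.write_text("check\n")
    return action


def mismatch_count(md_name="md0", sysfs_root="/sys/block"):
    """Read the mismatch counter the driver reports for the last check."""
    counter = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    return int(counter.read_text().strip())
```

Progress of a running check shows up in /proc/mdstat, the same file quoted at the top of the question.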
