I bought one and configured it as a RAID-1 device. After a short while, I also decided to update the firmware with a version based on OpenMSS.
Shortly after the warranty expired, one of the drives failed badly. The clicking coming out of it was pretty loud but, in a twisted way, also quite pleasant, somehow clicking along with Bob Marley's "Redemption Song". Anyway, I managed to replace the faulty drive and rebuild the array, and my file server has been living happily ever since... until yesterday.
It was either a power failure or a loose PSU connector, or both. As a result, the power light started flashing alternately green (once) and amber (once). I went to the diagnostics page only to find that my device was functioning "within normal parameters". Hmmm... that can't be right.
~ # mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat May 5 06:30:50 2007
     Raid Level : raid1
     Array Size : 487106752 (464.54 GiB 498.80 GB)
    Device Size : 487106752 (464.54 GiB 498.80 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Tue May 5 11:18:29 2009
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
           UUID : 7ff7415e:4719112d:d63dd33d:40ff685f
         Events : 0.515034
    Number   Major   Minor   RaidDevice State
       0       8       22        0      active sync   /dev/sdb6
       1       0        0        -      removed
What??? Removed??? How???
~ # mdadm --examine /dev/sda6
mdadm: cannot open /dev/sda6: No such file or directory
mdadm: cannot find device size for /dev/sda6: No such file or directory
Hmmm...
~ # ls /dev/sd*
/dev/sda   /dev/sda3  /dev/sda6  /dev/sdb1  /dev/sdb4  /dev/sdb7
/dev/sda1  /dev/sda4  /dev/sda7  /dev/sdb2  /dev/sdb5
/dev/sda2  /dev/sda5  /dev/sdb   /dev/sdb3  /dev/sdb6
~ # cat /proc/partitions
major minor  #blocks  name
   8    16  488386584 sdb
   8    17     257008 sdb1
   8    18     257040 sdb2
   8    19     257040 sdb3
   8    20          1 sdb4
   8    21     506016 sdb5
   8    22  487106833 sdb6
   8     0  488386584 sdc
   8     1     257008 sdc1
   8     2     257040 sdc2
   8     3     257040 sdc3
   8     4          1 sdc4
   8     5     506016 sdc5
   8     6  487106833 sdc6
  31     0        256 mtdblock0
   9     0  487106752 md0
How exactly did my sda partitions become sdc? Reboot? Yes, reboot!
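That by-eye comparison (names the kernel lists in /proc/partitions versus nodes that actually exist under /dev) can be scripted. This is only a rough sketch, assuming a POSIX shell with awk is available on the box; `missing_nodes` is a made-up helper name, not part of mdadm or the firmware.

```shell
# missing_nodes PARTITIONS_FILE DEV_DIR
# Prints every block device name listed in PARTITIONS_FILE that has no
# matching entry in DEV_DIR. (Hypothetical helper, for illustration only.)
missing_nodes() {
    # Data rows have four fields and start with a numeric major number;
    # this skips the header line and the blank line after it.
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ { print $4 }' "$1" |
    while read -r name; do
        [ -e "$2/$name" ] || echo "$name"
    done
}

# Typical use on the device itself:
#   missing_nodes /proc/partitions /dev
```

Against the listings above this would be expected to print sdc and sdc1 through sdc6, since none of them appear in the `ls /dev/sd*` output. Other names (such as mtdblock0) may show up too if their nodes live elsewhere.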
... [reboot] ...
~ # cat /proc/partitions
major minor  #blocks  name
   8     0  488386584 sda
   8     1     257008 sda1
   8     2     257040 sda2
   8     3     257040 sda3
   8     4          1 sda4
   8     5     506016 sda5
   8     6  487106833 sda6
   8    16  488386584 sdb
   8    17     257008 sdb1
   8    18     257040 sdb2
   8    19     257040 sdb3
   8    20          1 sdb4
   8    21     506016 sdb5
   8    22  487106833 sdb6
  31     0        256 mtdblock0
   9     0  487106752 md0
That's better, but how... ??? Anyway, let's check sda6.
~ # mdadm --query /dev/sda6
/dev/sda6: is not an md array
/dev/sda6: device 1 in 2 device mismatch raid1 md0. Use mdadm --examine for more detail.
~ # mdadm --examine /dev/sda6
/dev/sda6:
          Magic : a92b4efc
        Version : 00.90.01
           UUID : 7ff7415e:4719112d:d63dd33d:40ff685f
  Creation Time : Sat May 5 06:30:50 2007
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Update Time : Fri May 1 20:10:03 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 34c79134 - correct
         Events : 0.513042
      Number   Major   Minor   RaidDevice State
this     1       8        6        1      active sync   /dev/sda6
   0     0       8       22        0      active sync   /dev/sdb6
   1     1       8        6        1      active sync   /dev/sda6
Mismatched, as I would expect, but it's clean. Good.
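One visible sign of the mismatch is the Events counter: the running array reports 0.515034, while the superblock on sda6 stopped at 0.513042. The sketch below pulls that value out of mdadm output and works out the gap. It assumes the value is printed as two dot-separated integers (the upper and lower 32 bits of the counter), which is how this mdadm version appears to format it; `events_of` and `events_behind` are hypothetical helper names.

```shell
# events_of: read `mdadm --detail` or `mdadm --examine` text on stdin
# and print the value of its "Events :" line.
events_of() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# events_behind ARRAY_EVENTS MEMBER_EVENTS
# Prints how many events the member is behind the array, assuming both
# values have the form "hi.lo" (an assumption, see above).
events_behind() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, a, "."); split($2, m, ".")
        printf "%.0f\n", (a[1] - m[1]) * 4294967296 + (a[2] - m[2])
    }'
}

# Typical use on the device itself:
#   events_behind "$(mdadm --detail /dev/md0 | events_of)" \
#                 "$(mdadm --examine /dev/sda6 | events_of)"
```

With the two values from this post, `events_behind 0.515034 0.513042` gives 1992, so sda6 missed almost two thousand updates while it was out of the array.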
~ # mdadm --manage --add /dev/md0 /dev/sda6
mdadm: hot added /dev/sda6
~ # mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat May 5 06:30:50 2007
     Raid Level : raid1
     Array Size : 487106752 (464.54 GiB 498.80 GB)
    Device Size : 487106752 (464.54 GiB 498.80 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Tue May 5 11:22:02 2009
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1
 Rebuild Status : 0% complete
           UUID : 7ff7415e:4719112d:d63dd33d:40ff685f
         Events : 0.515210
    Number   Major   Minor   RaidDevice State
       0       8       22        0      active sync   /dev/sdb6
       1       0        0        -      removed
       2       8        6        1      spare rebuilding   /dev/sda6
Rebuilding. Good sign, but why do I still have device 1 - removed - in the list?
~ # cat /proc/mdstat
Personalities : [linear] [raid1]
md0 : active raid1 sda6[2] sdb6[0]
      487106752 blocks [2/1] [U_]
      [=>...................]  recovery =  9.8% (47870464/487106752) finish=114.8min speed=63713K/sec
unused devices: <none>
Under 2 hours to sync up. Time for coffee.
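The estimate is easy to sanity-check from the numbers on the recovery line: blocks still to copy, divided by the current speed. A small sketch, assuming the `(done/total)` pair and the `speed=` figure are both in the same 1K units and that awk is available; `eta_minutes` is a hypothetical helper name.

```shell
# eta_minutes: read /proc/mdstat text on stdin and print the estimated
# minutes remaining for each recovery or resync line found.
eta_minutes() {
    awk '
        /recovery =|resync =/ {
            for (i = 1; i <= NF; i++) {
                # "(47870464/487106752)" -> p[1] = done, p[2] = total
                if ($i ~ /^\([0-9]+\/[0-9]+\)$/) {
                    gsub(/[()]/, "", $i); split($i, p, "/")
                }
                # "speed=63713K/sec" -> s = 63713
                if ($i ~ /^speed=[0-9]+K\/sec$/) {
                    s = $i; gsub(/[^0-9]/, "", s)
                }
            }
            if (s + 0 > 0) printf "%.1f\n", (p[2] - p[1]) / s / 60
        }'
}

# Typical use on the device itself:
#   eta_minutes < /proc/mdstat
```

For the line above that is (487106752 - 47870464) / 63713 / 60, or roughly 115 minutes, in line with the kernel's own finish=114.8min. The speed fluctuates during a rebuild, so it is only a snapshot.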
... [coffee] ...
~ # cat /proc/mdstat
Personalities : [linear] [raid1]
md0 : active raid1 sda6[1] sdb6[0]
      487106752 blocks [2/2] [UU]
unused devices: <none>
~ # mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Sat May 5 06:30:50 2007
     Raid Level : raid1
     Array Size : 487106752 (464.54 GiB 498.80 GB)
    Device Size : 487106752 (464.54 GiB 498.80 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Tue May 5 14:05:25 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
           UUID : 7ff7415e:4719112d:d63dd33d:40ff685f
         Events : 0.515939
    Number   Major   Minor   RaidDevice State
       0       8       22        0      active sync   /dev/sdb6
       1       8        6        1      active sync   /dev/sda6
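Since the diagnostics page reported everything as normal while the array was degraded, a quick check of the member map in /proc/mdstat ([UU] versus [U_]) is a more trustworthy signal. A minimal sketch, assuming awk is available and that the status map sits on the "blocks" line as it does in the output here; `degraded_arrays` is a hypothetical helper name.

```shell
# degraded_arrays: read /proc/mdstat text on stdin and print the name of
# every md array whose member map contains "_" (a missing member).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                   # remember current array
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }  # e.g. [U_] or [_U]
    '
}

# Typical use on the device itself:
#   degraded_arrays < /proc/mdstat
```

It prints nothing when every array shows all members up, so it can be dropped into a cron job or a login script that only speaks up when something is wrong.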
One last reboot and we're back on track.
Sorted.