Just out of curiosity, I check whether there is a hacked firmware based on a more recent image, and I find one based on version 3.4.90, with SSH of course.
So here we go. Let's check the RAID device...
~ # mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Sat May 5 06:30:50 2007
Raid Level : raid1
Array Size : 487106752 (464.54 GiB 498.80 GB)
Device Size : 487106752 (464.54 GiB 498.80 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri May 22 15:20:30 2009
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 7ff7415e:4719112d:d63dd33d:40ff685f
Events : 0.525126
Number Major Minor RaidDevice State
0 0 0 - removed
1 8 6 1 active sync /dev/sda6
Again only one drive out of two.
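The same conclusion can be read mechanically from the `mdadm --detail` fields. The sketch below is only an illustration, meant to run on a separate machine with Python against a saved copy of the output; the helper names `parse_mdadm_detail` and `missing_members` are made up for this example and are not part of mdadm.

```python
# Excerpt of the `mdadm --detail /dev/md0` output shown above.
MDADM_DETAIL = """\
Raid Level : raid1
Raid Devices : 2
Total Devices : 1
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
"""


def parse_mdadm_detail(text):
    """Collect the 'Key : value' lines of the report into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip lines that are not key/value pairs
            fields[key.strip()] = value.strip()
    return fields


def missing_members(fields):
    """Number of array slots that currently have no active device."""
    return int(fields["Raid Devices"]) - int(fields["Active Devices"])


fields = parse_mdadm_detail(MDADM_DETAIL)
print(fields["State"], "- missing members:", missing_members(fields))
```

Comparing `Raid Devices` with `Active Devices` is just one way to spot the gap; the `degraded` keyword in `State` says the same thing.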
Let's see what happened to /dev/sdb.
~ # /usr/sbin/smartctl -l selftest /dev/sdb
smartctl version 5.1-14 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log, version number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended off-line Completed 00% 7474 -
# 2 Off-line Interrupted (host reset) 50% 7466 -
# 3 Off-line Interrupted (host reset) 50% 7379 -
# 4 Short off-line Completed: read failure 50% 7334 0x00032141
# 5 Off-line Interrupted (host reset) 00% 7334 -
# 6 Short off-line Completed 00% 7330 -
# 7 Off-line Interrupted (host reset) 00% 7330 -
# 8 Off-line Interrupted (host reset) 00% 5973 -
# 9 Off-line Interrupted (host reset) 00% 5396 -
#10 Off-line Interrupted (host reset) 00% 5393 -
#11 Off-line Interrupted (host reset) 00% 5376 -
#12 Short off-line Completed 00% 4687 -
#13 Off-line Interrupted (host reset) 00% 4687 -
#14 Off-line Interrupted (host reset) 00% 4003 -
#15 Off-line Interrupted (host reset) 00% 3819 -
#16 Short off-line Completed 00% 3659 -
#17 Short off-line Completed 00% 3659 -
#18 Short off-line Completed 00% 3655 -
#19 Off-line Interrupted (host reset) 70% 3652 -
#20 Short off-line Aborted by host 70% 3652 -
#21 Off-line Interrupted (host reset) 00% 3651 -
Ouch! LBA_of_first_error = 0x32141 (= 205121 in base 10)
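The conversion is easy to double-check, and the LBA can also be turned into a rough byte offset on the disk. This is a small sketch only: the 512-byte sector size is an assumption (the smartctl output does not state it), and `lba_to_offset` is a name invented for this example.

```python
SECTOR_SIZE = 512  # assumption: classic 512-byte sectors, not confirmed by the output above


def lba_to_offset(lba_hex, sector_size=SECTOR_SIZE):
    """Return (decimal LBA, byte offset from the start of the disk)."""
    lba = int(lba_hex, 16)  # base 16 accepts the leading "0x"
    return lba, lba * sector_size


lba, offset = lba_to_offset("0x00032141")
print(lba)                   # decimal LBA of the first read error
print(offset / (1024 ** 2))  # offset in MiB, under the sector-size assumption
```

Under that assumption the failing sector sits roughly 100 MiB into the drive, so near the start of the disk rather than in the large data partition.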
Let's check also the SMART attributes.
~ # /usr/sbin/smartctl -A /dev/sdb
smartctl version 5.1-14 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 32
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
3 Spin_Up_Time 0x0027 168 162 063 Pre-fail Always - 18676
4 Start_Stop_Count 0x0032 210 210 000 Old_age Always - 20884
5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 253 252 000 Old_age Always - 0
8 Seek_Time_Performance 0x0027 247 243 187 Pre-fail Always - 41160
9 Power_On_Hours 0x0032 232 232 000 Old_age Always - 7559
10 Spin_Retry_Count 0x002b 253 252 157 Pre-fail Always - 0
11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 253 253 000 Old_age Always - 76
189 Unknown_Attribute 0x003a 100 100 000 Old_age Always - 0
190 Unknown_Attribute 0x0022 056 039 000 Old_age Always - 959119404
192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 0
194 Temperature_Celsius 0x0032 046 253 000 Old_age Always - 44
195 Hardware_ECC_Recovered 0x000a 252 210 000 Old_age Always - 37129
196 Reallocated_Event_Count 0x0008 253 253 000 Old_age Offline - 0
197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline - 0
198 Offline_Uncorrectable 0x0008 253 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age Offline - 0
200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age Always - 0
202 Unknown_Attribute 0x000a 253 252 000 Old_age Always - 0
203 Unknown_Attribute 0x000b 253 252 180 Pre-fail Always - 11
204 Unknown_Attribute 0x000a 253 252 000 Old_age Always - 0
205 Unknown_Attribute 0x000a 253 252 000 Old_age Always - 0
207 Unknown_Attribute 0x002a 253 252 000 Old_age Always - 0
208 Unknown_Attribute 0x002a 253 252 000 Old_age Always - 0
210 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0
211 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0
212 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0
Not too bad after all, since
Current_Pending_Sector = 0
Offline_Uncorrectable = 0
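For anyone who would rather not eyeball that table, the check can be sketched in a few lines of Python run off-box on a saved copy of the output. The choice of attribute IDs to watch (5, 197, 198) is my own assumption about which counters matter most here, and `parse_smart_attributes` is a hypothetical helper, not something smartmontools provides.

```python
# A few rows copied from the `smartctl -A /dev/sdb` table above.
SMART_ROWS = """\
5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always - 0
195 Hardware_ECC_Recovered 0x000a 252 210 000 Old_age Always - 37129
197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline - 0
198 Offline_Uncorrectable 0x0008 253 253 000 Old_age Offline - 0
"""

WATCHED_IDS = (5, 197, 198)  # assumption: the counters that point at bad sectors


def parse_smart_attributes(text):
    """Map attribute ID -> (name, raw value) for rows in the 10-column layout."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 10 and parts[0].isdigit():
            attrs[int(parts[0])] = (parts[1], int(parts[9]))
    return attrs


attrs = parse_smart_attributes(SMART_ROWS)
# Keep only watched counters whose raw value is not zero.
suspicious = {i: attrs[i] for i in WATCHED_IDS if i in attrs and attrs[i][1] != 0}
print(suspicious or "no pending, reallocated or uncorrectable sectors reported")
```

The parser keys on the attribute ID rather than the name because several rows share the name `Unknown_Attribute`.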
Now let's find which partition has the problem.
~ # fsck.ext3 -nv /dev/sdb1
e2fsck 1.38 (30-Jun-2005)
/dev/sdb1: clean, 3045/64256 files, 24511/64252 blocks
~ # fsck.ext3 -nv /dev/sdb2
e2fsck 1.38 (30-Jun-2005)
Warning! /dev/sdb2 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/sdb2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (39452, counted=39449).
Fix? no
Free inodes count wrong (61113, counted=61112).
Fix? no
/dev/sdb2: ********** WARNING: Filesystem still has errors **********
2887 inodes used (4%)
13 non-contiguous inodes (0.5%)
# of inodes with ind/dind/tind blocks: 156/0/0
24548 blocks used (38%)
0 bad blocks
1 large file
2309 regular files
175 directories
47 character device files
40 block device files
0 fifos
8 links
308 symbolic links (308 fast symbolic links)
0 sockets
--------
2887 files
/dev/sdb3 is a swap partition, so we can skip that.
/dev/sdb4 is an extended partition, so we can skip that too.
~ # fsck.ext3 -nv /dev/sdb5
e2fsck 1.38 (30-Jun-2005)
Warning! /dev/sdb5 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/sdb5 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (118269, counted=118286).
Fix? no
Free inodes count wrong (126523, counted=126537).
Fix? no
/dev/sdb5: ********** WARNING: Filesystem still has errors **********
69 inodes used (0%)
4 non-contiguous inodes (5.8%)
# of inodes with ind/dind/tind blocks: 0/0/0
8235 blocks used (6%)
0 bad blocks
1 large file
32 regular files
14 directories
0 character device files
0 block device files
0 fifos
0 links
0 symbolic links (0 fast symbolic links)
0 sockets
--------
46 files
~ # fsck.ext3 -nv /dev/sdb6
e2fsck 1.38 (30-Jun-2005)
/dev/sdb6: clean, 93575/60899328 files, 56875141/121776688 blocks
So the errors are on /dev/sdb2 and /dev/sdb5.
Let's see where they mount to.
~ # mount | grep /sdb
/dev/sdb1 on /mnt/__mxo_sdb1 type ext3 (rw)
~ # cat /proc/mounts | grep /sdb
/dev/sdb5 /tmp ext3 rw 0 0
~ # cat /proc/cmdline
console=ttyS0,115200 root=/dev/sdb2 rw
Are we booting from /dev/sdb2?
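The kernel command line already hints at the answer, since its `root=` parameter names the root device. A minimal sketch of pulling that value out, assuming the command line has been copied to a machine with Python (`root_device` is a name made up for this example):

```python
def root_device(cmdline):
    """Return the value of the root= parameter, or None if it is absent."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


print(root_device("console=ttyS0,115200 root=/dev/sdb2 rw"))
```

This only reports what the kernel was told at boot; the boot partition setting itself lives in the Maxtor parameters queried next.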
~ # mxoparam -h
Maxtor mxoparam version 1.0
-a show all maxtor params
-b get wait for button status
-c [0-1] set wait for button 0 = Off 1 = On
-d show max number of drives
-e enable watchdog in uboot
-f disable watchdog in uboot
-g set led solid green
-h show help
-k kick watchdog
-p get boot partition
-q [part] set boot partition
0 = drive 0 partition 1
1 = drive 0 partition 2
2 = drive 1 partition 1
3 = drive 1 partition 2
-r reset partion fail count
-s get serial number
-t [sn] set serial number
-v show version
-x disable watchdog now
-w enable watchdog now
-y set led solid yellow
~ # mxoparam -p
Boot partition is 3
Looks like the system is booting from the second disk, second partition (/dev/sdb2). This means we can't unmount it, and we need to unmount it before we can fix it.
Therefore, we need to make the system boot from /dev/sda2, otherwise we won't be able to fix /dev/sdb*.
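The numbering in the mxoparam help text follows a simple pattern: two partitions per drive, counted drive by drive. Here is a sketch of that mapping, assuming drive 0 is /dev/sda and drive 1 is /dev/sdb (which is consistent with "Boot partition is 3" matching root=/dev/sdb2). The function name `boot_partition_to_device` is hypothetical.

```python
def boot_partition_to_device(index, drives=("sda", "sdb")):
    """Translate an mxoparam boot partition index (0-3) into a device path.

    Assumption: drive 0 is sda and drive 1 is sdb.
    """
    if index not in range(4):
        raise ValueError("boot partition index must be 0, 1, 2 or 3")
    drive, part = divmod(index, 2)  # 0,1 -> drive 0; 2,3 -> drive 1
    return f"/dev/{drives[drive]}{part + 1}"


print(boot_partition_to_device(3))  # the current setting
print(boot_partition_to_device(1))  # the setting we want instead
```

So the value to pass to `mxoparam -q` for /dev/sda2 is 1.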
First of all, let's make sure /dev/sda2 is exactly the same as /dev/sdb2.
~ # dd if=/dev/sdb2 of=/dev/sda2
~ # mount -n /dev/sda2 /mnt/__mxo_sda2 -t ext3
~ # cp -a /mnt/__mxo_sdb2 /mnt/__mxo_sda2
Now let's set the new boot partition.
~ # mxoparam -q 1
REBOOT!
...
Let's check it's booting up from the right place now.
~ # cat /proc/cmdline
console=ttyS0,115200 root=/dev/sda2 rw
Right! We're ready to fix /dev/sdb2 and /dev/sdb5 now!
~ # fsck -v /dev/sdb2
fsck 1.38 (30-Jun-2005)
e2fsck 1.38 (30-Jun-2005)
/dev/sdb2 has gone 384 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb2: ***** FILE SYSTEM WAS MODIFIED *****
3157 inodes used (4%)
14 non-contiguous inodes (0.4%)
# of inodes with ind/dind/tind blocks: 159/0/0
24622 blocks used (38%)
0 bad blocks
1 large file
2317 regular files
179 directories
246 character device files
84 block device files
0 fifos
7 links
321 symbolic links (321 fast symbolic links)
0 sockets
--------
3154 files
~ # fsck -v /dev/sdb5
fsck 1.38 (30-Jun-2005)
e2fsck 1.38 (30-Jun-2005)
/dev/sdb5: recovering journal
/dev/sdb5 has been mounted 35 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb5: ***** FILE SYSTEM WAS MODIFIED *****
2939 inodes used (2%)
28 non-contiguous inodes (1.0%)
# of inodes with ind/dind/tind blocks: 159/0/0
26690 blocks used (21%)
0 bad blocks
1 large file
2345 regular files
189 directories
47 character device files
40 block device files
0 fifos
8 links
308 symbolic links (308 fast symbolic links)
0 sockets
--------
2937 files
Now we can rebuild the RAID array.
~ # mdadm --manage --add /dev/md0 /dev/sdb6
mdadm: hot added /dev/sdb6
NAS:~ # mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Sat May 5 06:30:50 2007
Raid Level : raid1
Array Size : 487106752 (464.54 GiB 498.80 GB)
Device Size : 487106752 (464.54 GiB 498.80 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat May 23 14:39:50 2009
State : clean, degraded, recovering
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
Rebuild Status : 0% complete
UUID : 7ff7415e:4719112d:d63dd33d:40ff685f
Events : 0.541445
Number Major Minor RaidDevice State
0 0 0 - removed
1 8 6 1 active sync /dev/sda6
2 8 22 0 spare rebuilding /dev/sdb6
~ # cat /proc/mdstat
Personalities : [linear] [raid1]
md0 : active raid1 sdb6[2] sda6[1]
487106752 blocks [2/1] [_U]
[>....................] recovery = 1.5% (7396736/487106752) finish=145.1min speed=55092K/sec
unused devices: [none]
Good... 2.5 hours later...
~ # cat /proc/mdstat
Personalities : [linear] [raid1]
md0 : active raid1 sdb6[0] sda6[1]
487106752 blocks [2/2] [UU]
unused devices: [none]
Reboot again and we're done!
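As a footnote, the finish=145.1min estimate in the first /proc/mdstat snapshot can be reproduced from the other numbers on the same line: the blocks still to sync divided by the current speed. The sketch below assumes the block counts and the speed use the same unit (1 KiB blocks and K/sec), and `recovery_eta_minutes` is a helper name invented for this example.

```python
import re

# Progress line copied from the first /proc/mdstat snapshot above.
MDSTAT_LINE = ("[>....................] recovery = 1.5% "
               "(7396736/487106752) finish=145.1min speed=55092K/sec")


def recovery_eta_minutes(line):
    """Estimate the remaining minutes from the (done/total) blocks and the speed.

    Assumption: blocks are 1 KiB, so they divide cleanly by a K/sec speed.
    """
    match = re.search(r"\((\d+)/(\d+)\).*speed=(\d+)K/sec", line)
    if match is None:
        return None
    done, total, speed = map(int, match.groups())
    return (total - done) / speed / 60


print(round(recovery_eta_minutes(MDSTAT_LINE), 1))
```

The estimate only holds while the speed stays constant, which fits the "2.5 hours later" it actually took.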

