keenan1
Aspirant
Dec 09, 2011

Unable to access the data volume after disk replacement

Summary:
After a hard drive failure and hot-swap, all volumes have disappeared. I'm not certain, but I believe this was caused by SMART errors on a second drive while the replacement drive was resyncing. Please guide me on how to recover my data.

Details:
I have a ReadyNAS Pro Business Edition with 3 Seagate ST31000340NS disks, all in the top row with the bottom row empty. The middle drive failed right after a period of erratic electricity during a storm, so I got a new drive from Seagate and hot-swapped it into the NAS. Synchronization was proceeding when I went to bed, but in the morning the system appeared hung and wouldn't serve files or the web admin pages. After a reboot I got messages about the volumes not being mounted:
The paths for the shares listed below could not be found. Typically, this occurs when the ReadyNAS is unable to access the data volume. media private backup

I powered off, physically reseated the drives in the rack, and rebooted, with the same errors. Looking back through the logs shows another disk having problems during the resync:
ATA error count has increased in the last day. Disk 3: Previous count: 10 Current count: 17 Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.

I've filed a support request with NETGEAR (case #17340895) including the "all logs" in case any insiders want to look at them. Here's an interesting excerpt from dmesg:
    md: bind<sdb5>
md: bind<sdc5>
md: bind<sda5>
md: kicking non-fresh sdc5 from array!
md: unbind<sdc5>
md: export_rdev(sdc5)
md/raid:md2: device sda5 operational as raid disk 0
md/raid:md2: allocated 3222kB
md/raid:md2: not enough operational devices (2/3 failed)
RAID conf printout:
--- level:5 rd:3 wd:1
disk 0, o:1, dev:sda5
disk 1, o:1, dev:sdb5
md/raid:md2: failed to run raid set.
md: pers->run() failed ...
md: md2 stopped.
md: unbind<sda5>
md: export_rdev(sda5)
md: unbind<sdb5>
md: export_rdev(sdb5)
md: bind<sdb5>
md: bind<sdc5>
md: bind<sda5>
md: kicking non-fresh sdc5 from array!
md: unbind<sdc5>
md: export_rdev(sdc5)
md/raid:md2: device sda5 operational as raid disk 0
md/raid:md2: allocated 3222kB
md/raid:md2: not enough operational devices (2/3 failed)
RAID conf printout:
--- level:5 rd:3 wd:1
disk 0, o:1, dev:sda5
disk 1, o:1, dev:sdb5
md/raid:md2: failed to run raid set.
md: pers->run() failed ...
md: md2 stopped.
md: unbind<sda5>
md: export_rdev(sda5)
md: unbind<sdb5>
md: export_rdev(sdb5)
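
To make the excerpt above easier to read, here is a minimal Python sketch that pulls out which members were bound, which one was kicked as non-fresh, and the failure count the kernel reports. The `DMESG` string holds lines copied from the first assembly attempt; the function name `summarize_md_log` and the regular expressions are illustrative assumptions, not part of mdadm or the ReadyNAS firmware.

```python
import re

# Lines copied from the first assembly attempt in the dmesg excerpt above.
DMESG = """\
md: bind<sdb5>
md: bind<sdc5>
md: bind<sda5>
md: kicking non-fresh sdc5 from array!
md: unbind<sdc5>
md: export_rdev(sdc5)
md/raid:md2: device sda5 operational as raid disk 0
md/raid:md2: not enough operational devices (2/3 failed)
"""


def summarize_md_log(text):
    """Return (bound, kicked, failed, total) parsed from md kernel messages.

    Hypothetical helper: the patterns only cover the message shapes seen here.
    """
    bound = re.findall(r"md: bind<(\w+)>", text)
    kicked = re.findall(r"md: kicking non-fresh (\w+) from array", text)
    match = re.search(r"not enough operational devices \((\d+)/(\d+) failed\)", text)
    failed, total = (int(match.group(1)), int(match.group(2))) if match else (None, None)
    return bound, kicked, failed, total


bound, kicked, failed, total = summarize_md_log(DMESG)
print("bound:", bound)
print("kicked as non-fresh:", kicked)
print(f"kernel counts {failed} of {total} members as failed")
```

Read this way, sdc5 is dropped as out of date, and the kernel counts a second member as unusable as well. That second member is probably sdb5, the replacement drive whose rebuild had not finished, but the log excerpt alone does not say so explicitly.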


And here's an mdadm --examine run:
    BigBoy:~# mdadm --examine /dev/sd{a,b,c}5
/dev/sda5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6853def:797f9ef9:ca681632:da33b4ee
Name : 001F33EA1505:2
Creation Time : Wed Mar 25 12:56:39 2009
Raid Level : raid5
Raid Devices : 3

Avail Dev Size : 1944082604 (927.01 GiB 995.37 GB)
Array Size : 3888165184 (1854.02 GiB 1990.74 GB)
Used Dev Size : 1944082592 (927.01 GiB 995.37 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : edf65da2:f26cd2b1:934dde1c:a5630a71

Update Time : Tue Nov 29 01:03:38 2011
Checksum : d12f70cf - correct
Events : 5327

Layout : left-symmetric
Chunk Size : 16K

Device Role : Active device 0
Array State : AA. ('A' == active, '.' == missing)
/dev/sdb5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x2
Array UUID : a6853def:797f9ef9:ca681632:da33b4ee
Name : 001F33EA1505:2
Creation Time : Wed Mar 25 12:56:39 2009
Raid Level : raid5
Raid Devices : 3

Avail Dev Size : 1944082596 (927.01 GiB 995.37 GB)
Array Size : 3888165184 (1854.02 GiB 1990.74 GB)
Used Dev Size : 1944082592 (927.01 GiB 995.37 GB)
Data Offset : 280 sectors
Super Offset : 8 sectors
Recovery Offset : 1834784256 sectors
State : clean
Device UUID : 1109e61e:923e5f5b:99156607:aabeef37

Update Time : Tue Nov 29 01:03:38 2011
Checksum : 160e0da4 - correct
Events : 5327

Layout : left-symmetric
Chunk Size : 16K

Device Role : Active device 1
Array State : AA. ('A' == active, '.' == missing)
/dev/sdc5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6853def:797f9ef9:ca681632:da33b4ee
Name : 001F33EA1505:2
Creation Time : Wed Mar 25 12:56:39 2009
Raid Level : raid5
Raid Devices : 3

Avail Dev Size : 1944082604 (927.01 GiB 995.37 GB)
Array Size : 3888165184 (1854.02 GiB 1990.74 GB)
Used Dev Size : 1944082592 (927.01 GiB 995.37 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 79947379:b8846910:703730f6:78380309

Update Time : Tue Nov 29 01:01:06 2011
Checksum : 782ae438 - correct
Events : 5324

Layout : left-symmetric
Chunk Size : 16K

Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing)
BigBoy:~#
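
The fields that matter in the output above are the event counts and the recovery offset. The following Python sketch compares them across the three members; it assumes the `Key : Value` layout shown above, and `parse_examine` and `leading_int` are hypothetical helper names. Only selected fields are copied into `EXAMINE`, and treating the recovery offset divided by the used device size as rebuild progress is an approximation.

```python
# Selected fields copied from the mdadm --examine output above.
EXAMINE = """\
/dev/sda5:
Raid Devices : 3
Array Size : 3888165184 (1854.02 GiB 1990.74 GB)
Used Dev Size : 1944082592 (927.01 GiB 995.37 GB)
Events : 5327
/dev/sdb5:
Used Dev Size : 1944082592 (927.01 GiB 995.37 GB)
Recovery Offset : 1834784256 sectors
Events : 5327
/dev/sdc5:
Events : 5324
"""


def parse_examine(text):
    """Map each device path to a dict of its 'Key : Value' fields."""
    devices, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("/dev/") and line.endswith(":"):
            current = devices.setdefault(line[:-1], {})
        elif current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def leading_int(value):
    """Return the first whitespace-separated token of a field as an int."""
    return int(value.split()[0])


devices = parse_examine(EXAMINE)

# Members whose event count lags the newest one are treated as stale by md.
events = {dev: leading_int(f["Events"]) for dev, f in devices.items()}
newest = max(events.values())
behind = {dev: newest - n for dev, n in events.items() if n < newest}

# A "Recovery Offset" field marks a member that was still being rebuilt.
rebuilding = {
    dev: leading_int(f["Recovery Offset"]) / leading_int(f["Used Dev Size"])
    for dev, f in devices.items()
    if "Recovery Offset" in f
}

# RAID 5 usable capacity is (members - 1) x per-member size, in sectors here.
sda = devices["/dev/sda5"]
expected_array_size = (leading_int(sda["Raid Devices"]) - 1) * leading_int(sda["Used Dev Size"])

print("events:", events)
print("behind the newest event count:", behind)
print("partially rebuilt (fraction recovered):", rebuilding)
print("array size consistent:", expected_array_size == leading_int(sda["Array Size"]))
```

The comparison shows sdc5 three events behind sda5 and sdb5, which matches the "kicking non-fresh sdc5" message in dmesg, while sdb5 carries a recovery offset and so was only partially rebuilt when the array stopped.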


Any help would be appreciated. Thanks,
--keenan

Notes:
    I've done no writes to the NAS since attempting to resync with the new drive.
    I made a complete backup of the NAS when the disk first failed, so I could do a full reload if necessary, but I would prefer to recover the data, as there were a few minor changes after the backup.
    I have root access to the NAS, based on one of the Jedi's (Chirpa?) directions back when I was working with support to analyze some SlimServer problems a few years ago.
    I've read various web pages, like
    http://www.mysqlperformanceblog.com/201 ... id5-array/
    http://www.linuxquestions.org/questions ... ay-416853/
    and feel competent to attempt recovery *under your direction*.
    The battery on my UPS failed before the storm, so you'll see lots of those messages in the log. The battery has now been replaced.
    The drive in the 3rd slot was replaced about a year ago after it failed, and the resync went fine at that point.

2 Replies

  • mdgm-ntgr
    NETGEAR Employee Retired
    When a disk is added (e.g. to replace a failed disk), a resync occurs. A resync puts heavy stress on all disks, so if a second disk is failing, the resync can finish it off. I would suggest you work with NETGEAR tech support on this issue. It's good practice to include the case number in the title of the thread (i.e. the subject of the first post in the thread).
  • Kudos on the backup. Making one can be a major PITA, and restoring from one is another, but at least you will have data to recover. Hopefully tech support will resolve this situation, but in case they can't, this is the reason we promote the creation and maintenance of a backup.
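
The first reply's point about resync stress follows from how RAID 5 parity works: a replaced member is rebuilt by reading every surviving member in full. Here is a minimal Python sketch of that idea for a 3-disk stripe, assuming plain XOR parity. The block contents and the helper name `xor_blocks` are made up for illustration, and the sketch ignores the rotating left-symmetric layout the real array uses.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 3-disk RAID 5: two data blocks plus one parity block.
d0 = b"media123"  # illustrative contents, not real data
d1 = b"private4"
parity = xor_blocks(d0, d1)

# Losing one member is recoverable: XOR the two survivors to rebuild it.
rebuilt_d1 = xor_blocks(d0, parity)
print("single-disk rebuild ok:", rebuilt_d1 == d1)

# Losing two members leaves a single block, which cannot determine the others.
```

With only one trustworthy member out of three, there is nothing to XOR against, which is why the kernel refuses to start the array rather than guess.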
