Forum Discussion

ivor999
Aspirant
Jun 21, 2014

status: inactive - array degraded - disk sync failure

Hello forum:

Looking for help/advice on how to (or whether I can) preserve my data on the NAS.
I have a ReadyNAS Duo v2 with two 3TB Seagate hard drives in it, configured as X-RAID I believe.
Disk 1 is a replacement for a 3TB drive in the X-RAID volume that failed recently, BUT it doesn't look like the sync completed successfully, perhaps because disk 2 is failing and reporting uncorrectable errors. Disk 1 status is showing as inactive: spare.
And the RAID volume displays a warning that redundancy is lost and a failed disk will mean a dead array.

Is there any way to force a 'successful' sync completion from disk 2 to disk 1 of all the data that can be read successfully?

I have shut down the system until I know what action to take, since it is probably a matter of days (I hope not hours) until disk 2 simply dies on me. Although I have several computers, I think only one will support >2TB drives in its BIOS, and that one has all SATA channels in use. So it is not easy to take the new 3TB drive, format it in another machine, and back up the (degraded) RAID data to it until disk 2 fails and I get a replacement in for that too. If I have to, I can try this route. It would be much better if I could get the best sync possible in the NAS.

I started a support case with Netgear about this and got a fast initial response but it did not immediately seem to answer my technical concerns. The case number is 23399455 and I will copy that dialog below here as it has I think full details. Although my RAID is shutdown I will have to act fast when I power on again. Any tips much appreciated.
Reading through this forum I have seen some suggestions on identifying the sectors with errors on the failing disk and zeroing them out. But am unsure where these tools would be. Ah is it by ssh into the device and using linux commandline? As stated any advice much appreciated.

Here is the case note copy:

Hello:

I had 2 seagate 3tb drives in my readynas duo system
1 drive – disk 1 - failed. The other drive – disk 2 – reported increasing uncorrectable errors and impending failure.
I took the system offline for 20 – 30 days while I processed a warranty replacement for disk 1.
I realize I will need to do the same for disk 2 shortly once it fails completely. However I want to be sure that all the data (or all that is readable) has been copied to the new replacement for disk 1 drive.


A refurbished Seagate 3 tb drive arrived yesterday and I inserted it into the system as disk 1 and powered up
I expected the new drive to be sync’d to existing drive data and sync messages DID display
However the status for new disk 1 is SPARE INACTIVE not status OK

The logs show the following

Sat Jun 21 16:45:28 WEST 2014 Detected increasing ATA errors on disk 2[ST3000DM001-1CH166, Z1F246BN] 37 times in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
Sat Jun 21 14:58:55 WEST 2014 RAID sync finished on volume C. The array is still in degraded mode, however. This can be caused by a disk sync failure or failed disks in a multi-parity disk array.
Sat Jun 21 14:58:18 WEST 2014 Detected increasing uncorrectable errors[24] on disk 2 [ST3000DM001-1CH166, Z1F246BN] in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
Sat Jun 21 08:43:12 WEST 2014 System is up.
Sat Jun 21 08:42:23 WEST 2014 Data volume will be rebuilt with disk 1.
Mon May 26 14:46:20 WEST 2014 Detected increasing uncorrectable errors[24] on disk 2 [ST3000DM001-1CH166, Z1F246BN] in the past 30 days. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
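
For anyone who wants to pull the error counts out of a log like this rather than read them by eye, here is a minimal Python sketch. The two regular expressions are assumptions based only on the warning lines quoted above; they are not a documented ReadyNAS log format, and the function name is made up for illustration.

```python
import re

# First sentence of each warning line from the log above.
LOG_LINES = [
    "Sat Jun 21 16:45:28 WEST 2014 Detected increasing ATA errors on disk 2[ST3000DM001-1CH166, Z1F246BN] 37 times in the past 30 days.",
    "Sat Jun 21 14:58:18 WEST 2014 Detected increasing uncorrectable errors[24] on disk 2 [ST3000DM001-1CH166, Z1F246BN] in the past 30 days.",
    "Mon May 26 14:46:20 WEST 2014 Detected increasing uncorrectable errors[24] on disk 2 [ST3000DM001-1CH166, Z1F246BN] in the past 30 days.",
]

# Assumed patterns, inferred from the three lines above only.
ATA_RE = re.compile(
    r"ATA errors on disk (\d+)\s*\[([^,\]]+),\s*([^\]]+)\]\s*(\d+) times")
UNC_RE = re.compile(
    r"uncorrectable errors\[(\d+)\] on disk (\d+)\s*\[([^,\]]+),\s*([^\]]+)\]")


def summarize_disk_warnings(log_text):
    """Return {disk_number: {model, serial, ata_errors, uncorrectable}}.

    For each counter the highest value seen in the log is kept.
    """
    disks = {}

    def entry_for(disk, model, serial):
        return disks.setdefault(int(disk), {
            "model": model, "serial": serial,
            "ata_errors": 0, "uncorrectable": 0,
        })

    for line in log_text.splitlines():
        m = ATA_RE.search(line)
        if m:
            disk, model, serial, count = m.groups()
            e = entry_for(disk, model, serial)
            e["ata_errors"] = max(e["ata_errors"], int(count))
            continue
        m = UNC_RE.search(line)
        if m:
            count, disk, model, serial = m.groups()
            e = entry_for(disk, model, serial)
            e["uncorrectable"] = max(e["uncorrectable"], int(count))
    return disks


if __name__ == "__main__":
    for disk, info in summarize_disk_warnings("\n".join(LOG_LINES)).items():
        print(disk, info)
```

Note that the uncorrectable count is the same (24) on May 26 and June 21, so by this log it did not grow while the unit was offline.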


My questions are simple.
What is the meaning of status: spare inactive ?
Has all my data (that is readable) been copied from disk 2 to disk 1?
If I take out disk 2 and process a return with Seagate and insert the new disk back in the system will it rejoin the RAID and sync with disk 1?

Is there any action I should take to force a best-case sync and bring disk 1 to status OK before disk 2 fails on me completely?
Thanks for any help or pointers to docs.
Ivor

================================== response from Netgear Joven (fast within 24 hours I think)

Mr. Neckles,

Inactive spare simply indicates that the disk has fallen out of the RAID. Possibly disk 1 has not fully synced with the existing disk. I would like to verify whether you still have access to your data on the setup you have now?

For more information on your query, NETGEAR FAQs are available at:

http://support.netgear.com


Sincerely,

Joven Domopoy
Level 2 Support Expert
NETGEAR, Inc.
http://support.netgear.com

===================================================== my response to this

Thanks for your response Joven:

1) I do have access to the data for the moment BUT there is a warning that there is no redundancy and a disk failure will result in a dead array and no access to data.
2) There is also the warning that disk 2 is about to fail. And when it does I WILL lose access to all data IF disk 1 has not sync'd
3) I am asking what disk 1 -- status: inactive means.
4) and what is the meaning of the log message saying [RAID sync finished on volume C. The array is still in degraded mode, however. This can be caused by a disk sync failure or failed disks....]

It looks like the disk sync to disk 1 has NOT completed. Possibly due to uncorrectable errors on the failing disk 2.
How can I force the sync to complete on disk 1 with as much data as it can successfully read off disk 2 --- so that when disk 2 fails ---disk 1 can continue to serve as much data as it could sync.

Thanks for your time
Regards
Ivor

5 Replies

    StephenB
    Guru - Experienced User
    Do you recall the SMART stats for drive 2? I am wondering if you still have only one uncorrectable error.

    While you are waiting for support I would try running the seatools diag on disk 2, and perhaps check the SMART stats on the windows PC when it finished. (Acronis disk monitor is one of several free tools that will show them to you). Perhaps also get a USB enclosure or adapter kit, to make it easier to test the disks. that would also let you make a backup. Something like this perhaps: http://www.amazon.com/SABRENT-5-25-INCH ... ta+adapter
  • StephenB wrote:
    Do you recall the SMART stats for drive 2? I am wondering if you still have only one uncorrectable error.

    While you are waiting for support I would try running the seatools diag on disk 2, and perhaps check the SMART stats on the windows PC when it finished. (Acronis disk monitor is one of several free tools that will show them to you). Perhaps also get a USB enclosure or adapter kit, to make it easier to test the disks. that would also let you make a backup. Something like this perhaps: http://www.amazon.com/SABRENT-5-25-INCH-Converter-Activity-USB-DSC8/dp/B008S08D9E/ref=sr_1_5?s=electronics&ie=UTF8&qid=1403434282&sr=1-5&keywords=sabrent+usb+3+to+sata+adapter


    Thanks for the response.

    I have not yet seen SMART stats for the drive. However the log implies 24 uncorrectable errors I believe.

    My biggest challenge is that these drives are 3TB, and I have read a bit about needing a certain kind of support in the BIOS (EFI/UEFI?) for them to be seen correctly at full size. Apart from my NAS devices I have only one PC whose BIOS might have EFI/UEFI, and that is my main machine, with all 6 SATA channels in use. Had I had a spare SATA channel on this machine, and IF the BIOS already had EFI/UEFI enabled, I would already have tested the drives on this box. However, I would have to be desperate before I would risk changing anything in the configuration of my main machine.

    It is possible that the 3TB drives could be connected in one of my older machines and only 2TB recognized. If this will allow me to see SMART stats I may test it. However my main goal would be to check the integrity of data or use the drive as a target for backup so just getting the SMART data doesn't really help. I have no doubt the drive reporting failures is really dying. I already had one report these errors and then die in the NAS device and then be accepted by Seagate for warranty replacement.

    Thanks for the suggestion of the SATA-USB adaptor kit or enclosure. I could not see from its specs whether it supports >2TB disks or not. The last enclosure I bought has a max of 2TB as I am sure do my existing adaptors. My 'budget controller' will not be allowing me to buy any gadgets to help fix this though :wink:

    I am aware of Acronis and Seatools and often use rescue CDs like Ultimate Boot or Hiren. Problem is a safe test bed.

    I do have a spare external 1TB drive, another 1TB I can free up, and a couple of other drives with 500GB or so spare. So in the worst case I can boot the ReadyNAS system as is and try to copy all the data off onto the distributed spare space I have. I hope it stays up long enough for me to do that. Then, once I have backed everything up, I can either wait until the drive fails or put some process in place to exercise it and drive it to failure faster (if that would work), and then see what the NAS does with the inactive: spare. Whether or not it has my data on it once it becomes a primary active drive in the RAID, I can reload backups to it. This will be time-consuming and a royal pain, but it is my fallback low-tech solution.

    What would be better and what I wonder if any one here knows is
    assume I ssh into the NAS device
    and my target is to use linux commands on the RAID members to force a full sync
    a) can the NAS support this
    b) what commands do I need to use

    The NAS itself has allowed me to schedule a volume check on next reboot.

    But I saw somewhere in these forums about using I guess linux commands (? dd ?) to
    c) identify the unreadable sectors on raid member disk 2 and
    d) write zeros into these sectors forcing the drive to remap the bad sectors to good sectors
    e) force the re-sync which succeeds because the problem sectors have been zero-filled??

    Any files that contain the zero--filled unreadable sectors will be damaged
    I may not have understood this recovery process completely correctly
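
    To make the "identify the unreadable sectors" part concrete, here is a minimal read-only Python sketch. It is an illustration under assumptions, not a ReadyNAS tool: it assumes the disk is attached to a Linux machine that has Python 3 (nothing in this thread says the NAS itself does), the function name is made up, and it deliberately does not write zeros or anything else to the disk.

```python
import os


def find_unreadable_blocks(path, block_size=4096):
    """Read `path` block by block and return (size_in_bytes, bad_offsets).

    Read-only: nothing is written, so no sector is zeroed or remapped.
    `path` can be a regular file or, on Linux and run as root, a block
    device or partition.
    """
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for files and Linux block devices.
        size = os.lseek(fd, 0, os.SEEK_END)
        for offset in range(0, size, block_size):
            try:
                os.pread(fd, block_size, offset)
            except OSError:
                # An unreadable sector surfaces as an I/O error on read.
                bad_offsets.append(offset)
    finally:
        os.close(fd)
    return size, bad_offsets


if __name__ == "__main__":
    import sys
    total, bad = find_unreadable_blocks(sys.argv[1])
    print(f"{total} bytes scanned, {len(bad)} unreadable block(s)")
    for off in bad:
        print(f"  unreadable block at byte offset {off}")
```

    Every read puts more load on a drive that is already failing, and a full pass over 3TB takes hours, so copying the important data off first is the safer order.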

    But unless the ReadyNAS software itself has tools to help with something like this I am pretty much on my own.

    Saw this link elsewhere in this forum probably
    http://linoxide.com/linux-how-to/how-to-fix-repair-bad-blocks-in-linux/
    which explains how to use linux command line to deal with hard drive recovery.

    doesn't give info on whether
    smartmontools
    is built into the ReadyNAS linux distribution
    or what equivalent might be available

    nor does it give info on what command(s) needed to shut down or pause NAS services while the disks are being worked on.

    finally what ?NETGEAR package ? handles X-RAID and is there a command to force/trigger re-sync after trying above sector rescue
    Or can the ReadyNAS admin interface be used to trigger the re-sync

    Thanks again for the suggestions so far.
    Ivor
    StephenB
    Guru - Experienced User
    ivor999 wrote:
    I have not yet seen SMART stats for the drive. However the log implies 24 uncorrectable errors I believe.
    That is quite a few, and personally I'd replace a disk with that many. Though knowing if there are also pending sector or reallocated sector counts would be useful.

    On the 3 TB issue, you can purchase a relatively inexpensive USB 3.0 to SATA converter that would let you check out one drive on the newer PC.
  • StephenB wrote:
    ivor999 wrote:
    I have not yet seen SMART stats for the drive. However the log implies 24 uncorrectable errors I believe.
    That is quite a few, and personally I'd replace a disk with that many. Though knowing if there are also pending sector or reallocated sector counts would be useful.

    On the 3 TB issue, you can purchase a relatively inexpensive USB 3.0 to SATA converter that would let you check out one drive on the newer PC.



    Thanks StephenB
  • Thanks for the help.

    Situation now resolved and am posting what worked for me in case useful for anyone else.

    The key was to schedule the volume scan check or something similar on restart.
    Can't find where I saw it, but it might have been on the restart interface itself.

    Anyway, after scheduling this check, when I booted up the system to start backing data off it I found that fsck/e2fsck had run and fixed a load of errors on the problem drive
    and this had allowed the sync to complete successfully!

    So instead of a degraded array with one drive as status: spare
    I now had a redundant array with 2 drives showing status: OK !!!!
    Great!!!

    Am running a backup of key information from the drive anyway as I am sure the problem drive is going to fail on me.
    The sooner the better as once it fails completely I can get it replaced by Seagate.
    Theoretically I should be entitled to get it replaced now but would have to demonstrate with Seagate tools or similar that the drive was beyond recovery. Simplest to wait until it dies so there is no argument.

    So to summarize:
    One of 2 drives in my array failed and the remaining drive started to show errors
    Adding a new good drive to replace the completely failed drive did not obtain a successful sync and redundant array
    the solution for me was to restart systems and schedule a volume scan on restart when requesting restart
    this reported fixing loads of errors on the failing drive
    and the good drive was able to complete sync with failing drive with its errors fixed
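
    A way to confirm the redundant state from the command line, sketched under assumptions: the thread does not say so, but I am assuming the NAS uses standard Linux software RAID (md) and that /proc/mdstat is readable over ssh. The sample text below is illustrative, not captured from my unit, and the function names are made up. In /proc/mdstat, `[2/1] [_U]` means one of two members is missing, and `(S)` after a member marks a spare.

```python
import re

# Illustrative /proc/mdstat content for a degraded two-disk mirror (assumed).
SAMPLE_MDSTAT = """\
Personalities : [raid0] [raid1]
md2 : active raid1 sda3[2](S) sdb3[1]
      2925544256 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""

HEADER_RE = re.compile(r"^(md\d+)\s*:\s*(\w+)\s+(\w+)\s+(.*)$")
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def parse_mdstat(text):
    """Return {array: {state, level, spares, expected, active, flags}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            name, state, level, members = header.groups()
            current = {
                "state": state,
                "level": level,
                # Members tagged "(S)" are spares, e.g. "sda3[2](S)".
                "spares": [m.split("[")[0] for m in members.split()
                           if m.endswith("(S)")],
                "expected": None, "active": None, "flags": None,
            }
            arrays[name] = current
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            current["expected"] = int(status.group(1))
            current["active"] = int(status.group(2))
            current["flags"] = status.group(3)
    return arrays


def is_degraded(array):
    """True when fewer members are active than the array expects."""
    return array["active"] is not None and array["active"] < array["expected"]


if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE_MDSTAT).items():
        print(name, "DEGRADED" if is_degraded(info) else "OK", info)
```

    On a healthy mirror the status line would read `[2/2] [UU]` with no member marked `(S)`.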



    Thanks again for the help.
    Hope this is useful to someone else

    Regards
    Ivor
