Forum Discussion
forty_green (Aspirant)
Nov 06, 2014
Rescue from failing drives #24146691
My 5-year-old NV+ recently showed "Kernel panic" on the display and was inaccessible through Windows Explorer, Frontview etc. It did not respond to the power button, but after pulling the plug I managed to reboot it. It resynced and I was then able to reach Frontview and run diagnostics. It passed all diagnostics except for the two original Seagate 1TB drives, which are running in redundant X-RAID mode. The drives in bays 1 and 2 have 679 and 2127 reallocated sectors respectively, so I am assuming they are likely to fail imminently. Disk_smart.log reports drive 2 is "failing now" and "drive failure expected within 24 hours". So I have powered the ReadyNAS down and ordered 2x WD 2TB Red drives (as recommended elsewhere on the forum) to replace the Seagates, plus a 2TB WD My Passport for USB back-up of the ReadyNAS, assuming I can rescue it.
My question is: What is the least risky sequence of rescue actions to protect my data? Plan A would be to pull drive 2 first and replace with a new drive, on the basis that drive 1 is the least sick so least likely to fail to resync. Once synching to the first new drive is complete, I would repeat the operation with drive 1, job done. Is it safer to do each swap in powered down state? Or, (Plan B), should I first plug the USB drive and back up everything to that - perhaps that operation is less stressful to the HDDs than resyncing? Or is there a better plan I haven't thought of? This is the first time I have attempted any drive swapping, so I'm stepping on eggshells. Obviously I am concerned that the whole setup might crash losing my data. Unfortunately, I have not been backing up the NAS, so have left my data rather vulnerable (lesson learnt!) Thanks for all and any advice (please assume minimal knowledge of ReadyNAS).
12 Replies
- mdgm-ntgr (NETGEAR Employee Retired)
You could try cloning the disk in bay 1 using dd_rescue. With the number of errors on disks 1 and 2 I don't think either disk is likely to survive a resync.
- forty_green (Aspirant)
Thanks for your rapid response. I searched dd_rescue and most articles are too technical for me. Sounds like it's an executable program. Can you tell me more explicitly what you have in mind? I'm not at all clear whether you propose I can do this in situ to a new drive in a spare bay in the NAS, or in a drive enclosure, etc. Also, another thread I just found (http://www.readynas.com/forum/viewtopic.php?f=65&t=75358) suggests I should back up to USB first anyway. Which would you do first - USB back up or dd_rescue? I won't respond again tonight - it's 23:10 UK time. Thanks again in advance.
- mdgm-ntgr (NETGEAR Employee Retired)
If your data is important to you, this is the kind of situation where I would consider the paid support options offered to you. You are precariously close to losing all your data.
You could try backing up to USB first (don't write to the NAS), but if the NAS hangs then it's likely that you will have no alternative but to clone one of the drives (probably disk 1).
We'd recommend replacing disks when the reallocated sector count exceeds 50. Both of your disks are way above that. Do you have email alerts set up? If so, you should have been getting emails about at least one of the disks for a while.
readysecure1985 wrote:
You could always use Knoppix to clone the drive to a known good drive and then place it back in the device. Keep in mind that the known good drive should be on the HCL. After successfully cloning with Knoppix, you can place the good drive in the NAS and power on. If all goes well, it will be as though the drive did not have any issues.
Here is a simple guide to quickly recover a failed drive using dd_rescue.
I often have to deal with pesky failed drives, so here is a quick simple guide how to achieve this with a free Linux Live CD and a PC with two SATA connections.
I will be using a Knoppix 6.2 Live CD for this guide. Can be found at http://www.knoppix.net
Using dd_rescue command allows you to copy data from one drive to another block for block. This is especially useful for recovering a failed drive. Often when a drive fails, the drive is still accessible, it has just surpassed the S.M.A.R.T. error threshold. dd_rescue allows you to ignore the bad sectors and continue cloning the bad drive to a new healthy drive.
1) Connect your old drive and new drive to your PC
2) Boot up using your Linux live CD
3) Launch a terminal window.
4) Run fdisk -l to make sure the system sees both of the hard drives.
5) Run hdparm -i /dev/sdx on both of the drives to find which drive is your source drive and which drive is your destination drive
6) Once you know which drive is which you can start the clone process.
dd_rescue /dev/sdX /dev/sdY   (where /dev/sdX is the source disk and /dev/sdY is the destination drive)
7) You will see the process start, just keep an eye on it, it might take a few hours for the clone job to finish, depending on the size of the drive.
Once the process is complete, there will be no notification, the transfer will just stop and you will see the terminal prompt again.
If you see a lot of errors, or the succxfer: value stops increasing, it means the drive got marked faulty by the kernel. At this point reboot the system and check again which drive is which, as the lettering might have switched. Run the dd_rescue command again, but this time with the -r option. This restarts the cloning from the back of the drive and picks up the data that has not been cloned yet.
Can you send your logs to me (see the link in my signature)?
- forty_green (Aspirant)
I've PMd you with text pasted from what look to be the relevant logs. Also I attached a zip file of all logs, exactly as created by the diagnostics in Front View, as indicated in Useful Links. I'm just waiting for my new drives to be delivered by mail and the NAS remains powered down. Depending on your assessment of the logs, I'm minded to try the USB backup first. If that fails then I'll resort to dd_rescue as you recommend. I have just realised from another thread entitled "Will my NV+ need a factory reset" that I may need to reset too, because I'll be inserting 2Tb disks and don't think I have 4k sector alignment as start sectors are not divisible by 8, in spite of running RAIDiator 4.1.7. Is that correct?
Disk /dev/hdc: 999.9 GB, 999991611392 bytes
255 heads, 63 sectors/track, 121575 cylinders, total 1953108616 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000
   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               2     4096001     2048000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/hdc2         4096002     4608001      256000   82  Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/hdc3         4608002  1953092233   974242116    5  Extended
/dev/hdc5         4608003  1953092233   974242115+  8e  Linux LVM
Disk /dev/hde: 999.9 GB, 999991611392 bytes
255 heads, 63 sectors/track, 121575 cylinders, total 1953108616 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000
If so, it seems the strategy would then be to restore from USB backup to the two brand new drives and forget about synching from the old failing ones. Thanks again
- forty_green (Aspirant)
Your other question: yes, I did have email alerts set up. But they stopped and I forgot to investigate why. I'll reinstate them as a priority as part of the recovery process. There were messages almost from new about my Seagate Barracuda drives, so perhaps I should be surprised they've survived 5 years. I hope the WD Reds will be more durable and will check out how to optimise them for NAS.
- mdgm-ntgr (NETGEAR Employee Retired)
Ah, so you have just the two disks in the NAS.
Yes, you don't have 4k sector partition alignment. A factory reset (wipes all data, settings, everything) on 4.1.7 or later is needed to get 4k sector partition alignment. Clearly you last did a factory reset on older firmware (your logs go back to long before 4.1.7 was released, and I see an update to 4.1.5 in the logs, so you last did a factory default on 4.1.4 or earlier).
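For readers who want to repeat the alignment check on their own fdisk output, here is a minimal Python sketch. It assumes 512-byte logical sectors (as the fdisk listing above reports), so a partition is 4k-aligned when its start sector is divisible by 8. The function name and the dictionary are illustrative only; the start sectors are copied from the /dev/hdc listing in this thread.

```python
def is_4k_aligned(start_sector: int, sector_size: int = 512) -> bool:
    """Return True if the partition's starting byte offset is a multiple of 4096."""
    return (start_sector * sector_size) % 4096 == 0


# Start sectors taken from the fdisk output for /dev/hdc posted above.
partition_starts = {
    "/dev/hdc1": 2,
    "/dev/hdc2": 4096002,
    "/dev/hdc3": 4608002,
    "/dev/hdc5": 4608003,
}

for name, start in partition_starts.items():
    status = "aligned" if is_4k_aligned(start) else "NOT aligned"
    print(f"{name}: start sector {start} is {status}")
```

None of the four start sectors is divisible by 8, which matches the conclusion that the volume was created without 4k alignment.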
Your logs suggest that disk 2 was a disk that needed replacing as far back as May 2010. The first reallocated sectors for the disk were back in September 2009.
Disk 1 would have needed replacing from July 2014 with the first reallocated sector occurring back in March 2014.
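As a quick illustration of the replacement rule quoted earlier in this thread (replace a disk once its reallocated sector count exceeds 50), the following Python sketch applies that threshold to the two counts reported in the first post. The function and variable names are made up for the example, and the threshold is simply the figure given in this thread, not a value read from the NAS.

```python
REALLOCATED_SECTOR_THRESHOLD = 50  # figure quoted in this thread


def needs_replacing(reallocated_sectors: int,
                    threshold: int = REALLOCATED_SECTOR_THRESHOLD) -> bool:
    """Return True once the reallocated sector count exceeds the threshold."""
    return reallocated_sectors > threshold


# Counts reported in the original post for the two Seagate drives.
reallocated = {"bay 1": 679, "bay 2": 2127}

for bay, count in reallocated.items():
    verdict = "replace" if needs_replacing(count) else "ok for now"
    print(f"{bay}: {count} reallocated sectors -> {verdict}")
```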
Some ISPs block the internal email server of the ReadyNAS so you may need to configure the email alerts to use e.g. a Gmail account.
- forty_green (Aspirant)
Just wanted to thank you (mdgm) for your really constructive support from Down Under. I now have a fit and healthy NV+ (Sparc) once again. The steps I took in the end were as follows:
1. Split the default NTFS partition on the brand new 2TB USB drive (WD My Passport), while mounted from PC/Win7, into two volumes: one of approx. 800GB formatted as exFAT for direct access from the PC, and the rest left as NTFS for dedicated ReadyNAS backup.
2. Attached the USB drive to NV+ and formatted the NTFS partition as EXT3.
3. Changed security mode from "share" to "user" in order to be able to access the USB drive EXT3 partition (this was an unexpected glitch requiring a search of this forum to resolve)
4. Created, ran and verified backup jobs for each share on NV+. Great relief that dodgy HDDs survived this operation.
5. Updated firmware to 4.1.14 to allow 4k sector partition alignment of new HDDs and saved configuration.
6. Powered off the NV+, removed the 2 suspect HDDs, blew out the dust and replaced with 2 new WD Red 2Tb drives.
7. Powered NV+ on and followed basic X-RAID setup.
8. Ran Factory Default from FrontView. Interestingly, device retained its static IP address on my home network - nice.
9. Restored backups from USB drive to NV+ and verified OK.
10. Installed Paragon ExtFS on PC to allow viewing the EXT3 backup volume.
So at this stage I have a fully operational system and no data loss - wonderful. But I'm having trouble with step 11 - setting up email alerts. Using the same settings for my email account that work fine in various email clients, I just get a "Failed to reach SMTP server" message. Could this be a problem with my antivirus program/firewall settings and can mdgm or anyone suggest a workaround? Are there any specifics to describe the traffic that the NV+ attempts to send in case I need to describe the problem to my ISP or mail hosting company?
Thanks again for support with 1-9, the icing on the cake would be help with 11. Cheers.
- mdgm-ntgr (NETGEAR Employee Retired)
8. The NAS picks up an IP address via DHCP after a factory reset. If you assigned a static IP for the NAS on your DHCP server (e.g. your router), then the NAS would pick that up.
Is it e.g. Gmail? If you have two-factor authentication set up, you need to set up an application-specific password for the NAS and allow less secure apps to connect.
- forty_green (Aspirant)
I have various accounts, all of which are POP3 not IMAP. The simplest I tried is BT Internet (incoming and outgoing server mail.btinternet.com; logon just requires user name and password, not secure password authentication). The outgoing SMTP server requires the same password authentication as incoming. The incoming POP3 server uses port 110 and the outgoing server uses port 25. I've also tried to replicate settings for other accounts I have, including one that is hosted using my domain name. I don't use Gmail. I seem to have exhaustively tried all combinations of the mail alert settings in FrontView. As we already discussed, once upon a time I was receiving email alerts but they ceased. This may have coincided with when I switched ISPs from O2 to BT, switched routers or switched firewalls, but I don't know, so I am guessing at causes. For example, is the message transfer agent that sends alerts a specific executable program that I need to grant my firewall permission to access the internet?
- mdgm-ntgr (NETGEAR Employee Retired)
I don't think emails would be blocked by the firewall and I don't think they would be blocked by an ISP provided they are sent via a method that requires authentication.
Have you checked your Spam/Junk Mail folder?
If all else fails you could try e.g. setting up a Gmail account and getting that to forward the emails on.
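One way to narrow down a "Failed to reach SMTP server" message is to test, from a PC on the same network as the NAS, whether a plain TCP connection to the mail server can be opened at all. The Python sketch below is only a rough diagnostic under that assumption: it runs on the PC, not on the NV+, so it shows whether the router/ISP path is open, not what the NAS itself does. The function name is hypothetical, and it does not attempt any login.

```python
import socket


def can_reach_smtp(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


def report(host: str, ports=(25, 465, 587)) -> None:
    """Print reachability for each port (465 and 587 are assumed alternatives)."""
    for port in ports:
        state = "reachable" if can_reach_smtp(host, port) else "not reachable"
        print(f"{host}:{port} is {state}")
```

Calling `report("mail.btinternet.com")` would try port 25, the one mentioned in this thread, plus 465 and 587, which are commonly used for mail submission but are an assumption here. If no port is reachable from the PC either, the problem is more likely on the network side than in the FrontView settings.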