Failed expansion + dead disk = lost data

LauraWerner
Aspirant

A few weeks ago I replaced one of the disks in my ReadyNAS NVX with a larger one. The expansion process seemed to complete successfully: the NAS was working fine and wasn't complaining that it was in a degraded mode.

Fast-forward to today. This morning one of my disks went bad during the weekly volume maintenance, maybe due to a brief power outage yesterday. (I have a UPS, and it appeared to ride out the power outage just fine, but there may have been a surge that it didn't deal with properly.) The NAS appeared to just be in a weird state, so I powered it off cleanly, restarted it, and told it to do another scan.

Now the NAS is telling me that disk #1 (the one I replaced earlier this month) is "spare", not part of the RAID array. It says disk #3 -- the one that appeared to fail this morning -- is just gone. Since there are two "failed" disks, the array is "dead" and my data is gone.
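
For anyone wondering why two missing disks is fatal when one is survivable, here is a minimal sketch. It assumes the volume uses single-parity (RAID-5 style) redundancy, which nothing in this thread confirms for the NVX; the block values and the function names xor_blocks and rebuild_missing are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe):
    """Rebuild the single missing block (None) in a stripe of data + parity.

    Raises ValueError unless exactly one block is missing, because one
    parity block only carries enough information to recover one loss.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"cannot rebuild: {len(missing)} blocks missing")
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    # XOR of all surviving blocks (data and parity) equals the lost block.
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired

# Hypothetical 4-disk stripe: three data blocks plus their XOR parity.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
stripe = data + [xor_blocks(data)]
```

With one entry of the stripe set to None, rebuild_missing returns the original stripe; with two entries set to None it raises ValueError, which is the situation the NAS is reporting as a "dead" array.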

Is there anything I can do at this point? Or any logs I can provide here that would help people know what might have happened? None of the data is irreplaceable, but I'd prefer to get the RAID array working again (with the failed disk replaced) rather than rebuilding it from scratch and reloading all the data.

Thanks!
Message 1 of 8
mdgm-ntgr
NETGEAR Employee Retired

Re: Failed expansion + dead disk = lost data

You could try cloning the failed disk. You may also wish to seek advice from NetGear tech support (see the Online Submission link in my sig).

readysecure1985 wrote:
You could always use Knoppix to clone the drive to a known good drive, and then place it back in the device. Keep in mind that the known good drive should be on the HCL. After successfully cloning with Knoppix, you can then place the good drive in the NAS and power on. If all goes well, it will be as though the drive did not have any issues.

Here is a simple guide to quickly recover a failed drive using dd_rescue.

I often have to deal with pesky failed drives, so here is a quick, simple guide to doing this with a free Linux Live CD and a PC with two SATA connections.
I will be using a Knoppix 6.2 Live CD for this guide. It can be found at http://www.knoppix.net
The dd_rescue command lets you copy data from one drive to another, block for block. This is especially useful for recovering a failed drive. Often when a drive fails, it is still accessible; it has just surpassed the S.M.A.R.T. error threshold. dd_rescue lets you skip the bad sectors and continue cloning the bad drive to a new, healthy drive.

1) Connect your old drive and new drive to your PC
2) Boot up using your Linux live CD
3) Launch a terminal window.
4) Run fdisk -l to make sure the system sees both of the hard drives.
5) Run hdparm -i /dev/sdx on both of the drives to find which drive is your source drive and which drive is your destination drive
6) Once you know which drive is which you can start the clone process.

dd_rescue /dev/sdX /dev/sdY   (where sdX is the source disk and sdY is the destination disk)

7) You will see the process start. Keep an eye on it; the clone may take a few hours to finish, depending on the size of the drive.

There is no notification when the process completes; the transfer just stops and you see the terminal prompt again.

If you see a lot of errors, or the succxfer: counter stops increasing, the drive has been marked faulty by the kernel. At this point, reboot the system and check again which drive is which, as the lettering might switch. Run the dd_rescue command again, but this time with the -r option. This restarts the cloning from the back of the drive and picks up the data that has not been cloned yet.
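
To illustrate the idea behind the guide above (copy block by block and keep going past unreadable blocks), here is a conceptual sketch in Python. It is not how dd_rescue is implemented; rescue_copy and FlakySource are hypothetical names, the 4-byte block size is only for the demo, and unreadable blocks are simply left as zeros for simplicity.

```python
import io

def rescue_copy(src, dst, total_size, block_size=4096):
    """Copy total_size bytes from src to dst, block by block.

    A block that raises OSError on read is recorded and written as zeros,
    so one bad region does not abort the whole copy.
    Returns the list of byte offsets that could not be read.
    """
    bad_offsets = []
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            chunk = src.read(length)
        except OSError:
            bad_offsets.append(offset)
            chunk = b"\x00" * length
        dst.seek(offset)
        dst.write(chunk)
    return bad_offsets

class FlakySource(io.BytesIO):
    """Hypothetical stand-in for a failing disk: reads at offset 8 fail."""
    def read(self, size=-1):
        if self.tell() == 8:
            raise OSError("simulated unreadable sector")
        return super().read(size)

payload = bytes(range(16))
clone = io.BytesIO()
bad = rescue_copy(FlakySource(payload), clone, total_size=16, block_size=4)
```

After the run, bad holds the one offset that could not be read, and clone matches the source everywhere except that block. This is also why a clone of a failing disk can still contain damaged files.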
Message 2 of 8
LauraWerner
Aspirant

Re: Failed expansion + dead disk = lost data

Thanks. I'll give the disk cloning with Knoppix a try, and if that fails I'll try Netgear support. (Though I don't have a support contract and the NAS is old enough that it's not under any sort of support warranty any more.)

Assuming I do get the disk cloned and the array working again, does anyone have tips for getting it into a safe, replicated state, and making _sure_ it's safe? I thought I had it in a safe state before, but I was obviously wrong.

Laura
Message 3 of 8
mdgm-ntgr
NETGEAR Employee Retired

Re: Failed expansion + dead disk = lost data

Note that after cloning the disk, some of your data may be corrupt and unrecoverable (not all sectors on the disk may be readable).

Well, after a resync completes you could check the state in Frontview. It is important to keep a backup in case you encounter a problem that RAID can't protect you against.
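
If you have shell access, a second check besides Frontview is possible. This is a rough sketch that assumes the NAS exposes the standard Linux md status file (/proc/mdstat), which this thread does not confirm for the NVX. The function name degraded_arrays is hypothetical, and SAMPLE is made-up text in the usual mdstat layout, not output from any NAS discussed here.

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array_name: (active, expected)} for arrays missing members.

    Looks for the "[expected/active]" counter that md prints per array,
    e.g. "[4/3]" means 4 members expected but only 3 active.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        counter = re.search(r"\[(\d+)/(\d+)\]", line)
        if counter and current:
            expected, active = int(counter.group(1)), int(counter.group(2))
            if active < expected:
                degraded[current] = (active, expected)
    return degraded

# Made-up sample in the usual /proc/mdstat layout (illustration only).
SAMPLE = """\
md2 : active raid5 sda5[0] sdd5[3] sdb5[1]
      2915732352 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      4193268 blocks super 1.2 [4/4] [UUUU]
"""

report = degraded_arrays(SAMPLE)
```

An empty result after the resync would mean every array has all of its expected members; any entry means that array is still running without full redundancy.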
Message 4 of 8
LauraWerner
Aspirant

Re: Failed expansion + dead disk = lost data

Since you seem knowledgeable, here's a semi-related question for you or anyone else out there. Once I get the NAS recovered (or reinitialized and reloaded), what's a good poor-man's backup solution for the data on it? My current thought is to do something like:
- Get a cheap external RAID enclosure that connects with USB.
- Configure the external enclosure to be one single volume with no redundancy, since that would be, um, redundant.
- Set up the ReadyNAS to back up to the external raid box, probably just when I trigger it manually so I don't need to leave it on all the time.
- If any drives in the external enclosure fail, replace them, reformat the whole thing, and do a fresh, complete backup to it.

Does that make sense? If anyone has a setup like this, which one of the external raid boxes are you using and is it any good?

Thanks
Message 5 of 8
StephenB
Guru

Re: Failed expansion + dead disk = lost data

You don't want RAID-0, because a failure of any backup disk would result in loss of the entire volume.

Personally I use another NAS. You could use an RN102 with 2x3TB in JBOD (giving up redundancy) for 6 TB of backup space. Or go with 2x4TB and get 8 TB.
The RN102 costs ~$200 (actually tigerdirect has one on sale at the moment for $160 in the US). That is a bit more than the cheapest USB enclosures, but it is a bit safer also (since the backup isn't directly connected to the main NAS at all).
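
The capacity figures above are simple arithmetic. Here is a small sketch of it, ignoring filesystem overhead; usable_tb is a hypothetical helper, not part of any NAS tooling.

```python
def usable_tb(disk_sizes_tb, layout):
    """Usable capacity in TB for a few simple layouts (no filesystem overhead)."""
    n = len(disk_sizes_tb)
    smallest = min(disk_sizes_tb)
    if layout == "jbod":    # disks concatenated, no redundancy
        return sum(disk_sizes_tb)
    if layout == "raid0":   # striped across all disks, no redundancy
        return smallest * n
    if layout == "raid1":   # every disk holds a full copy
        return smallest
    raise ValueError(f"unknown layout: {layout}")

two_by_three = usable_tb([3, 3], "jbod")
two_by_four = usable_tb([4, 4], "jbod")
mirrored = usable_tb([4, 4], "raid1")
```

Mirroring the same two 4 TB disks would halve the space, which is the redundancy being given up in exchange for the larger backup volume.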

A similar notion is to install internal (or even external) drives in a desktop PC and back up to them. If your PC has 2 empty disk bays you can get 6-8 TB of backup space w/o any form of new enclosure.

Though if you have a suitable internet connection, the cheapest backup is CrashPlan. $60 a year is cheaper than the disk costs, not even counting the enclosure. Backup/restore speed is of course much slower than local backup, and you are trusting someone else to protect your data. But you can't beat the price... I use it for disaster recovery.
Message 6 of 8
LauraWerner
Aspirant

Re: Failed expansion + dead disk = lost data

I didn't realize the new RN10* boxes were so cheap. Nice. How do the performance and features of the RN102 and RN104 compare to my old NVX? If I need to completely rebuild my NAS, which is looking increasingly likely*, I'm wondering whether I should relegate the old NVX to backup duty and buy a new 104 for my main NAS. The new OS looks nice in the reviews I've read, especially all the btrfs-based snapshot stuff. But I couldn't find any reviews with direct comparisons on performance, reliability, etc. Any opinions?

If I go with crashplan, can anyone confirm that the crashplan linux client works nicely on the NVX? From surfing the forums it looks like it works on all the x86 boxes, so I doubt there would be a problem, but it would be good to confirm.

* The "looking increasingly likely" part is because I'm having no luck recovering my dead disk with dd_rescue. It copies about 30 GB (out of 3 TB) and then bogs down. Oddly, the same thing happened when I tried copying it back to front with "-r". After a reboot I'm having trouble even getting the PC to POST with the bad disk in it, so I think it may be complete toast.

I think my only hope at this point is to figure out whether there's a way to make the NVX recognize the larger disk I'd swapped into the unit last month, the one it's now claiming is "spare" rather than part of the raid array. That's probably a question for Netgear support unless someone here has tips.

Laura
Message 7 of 8
StephenB
Guru

Re: Failed expansion + dead disk = lost data

There is a long thread on CrashPlan here: https://www.readynas.com/forum/viewtopi ... &hilit=nvx At least one poster there says it worked on his NVX.

On the RN104, it is slower than an ultra or pro, but faster than the old NV+ / Duo sparc products. Maybe someone with both an RN104 and an NVX can comment on how they compare. The RN314 performance would be closer to your NVX, though it is quite a bit more expensive. You'd need an RN3xx or RN5xx to host crashplan - the RN1xx doesn't have enough memory. The higher-end units also support a new expansion box.

All the OS6 platforms support the same features - there is no business/consumer feature disparity like the old lines. Almost all of the x86 features are there. One that isn't (which I use) is the ability to send a WoL packet from a backup job. The statistics on the NVX are also better (network stats, plus being able to see all the SMART stats from the GUI). Of course OS6 is new; they are adding some features in every update.
Message 8 of 8