
Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

mattmarlowe
Guide


Yesterday evening, one drive failed on my 6-drive RN51600 running X-RAID RAID 5. I had a spare on hand, so I immediately put in a replacement drive and a rebuild started. This morning, when I came in, the NAS was back in degraded status. Looking at the logs, a different drive failed during the rebuild.

 

Luckily, the NAS is backed up to ReadyCLOUD, and at least 99% of the files are there. The only issue with the latest backups is that, for some reason, ReadyCLOUD doesn't like a lot of Linux text files on a Unix NFS share. It complains that it can't determine the type of the file.

 

Strangely enough, the NAS still seems to be working...but I doubt I can trust any data on it at this point.  I also replaced the 2nd failed disk and it says it is rebuilding, but I doubt that will mean anything. 

 

Is the best step at this point simply to delete the current volume and restore from backup?

 

Model: RN51600|ReadyNAS 516 6-Bay
Message 1 of 9


All Replies
mdgm-ntgr
NETGEAR Employee Retired

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

If you are going to start over it would be much better to do a factory reset to get the system back to a clean state than to simply delete the volume. A factory reset would give you a clean setup on the firmware currently on the flash.

Message 2 of 9
mattmarlowe
Guide

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

Good info -- what's confusing is that the NAS still thinks it can recover. It's doing a resync/rebuild right now after I put in a replacement for the failed 2nd drive, and I'm not sure how that is possible.

 

If a 2nd drive in an xraid raid5 volume fails while the volume is in the process of rebuild from degraded status, shouldn't the volume switch to failed status?

 

Instead, the NAS says it is in degraded mode again and has started another rebuild.

 

In any case, yes, I'm thinking of resetting to factory defaults and then setting up a new volume.  I've just ordered 3 new drives...the most that I can justify at the moment so that we can have newer drives in a new volume.

 

The old volume had 5 Seagate 4TB Enterprise drives and 1 WDC 4TB Gold drive, which wasn't optimal because the fans ran faster to keep the WDC Gold drive cool while the Seagate drives ran hot. Going forward, I don't see any point in getting any better drive than the WDC Red 8TB, which seems to have a great cache/acoustics/operating temp combination at the slight cost of only a 3yr rather than 5yr warranty. Generally, we're removing Seagate drives everywhere here because, while they were good for decades, something has been wrong there for the last 2yrs. WDC has had quite a turnaround. I'll probably be putting Gold drives in all servers and Red drives in NASes.

Message 3 of 9
StephenB
Guru

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive


@mattmarlowe wrote:

Going forward, I don't see any point in getting any better drive than the WDC Red 8TB, which seems to have a great cache/acoustics/operating temp combination at the slight cost of only a 3yr rather than 5yr warranty.


I use the Reds myself (6 TB model in my main RN526x NAS, 8 TB in my RN524x backup), and have had good luck with them.

 

Though I do suggest enterprise-class drives for NAS with more than 8 bays.

Message 4 of 9
TeknoJnky
Hero

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive


@mattmarlowe wrote:

Good info -- what's confusing is that the NAS still thinks it can recover. It's doing a resync/rebuild right now after I put in a replacement for the failed 2nd drive, and I'm not sure how that is possible.

 

If a 2nd drive in an xraid raid5 volume fails while the volume is in the process of rebuild from degraded status, shouldn't the volume switch to failed status?

 

Instead, the NAS says it is in degraded mode again and has started another rebuild.

 


 

Well, yes, possibly the NAS finished the resync and then the other drive failed?

In the system logs, does it indicate that the data resync completed before the 2nd drive failure?

 

Message 5 of 9
mattmarlowe
Guide

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

No, I carefully reviewed the logs, and the timeline was rather clear:

 

Time t - First Disk Initially Fails

Time t+3hrs - I notice what happened and put a new drive in; rebuild time is estimated at somewhere between 3 and 6hrs

Time t+4.5hrs - Second disk fails; no evidence at all that the first disk rebuild had finished, and the system had never left degraded mode

Time t+8hrs - I notice what's going on and see the NAS in degraded mode, replace the 2nd failed drive and see the NAS is in rebuild mode, make my first post in this thread.

Time t+11hrs - First complaints from others that some of the drives in the NAS are corrupted; I see other strange behavior, assume that the NAS has no idea what it is doing, and begin a factory reset of the device and creation of a new volume
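
The timeline above can be checked with a little arithmetic. This is only a sketch using the hours quoted in the post; the variable names are made up for illustration, and the 3-6hr figure is the NAS's own estimate rather than a measured rebuild time.

```python
# Hours relative to the first disk failure (t = 0), as quoted in the timeline.
rebuild_start = 3.0                  # replacement drive inserted at t+3hrs
rebuild_estimate_hours = (3.0, 6.0)  # estimated rebuild duration (best, worst)
second_failure = 4.5                 # second disk fails at t+4.5hrs

earliest_finish = rebuild_start + rebuild_estimate_hours[0]
latest_finish = rebuild_start + rebuild_estimate_hours[1]

# Could the first rebuild have completed before the second disk failed?
finished_before_second_failure = second_failure >= earliest_finish

print(f"rebuild window: t+{earliest_finish}hrs to t+{latest_finish}hrs")
print(f"finished before second failure: {finished_before_second_failure}")
```

Even in the best case the rebuild would not have finished until t+6hrs, so the failure at t+4.5hrs fell inside the rebuild, which matches what the logs showed.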

 

Message 6 of 9
mdgm-ntgr
NETGEAR Employee Retired

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

If you wanted data recovery attempted, you shouldn't have replaced the 2nd failed disk. As RAID-5 protects against only a single disk failure, after a second disk failure it's important to proceed very cautiously.

 

RAID can't protect against too many disk failures which is one reason why backups are so important.
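
A minimal sketch of why single parity survives one lost disk but not two, assuming plain XOR parity over a single stripe. The block contents, the 6-disk layout, and the function names are illustrative only; this is not how the ReadyNAS firmware is actually written.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 6-disk RAID-5: 5 data blocks plus 1 parity block.
data = [bytes([i] * 4) for i in range(1, 6)]
parity = xor_blocks(data)
stripe = data + [parity]


def rebuild(stripe, lost):
    """Rebuild the single block at index `lost` from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)


# One disk lost: the missing block is the XOR of everything that survived.
recovered = rebuild(stripe, 2)

# Two disks lost: the survivors only give the XOR of the two missing blocks,
# which is not enough to tell what either block contained on its own.
survivors = [b for i, b in enumerate(stripe) if i not in (2, 4)]
combined = xor_blocks(survivors)

print(recovered == stripe[2])                        # single loss is recoverable
print(combined == xor_blocks([stripe[2], stripe[4]]))  # only the XOR of the pair
```

With two blocks gone there is one equation and two unknowns, so a second failure during a rebuild leaves the affected stripes unrecoverable from parity alone.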

 

Some feel the risk of dual disk failures is too great for them to continue to use RAID-5.

 

Personally, I disabled X-RAID, destroyed the default volume, created a new RAID-6 one and re-enabled X-RAID in my NAS. I like the additional peace of mind from having some protection against dual disk failures, but I still back up. My data is important to me, so I don't store it on just the one device.

Message 7 of 9
mattmarlowe
Guide

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

Everything has turned out all right.  The NAS was backed up every night to Netgear's ReadyVault which actually had support for backup snapshots so I have plenty choices of how to recover.  Most of the critical data has already been restored.

 

I had planned to restructure the unit in RAID10 next year anyway, so the failure just accelerated my plans by a few months.....If anything, it has made me reconsider RAID6 and which drives I put in the unit.

 

I'm not a big fan of RAID6 or WD Red drives -- most of my datacenter deployments are RAID10 across 10-20 15K RPM SAS disks. But for the small office that is sensitive to noise and cares more about energy efficiency than performance, which is what the ReadyNAS 516 is designed for, it looks like RAID6 w/ WD RED 5900RPM drives is the trusted safe approach.

 

Going forward, I think the next time we upgrade the NAS I'll get the 8 drive 628x model. For random small read/write ops, spindle count will be more important than RPM or RAID level, and that is really the major area of performance concern w/ these units. It would be nice if Netgear also sold them w/ more ECC RAM. It can't be too expensive to put a few more 8GB ECC DIMMs in each chassis.

Message 8 of 9
mdgm-ntgr
NETGEAR Employee Retired

Re: Dreaded Raid5 Nightmare - Another drive fails during rebuild of another failed drive

You can use RAID-10 if you want. You can use RAID-50 if you prefer. The 8-bay supports RAID-60 as well.
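
For comparing those options, here is a rough sketch of usable capacity and guaranteed fault tolerance per RAID level. It assumes equal-size drives, two sub-arrays for RAID-50 and RAID-60, and ignores filesystem overhead; the function name and layout assumptions are illustrative, not taken from NETGEAR documentation.

```python
def raid_summary(level, drives, size_tb):
    """Return (usable_tb, failures_always_survivable) for equal-size drives."""
    if level == "RAID-5":
        return (drives - 1) * size_tb, 1
    if level == "RAID-6":
        return (drives - 2) * size_tb, 2
    if level == "RAID-10":
        # Mirrored pairs: half the raw capacity. A second failure is only
        # survivable if it lands in a different pair, so just 1 is guaranteed.
        return (drives // 2) * size_tb, 1
    if level == "RAID-50":
        # Assumed: two RAID-5 groups striped together, one parity drive each.
        return (drives - 2) * size_tb, 1
    if level == "RAID-60":
        # Assumed: two RAID-6 groups striped together, two parity drives each.
        return (drives - 4) * size_tb, 2
    raise ValueError(f"unknown RAID level: {level}")


# Six 4TB drives, as in the original volume.
for level in ("RAID-5", "RAID-6", "RAID-10", "RAID-50"):
    usable, tolerated = raid_summary(level, 6, 4)
    print(f"{level}: {usable}TB usable, survives any {tolerated} failure(s)")

# RAID-60 needs two groups of at least four drives, hence the 8-bay.
print("RAID-60:", raid_summary("RAID-60", 8, 4))
```

On six drives, RAID-6 is the only one of these layouts that is guaranteed to survive any two simultaneous failures, at the cost of one more drive's worth of capacity than RAID-5.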

Message 9 of 9