Forum Discussion

rcarr6502
May 19, 2018
Solved

Advice on mixing and interleaving disks, RAID 10?

I'm considering buying a ReadyNAS 628X and configuring it for RAID 10.  To increase the reliability of the array, I've been thinking of using two drive vendors, populating it with HGST NAS disks and ...
  • StephenB
    May 19, 2018

    rcarr6502 wrote:

     

    * RAID 5/6 is most dangerous when the array is degraded and rebuilding.  That's exactly when the 2nd disk has failed in my two experiences.  RAID 5/6 accelerates failure of remaining disks in a way that RAID-10 does not.

     

    * RAID-10 will rebuild its array much more quickly -- since it just has to copy missing data from a disk in one mirror to the other.

     

    * RAID-10 can survive a minimum of 1 disk failure up to N/2 disk failures.  It's true that if one disk in both (RAID-10) mirrors fail simultaneously, the array is dead.  But the chance that a second disk failure will take out the array should be 1/(n - 1)...

     

     

    made me strongly consider RAID-10, especially the observation that "In raid 6, during a rebuild it has to read every drive and recalculate the missing data.  That means you have to read [potentially terabytes and terabytes] of data to rebuild that and hope a URE doesn't occur."

     


    I'll start by saying the analysis in the "death of raid" articles seems too simplistic to me.  They make it sound like every disk read is like playing Russian Roulette - one chance in 10**14 of the URE bullet exploding your data.  I don't think UREs are like that.  Disk failures have a cause, they aren't just random events.  When a disk fails, the chance of URE rises to 100%.  When it's starting to fail, it rises very quickly from 0 to a much larger value.  It's not a static 1 in 10**14 crap shoot.
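
    For reference, the arithmetic behind that "Russian Roulette" picture can be sketched as below. It assumes exactly the model being questioned here: every bit read fails independently with a fixed probability of 1 in 10**14. The function name and the sample read sizes are illustrative only and do not come from any vendor tool.

```python
import math


def p_at_least_one_ure(bytes_read: float, ure_rate_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading `bytes_read` bytes.

    Assumption: each bit fails independently at a constant rate, which is
    the simplistic model used in the "death of RAID" articles.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)**bits, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))


if __name__ == "__main__":
    # Illustrative read sizes in decimal terabytes.
    for tb in (1, 4, 12, 40):
        print(f"{tb:>3} TB read -> {p_at_least_one_ure(tb * 1e12):.1%}")
```

    Under that assumption the odds climb quickly with the amount of data read, which is why those articles sound so alarming. If UREs instead cluster on disks that are already failing, a constant per-bit rate is the wrong input and the numbers above say little about a healthy disk.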

     

    I also don't think RAID 5/6 resync accelerates failures, though it is true that rebuilding the array requires either reading or writing every sector in it.   More on that below.

     

    Rebuilding RAID-10 is easier because it only requires mirroring one existing disk.  The array fails if that existing disk fails during resync - but the 1/(n-1) probability is misleading (and in my opinion incorrect).   Your "accelerate failures" concept is grounded in the idea that heavy disk I/O will create a failure in one of the remaining disks.  With RAID-10 resync, the source disk (and of course the new mirror) are the only two disks that experience heavy I/O.  So if that idea is correct, then the disk most likely to fail is in fact the source disk of the mirror.
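
    The 1/(n-1) figure only holds if every surviving disk is equally likely to be the next one to fail. A minimal sketch of how the number shifts once the mirror source is weighted more heavily follows; `partner_weight` is a hypothetical knob for "how much more likely the source disk is to fail during resync", not a measured value.

```python
from fractions import Fraction


def fatal_second_failure_odds(n_disks: int, partner_weight: int = 1) -> Fraction:
    """Chance that the next disk failure kills a RAID-10 array.

    One disk has already failed. The array dies only if the next failure
    hits that disk's mirror partner. Every other survivor has weight 1;
    the partner has weight `partner_weight` (an illustrative assumption).
    """
    if n_disks < 4 or n_disks % 2:
        raise ValueError("RAID-10 needs an even number of disks, at least 4")
    if partner_weight < 1:
        raise ValueError("partner_weight must be at least 1")
    other_survivors = n_disks - 2  # everyone except the failed disk and its partner
    return Fraction(partner_weight, partner_weight + other_survivors)


if __name__ == "__main__":
    print(fatal_second_failure_odds(8))     # equal weights: 1/(n-1)
    print(fatal_second_failure_odds(8, 6))  # source disk assumed 6x more likely to fail
```

    With equal weights the function reproduces 1/(n-1). Any weight above 1 pushes the odds higher, which is the objection being made: if heavy resync I/O really does trigger failures, the disk carrying that I/O is the one whose loss is fatal.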

     

    I'm not really sold on that concept though.  I think that when disks begin to fail, sectors silently become unreadable or unwriteable.  But (at least with my data) most of the sectors are only rarely read or written - so those failures aren't detected right away.  Then when a disk is detected as having failed, you replace it - and then discover there are other failures you didn't know about when the raid resync reads (or writes) everything.

     

    So I don't think the RAID resync creates the failures - I think most of the time it uncovers failures that have already occurred.  I run the scheduled maintenance functions to try to detect those failures early.

     

    Another observation - RAID-6 rebuild can survive UREs when you replace a single disk (because it has dual redundancy).  Where it breaks down is if you have two or more UREs in the same stripe. 
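
    As a rough illustration of that stripe rule: while a single disk is being rebuilt, RAID-6 still has one spare parity per stripe, so a stripe is lost only when two or more of the surviving disks return a read error in it. The sketch below uses a hypothetical list of (disk, stripe) error locations; it is not how any particular RAID implementation reports errors.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def unrecoverable_stripes(ures: Iterable[Tuple[int, int]]) -> List[int]:
    """Return stripes that cannot be rebuilt during a single-disk RAID-6 rebuild.

    `ures` holds (disk_id, stripe_id) pairs for read errors seen on the
    surviving disks. One bad chunk per stripe is covered by the second
    parity; two or more bad chunks on different disks are not.
    """
    bad_disks_by_stripe = defaultdict(set)
    for disk_id, stripe_id in ures:
        # A set means repeated errors on the same disk count as one bad chunk.
        bad_disks_by_stripe[stripe_id].add(disk_id)
    return sorted(
        stripe for stripe, disks in bad_disks_by_stripe.items() if len(disks) >= 2
    )


if __name__ == "__main__":
    # Stripe 10 has errors on two different disks; stripe 11 has only one.
    print(unrecoverable_stripes([(1, 10), (2, 11), (3, 10)]))
```

    The same check applied to RAID-5 would need a threshold of one bad chunk instead of two, since a degraded RAID-5 array has no redundancy left.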

     

    In my own experience, the odds of losing a RAID-5 array during resync are fairly small - certainly it happens sometimes, but not that often.  In fact, I've never lost one that way.  But there's always some chance your RAID array will fail, no matter what RAID mode you use - the defense against that is to have backups.  

     

    But if I were trying to solve the issues you are worried about, I think I'd go with multiple RAID-1 volumes instead of RAID-10.  The resync process is the same with multiple RAID-1 as it is with RAID-10.  But recovering data from RAID-1 is much easier than recovering data from any of the other RAID modes.  Plus it'd be much easier to increase storage (since you'd only need to offload and restore 1/4 of the data).

     

     

    Retired_Member wrote:

    Used WD in the past, but switched to HGST, which show 10 to 15% better performance. They are a bit warmer during standard operation, but give the higher throughput and are more reliable to my experience. Well, do not mix them with WD, though.

    Again, it's fine to mix them - you just won't get the performance gain.  You could of course mix the HGST with other enterprise class drives, and then you would get the performance improvement.

     

    HGST drives have a good reputation, and as far as I can tell, the folks here who've used them are quite happy with them.  Personally I've found the WD Reds to be quite reliable - one or two failures since I started using them back in 2012.  At the moment I have 14 in service. 

     

     

     

     

     

     
