Platypus69 (Luminary)
Jun 13, 2018
Why does ReadyNAS recommend not to use RAID 5 with 8 disks
So I have purchased a new RN428. Populated it with 8 x 12TB IronWolf HDDs (supported via HCL).
So the RN428 during boot auto-created a RAID 6 volume.
I am really after RAID 5.
So when I destroyed the RAID 6 volume and went to create a RAID 5 volume, the ReadyNAS warned me that RAID 5 is not recommended with 8 HDDs.
Can anyone tell me why?
Is there a technical reason?
Or mathematically a substantially increased chance of multiple disk failure?
TIA
(PS: I understand the difference between RAID 5 and RAID 6.)
5 Replies
- StephenB (Guru - Experienced User)
Platypus69 wrote:
Is there a technical reason?
Or mathematically a substantially increased chance of multiple disk failure?
The latter. RAID resync will take quite a while with 8x12TB drives, since every sector in the data volume needs to be either read or written. That does increase the chance that a second disk will fail during the resync (after the first failed disk is replaced).
Netgear's view is reasonable, but clearly this is a topic where people will disagree. If you don't have a backup plan in place then I would certainly recommend RAID-6. But if you do have a backup plan (and you really should), then it's just balancing the RAID overhead against the higher chance of reloading the data from backup.
- Platypus69 (Luminary)
Thanks!
So I found this article interesting: https://www.zdnet.com/article/why-raid-6-stops-working-in-2019/
And in particular:
"The crux of the problem: RAID arrays are groups of disks with special logic in the controller that stores the data with extra bits so the loss of 1 or 2 disks won't destroy the information (I'm speaking of RAID levels 5 and 6, not 0, 1 or 10). The extra bits - parity - enable the lost data to be reconstructed by reading all the data off the remaining disks and writing to a replacement disk.
The problem with RAID 5 is that disk drives have read errors. SATA drives are commonly specified with an unrecoverable read error rate (URE) of 10^14. Which means that once every 200,000,000 sectors, the disk will not be able to read a sector. 2 hundred million sectors is about 12 terabytes. When a drive fails in a 7 drive, 2 TB SATA disk RAID 5, you'll have 6 remaining 2 TB drives. As the RAID controller is reconstructing the data it is very likely it will see an URE. At that point the RAID reconstruction stops.
Here's the math: (1 - 1 /(2.4 x 10^10)) ^ (2.3 x 10^10) = 0.3835
You have a 62% chance of data loss due to an uncorrectable read error on a 7 drive RAID with one failed disk, assuming a 10^14 read error rate and ~23 billion sectors in 12 TB. Feeling lucky?"
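The quoted arithmetic can be checked with a short Python sketch. It assumes the article's model as stated: every sector read fails independently with a fixed probability of 1 in 2.4 x 10^10, and a rebuild reads about 2.3 x 10^10 sectors. The function name `p_no_ure` and the 8 x 12 TB extension (7 surviving disks read in full) are my own illustration, not something the article or Netgear provides.

```python
import math


def p_no_ure(sectors_read: float, sectors_per_error: float) -> float:
    """Probability of reading `sectors_read` sectors without a single URE.

    Assumes the article's fixed-rate model: each sector read fails
    independently with probability 1 / sectors_per_error.
    """
    # log1p keeps precision for the very small per-sector failure probability.
    return math.exp(sectors_read * math.log1p(-1.0 / sectors_per_error))


# Figures taken directly from the quoted article.
article = p_no_ure(2.3e10, 2.4e10)
print(f"rebuild succeeds: {article:.4f}, data loss: {1 - article:.0%}")

# Hypothetical extension to an 8 x 12 TB RAID 5: 7 surviving disks are read
# in full, using the article's figure of ~2.3e10 sectors per 12 TB.
eight_bay = p_no_ure(7 * 2.3e10, 2.4e10)
print(f"8 x 12 TB under the same model: {eight_bay:.4f}")
```

The first line reproduces the article's 0.3835, so the 62% figure follows from the model's assumptions; whether a fixed per-sector error rate is realistic is a separate question.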
But is this true? Sure from a mathematical perspective, but:
- Does RAID reconstruction stop if a URE occurs? I would have thought it retries a number of times and then "continues on error". Can anyone at Netgear confirm the behaviour in this case?
- Doesn't scrubbing help with read errors / bit rot? I test disks monthly. And scrub volumes weekly. Should this not help with preventing/mitigating UREs?
- Modern disks have a higher throughput than these old articles assumed. So the time taken to rebuild a 12TB HDD will be much quicker than what they were predicting, no?
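For the throughput question, here is a rough lower bound on the time to rewrite one 12TB disk, as a Python sketch. The throughput values are placeholders I picked for illustration, not measured figures for the IronWolf drives or the RN428, and a real resync will be slower because of parity computation and any other load on the NAS.

```python
def min_rebuild_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Lower bound on the hours needed to write a whole disk once.

    capacity_tb uses decimal terabytes (10^12 bytes), as drive vendors do.
    throughput_mb_s is an assumed sustained rate in MB/s (10^6 bytes/s).
    """
    capacity_bytes = capacity_tb * 1e12
    seconds = capacity_bytes / (throughput_mb_s * 1e6)
    return seconds / 3600


# Hypothetical sustained throughputs, for comparison only.
for mb_s in (100, 150, 200, 250):
    print(f"{mb_s} MB/s -> at least {min_rebuild_hours(12, mb_s):.1f} hours")
```

Even at the optimistic end this is many hours per rebuild, which is the window in which a second failure would matter for RAID 5.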
In my case I only ever add files, never modify or delete. Think videos and photos.
It would be good to get Netgear's perspective and experience on this...
- StephenB (Guru - Experienced User)
Platypus69 wrote:
But is this true?
I don't work for Netgear, but it's obviously not true. 7x2TB arrays are quite small by current standards, and it's obvious to anyone here that when you resync an array of that size you aren't facing a 62% chance of seeing the volume fail.
The core of their analysis is the assumption that every read has a fixed 1 in 10^14 chance of failing - so you are playing Russian roulette with your data every time you read a sector.
But I don't think UREs have a fixed probability. Healthy new drives have a rate of 0; as a drive begins to fail, the rate climbs very rapidly.
I also run the maintenance functions regularly - and I do think that running them will find failing disks more quickly. But bitrot is a very different beast from UREs. Bitrot is when X was written but Y is read back. A URE is when the read fails (returning nothing).