Forum Discussion
InteXX (Luminary)
Dec 29, 2014
XRAID2 vs RAID6, etc.
I've got an RN104 on the way and I'd like to better understand my configuration options.
To start with, I have to admit that I'm planning on using 3x2TB Caviar Greens (WD20EARX); I know that they're not on the HCL. But they're what I have right now, and I've just finished plowing 3 grand into a new server setup. The cash just isn't there to buy all new drives right now. I'll probably go with the 4TB Reds (WD40EFRX) and swap them in one at a time over the next couple of months or so, as budget allows.
That said, I'm wondering about XRAID2 and its expansion capabilities. It's my understanding that the volume size will increase automatically after a sync and a reboot. Is this correct?
I like XRAID2's auto-expansion feature, but I also like RAID6's double-parity feature. Am I going to have to choose between the two? For now I only have the three drives, which of course leaves RAID6 out anyway.
So let's say I go ahead and order up my first Red and get started with 3 Greens and 1 Red. I go with XRAID2. Do I then forfeit the double parity option? Can we mix-n-match the drive sizes like that? Also, I really like BTRFS' CoW and BitRot protection features. I'd hate to miss out on those.
What's my best plan here? Which way to go?
Thanks,
Jeff Bowman
Fairbanks, Alaska
15 Replies
- StephenB (Guru - Experienced User)
You can add a 4 TB Red, but you will waste 2 TB of its capacity until you have two of them installed. BTW, with RAID-6 you will waste 2 TB of capacity on the 4 TB drives until all 4 Reds are installed.
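If it helps to see where those numbers come from, here's a rough Python model of X-RAID2's layered expansion with single redundancy (my approximation for illustration, not NETGEAR's actual algorithm):

# Approximate usable capacity under X-RAID2 (single redundancy).
# Model (an assumption, not NETGEAR's published algorithm): drives are
# split into size "tiers", and each tier spanning 2+ drives forms its
# own redundant layer (RAID-1 for 2 drives, RAID-5 for 3+).
def xraid2_capacity_tb(sizes):
    capacity, floor = 0, 0
    for tier in sorted(set(sizes)):
        drives_in_layer = sum(1 for s in sizes if s >= tier)
        height = tier - floor                            # TB this tier adds per drive
        if drives_in_layer >= 2:
            capacity += (drives_in_layer - 1) * height   # one drive's worth of parity
        floor = tier
    return capacity

print(xraid2_capacity_tb([2, 2, 2]))     # 4  -- your 3x2TB Greens today
print(xraid2_capacity_tb([2, 2, 2, 4]))  # 6  -- first Red: its extra 2TB sits idle
print(xraid2_capacity_tb([2, 2, 4, 4]))  # 8  -- second Red unlocks a 2TB RAID-1 layer
print(xraid2_capacity_tb([4, 4, 4, 4]))  # 12 -- all four Reds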
CoW and Bit Rot protection are available in both xraid2 and RAID-6.
RAID-6 works out better on the RN300 and RN500 series. There is a performance hit for disk writes because the two parity blocks need to be computed, and the slower CPU in the RN100 makes that more significant. If you want to use RAID6 anyway, then change to flexraid before you install the fourth drive. There will then be an option to use the fourth disk for redundancy instead of additional space. After you get the RAID-6 volume synced, you can convert back to XRAID2 (and you won't lose the dual redundancy). Expansion with RAID-6 requires all 4 drives to be upgraded (switching back to xraid2 doesn't change that).
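To show where that write penalty comes from, here's an illustrative Python sketch of the two parity computations (a simplification, not the actual md driver code). RAID-5 needs one XOR pass per stripe; RAID-6 adds a second syndrome built from Galois-field multiplies, which is the extra work a slower CPU feels:

def gf_mul(a, b):
    # Multiply in GF(2^8) with polynomial 0x11D (the field Linux md RAID-6 uses)
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D
        b >>= 1
    return p

def raid5_parity(blocks):
    # P = D0 xor D1 xor ... : one cheap XOR pass
    p = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            p[i] ^= byte
    return bytes(p)

def raid6_q_parity(blocks):
    # Q = sum of g^i * Di over GF(2^8): an extra multiply for every byte written
    q = bytearray(len(blocks[0]))
    g_pow = 1                                  # g^0, with generator g = 2
    for block in blocks:
        for i, byte in enumerate(block):
            q[i] ^= gf_mul(g_pow, byte)
        g_pow = gf_mul(g_pow, 2)
    return bytes(q)

stripe = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]   # three toy data blocks
print(raid5_parity(stripe).hex())    # the P block (RAID-5 stops here)
print(raid6_q_parity(stripe).hex())  # the Q block (RAID-6's extra cost)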
You need backups anyway, even with RAID-6 protection. So personally I'd go with single redundancy.
BTW, if you are planning to use the green drives in the NAS for the long term, you might want to set the head parking threshold (using the procedure you asked about here: viewtopic.php?f=11&t=74601). Opinions on whether this really matters vary. On paper it voids the WDC warranty. I've had some green drives in my NV+ for almost 5 years, and chose not to set the parking threshold. They continue to perform with no problems, with load cycle counts over 500K.
- InteXX (Luminary)
Stephen, you really know your stuff on this. I'm really impressed with this community forum--everyone seems very helpful. Thank you :D
OK, my RN104 just arrived and I've set it to chasing rabbits!
Strange, though... it automatically selected a RAID5 under XRAID for me. I thought RAID5/RAID6 were FLEXRAID-only stuff? And how did I miss the option to choose whether or not to use XRAID2? The printed Systems Installation Guide from the box promises XRAID2, but the browser-based admin tool shows only XRAID.
So I assume this means that I'll have to do a factory default when I slide in that 4th drive, yes? If so I guess I'd better go ahead and order it up pronto.
StephenB wrote: You can add a 4 TB red, but you will waste 2 TB of its capacity until you have two of them installed. BTW, with RAID-6 you will waste 2 TB of capacity on the 4 TB drives until all 4 reds are installed.
Ah, good tip, thanks. That's OK about the capacity loss though--I don't have that much data to begin with. But I am concerned about the auto-resizing. As I swap in the larger drives over time, will I have to do anything besides sit and wait for the volume to magically get bigger?
StephenB wrote: CoW and Bit Rot protection are available in both xraid2 and RAID-6.
As noted above, things seem to point to the two as being synonymous; I must be misunderstanding something somewhere. So XRAID2 has CoW and BitRot--how 'bout double parity (in case of failure during a rebuild)?
StephenB wrote: RAID-6 works out better on the RN300 and RN500 series. There is a performance hit for disk writes because the two parity blocks need to be computed, and the slower CPU in the RN100 makes that more significant.
I'm using the device for backup storage only, so write speed performance isn't all that important to me. All the same, I appreciate the heads-up.
StephenB wrote: If you want to use RAID6 anyway, then change to flexraid before you install the fourth drive.
But doesn't FLEXRAID preclude auto-resizable volumes? And I'm still cornfused about how I managed to get a RAID5 going under XRAID. (Should be XRAID2?)
From reading some FAQs at the website, I was under the impression that it went something like this:

                   BTRFS
          -----------------------
           FLEX-RAID     X-RAID2
           ---------     -------
RAID5/6        X
Vol Resize                  X

No?
StephenB wrote: There will then be an option to use the fourth disk for redundancy instead of additional space. After you get the RAID-6 volume synced, you can convert back to XRAID2 (and you won't lose the dual redundancy).
Oh, OK--I think I get what you're saying.
- Switch to FLEXRAID until I have the 4th drive in hand (didn't know you could switch on the fly--thought it required a reset)
- Install the drive, let it sync
- Switch back to XRAID (should be XRAID2?)
- Live happily ever after
Am I reading you right?
StephenB wrote: Expansion with RAID-6 requires all 4 drives to be upgraded (switching back to xraid2 doesn't change that).
So, just to clarify... I can have a RAID6 with the 3x2TB/1x4TB mismatched sizes and everything will work, but the volume won't get expanded to the drives' full combined capacity until I get all 4x4TBs installed? (That was also another auto-expand question, snuck in there sideways.)
StephenB wrote: You need backups anyway, even with RAID-6 protection.
This seems to be a recurring theme I'm seeing here on the boards. I thought the device itself was the backup. That's why I bought it, anyway, with that in mind. Am I missing something?
StephenB wrote: So personally I'd go with single redundancy.
I get a really nice warm fuzzy from the idea of protection against failure during a rebuild. That's a biggie for me.
StephenB wrote: BTW, if you are planning to use the green drives in the NAS for the long term, you might want to set the head parking threshold
Probably not... I want to get that volume size up asap, even if I'm not using a lot of it at first.
StephenB wrote: Opinions on whether this really matters vary.
I'm presently a man without opinions. ReadyNAS opinions, that is ;)
StephenB wrote: On paper it voids the WDC warranty. I've had some green drives in my NV+ for almost 5 years, and chose not to set the parking threshold. They continue to perform with no problems with load cycle counts over 500K.
Well that's good enough for me then. They're staying where they're at.
Looks like I wrote a book here. As usual, one answer spawns forty more questions. Your patience with my ignorance in these matters is appreciated.
Thanks,
Jeff Bowman
Fairbanks, Alaska
- mdgm-ntgr (NETGEAR Employee Retired)
If your PC is holding the primary copy of the data, then the copy on your NAS is the backup. However the NAS in itself is not a backup. If you store data only on the NAS then it is not backed up.
RAID provides redundancy/high-availability not backup.
Your NAS uses X-RAID2 but the UI calls it X-RAID.
By default an X-RAID volume with 3 or more disks of equal size (or where only the largest disk has a different capacity) will use RAID-5.
If you switch to Flex-RAID (disable X-RAID) you can choose to have the next disk added convert the volume to RAID-6, then after that re-enable X-RAID again.
Alternatively you could backup your data, switch to Flex-RAID, delete the volume and create a new RAID-6 volume (note a RAID-6 volume requires at least four disks), then re-enable X-RAID.
- StephenB (Guru - Experienced User)
mdgm got most of your questions, but missed this one:
Normally a file system (ext, btrfs, ntfs, etc.) is built on top of a disk partition. RAID sits between the file system and multiple physical disks, so it acts like a "logical" disk drive. RAID applies parity protection (single or double) to the physical disks, and can be used with any file system. In fact RAID knows nothing about the file system structure. That is why rebuilding an empty RAID volume takes the same amount of time as a full one. Normally RAID repair is engaged only when there is a read error on the physical disk, or when a disk is replaced.
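A toy sketch of that point (Python, illustrative only, not real md code): a RAID-5 rebuild regenerates every stripe from the surviving members by XOR, and never consults the file system, which is why a nearly-empty volume rebuilds no faster than a full one.

import os

def rebuild_disk(disks, failed):
    # Reconstruct a failed RAID-5 member: XOR across all survivors.
    # The loop walks every block, whether it holds files or free space.
    survivors = [d for i, d in enumerate(disks) if i != failed]
    rebuilt = bytearray(len(disks[failed]))
    for d in survivors:
        for i, byte in enumerate(d):
            rebuilt[i] ^= byte
    return bytes(rebuilt)

disk0 = os.urandom(16)                               # "data" or leftover garbage
disk1 = os.urandom(16)
parity = bytes(a ^ b for a, b in zip(disk0, disk1))  # XOR parity member

assert rebuild_disk([disk0, disk1, parity], 1) == disk1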
InteXX wrote: StephenB wrote: CoW and Bit Rot protection are available in both xraid2 and RAID-6.
As noted above, things seem to point to the two as being synonymous; I must be misunderstanding something somewhere. So XRAID2 has CoW and BitRot--how 'bout double parity (in case of failure during a rebuild)?
CoW is an attribute of the btrfs file system. Netgear has chosen not to explicitly control CoW in their UI settings; it is enabled/disabled implicitly as part of the snapshots and bitrot protection features. But CoW can be used w/o RAID (it is still available even if you chose JBOD on the RN104). It is the basis of the snapshot feature.
The Bitrot feature in OS 6.2.x is unique to Netgear. They haven't given a lot of details, but there are some hints here and there. btrfs includes optional checksums which it uses to verify file integrity. Apparently when the btrfs checksum fails, Netgear has some proprietary software which attempts to do a RAID repair (even though there was no read error). If one of the physical disks somehow got bad (but readable) data, then the bitrot repair should be able to restore the original data. So the Bitrot protection couples RAID to the file system checksum feature - it is not completely independent as I described above. Bitrot protection requires RAID, but it should work with both flexraid and xraid. Netgear chose to enable CoW when you check the bitrot protection box, but they haven't explained why. Perhaps they also look at snapshots when the checksum fails? They haven't said.
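Here's my guess at the shape of that repair, as a Python sketch (an assumption pieced together from the hints, not NETGEAR's actual code): when the file checksum fails, try regenerating each block in the stripe from RAID-5 parity and keep whichever substitution makes the checksum pass.

import zlib

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def try_bitrot_repair(data_blocks, parity, checksum):
    # Return repaired blocks if substituting one parity-reconstructed
    # block makes the checksum pass; None if no single repair works.
    if zlib.crc32(b"".join(data_blocks)) == checksum:
        return data_blocks                            # nothing rotted
    for i in range(len(data_blocks)):
        others = data_blocks[:i] + data_blocks[i + 1:]
        candidate = xor_blocks(others + [parity])     # RAID-5 reconstruction
        trial = data_blocks[:i] + [candidate] + data_blocks[i + 1:]
        if zlib.crc32(b"".join(trial)) == checksum:
            return trial
    return None                   # e.g. two rotted blocks in one stripe

good = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(good)
csum = zlib.crc32(b"".join(good))

rotted = [b"AAAA", b"BxBB", b"CCCC"]             # silent single-block rot
print(try_bitrot_repair(rotted, parity, csum))   # recovers b"BBBB"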
If I am reading the tea leaves correctly on what Netgear has done, then there are scenarios where the bitrot protection approach wouldn't work. However, it seems like a useful idea to me, and given the high priority you are giving data integrity, you should enable it.
To make things more confusing, BTRFS itself has an experimental mode where a variant of RAID protection is fully integrated into the file system (and not layered at all). This experimental mode has its own bitrot protection feature. However, Netgear isn't using that mode.
- InteXX (Luminary)
mdgm wrote: If your PC is holding the primary copy of the data, then the copy on your NAS is the backup. However the NAS in itself is not a backup. If you store data only on the NAS then it is not backed up.
RAID provides redundancy/high-availability not backup.
Your NAS uses X-RAID2 but the UI calls it X-RAID.
By default an X-RAID volume with 3 or more disks of equal size (or where only the largest disk has a different capacity) will use RAID-5.
If you switch to Flex-RAID (disable X-RAID) you can choose to have the next disk added convert the volume to RAID-6, then after that re-enable X-RAID again.
Alternatively you could backup your data, switch to Flex-RAID, delete the volume and create a new RAID-6 volume (note a RAID-6 volume requires at least four disks), then re-enable X-RAID.
You managed to condense my book into a pamphlet. Good man :-)
I have my plan now.
Thanks,
Jeff Bowman
Fairbanks, Alaska
- InteXX (Luminary)
StephenB wrote: mdgm got most of your questions, but missed this one
Stephen, you truly are a ReadyNAS Maniac :)
I can see clearly now, the rain is gone.
StephenB wrote: there are scenarios where the bitrot protection approach wouldn't work
You've got me curious. Care to discuss?
Thanks,
Jeff Bowman
Fairbanks, Alaska
- StephenB (Guru - Experienced User)
Here's my thinking (rather long-winded I'm afraid...)
InteXX wrote: StephenB wrote: there are scenarios where the bitrot protection approach wouldn't work
You've got me curious. Care to discuss?
Let's start with the causes of bitrot. The disk itself has its own CRC codes saved for each block, which it verifies. It's pretty unlikely that a data block will simply change and still have a valid CRC (though given the amount of disk storage in use, I wouldn't claim it has never happened anywhere). So let's keep "spontaneous" bitrot on the list.
Other potential causes include:
(a) a network error delivered wrong data to the NAS (with a valid ethernet CRC). No reason to think that the network can't rot data too.
(b) a memory failure in the NAS or the disk's own internal cache that created an error in the queued block before it was written.
(c) a failure of some kind prevented the blocks from being written. For instance a power failure, or a disk controller bug.
(d) one or more disks with read failures are cloned, so there is bad data on the disks, but it all can be read.
Of course there could be more causes I am missing.
Some of these might not fit your definition of bitrot, but they all have the property that readable but wrong data ends up on the disk drive, so without knowing the cause they are indistinguishable from each other. And the disk cloning scenario is quite common with RAID array repair, so including it in the mix is worthwhile even if it isn't actually bitrot.
I'll assume RAID-5 but the reasoning easily extends to RAID6.
Taking these cases in turn:
(a) If the data the NAS was told to write was rotted by the network before it even reached the NAS, then clearly there isn't anything the NAS can do on its own to correct it. So the method surely fails in that case.
(b) If a memory failure in the NAS occurred before the parity block was recomputed, then there will be wrong data on the disk - but the parity can't help, because the parity was also computed using the same wrong data. If the memory failure corrupts the checksum, then the method also fails. But if the failure occurred after the parity block was computed and the checksum is valid, then the method will likely work.
(c) If the failure that prevented blocks from being written affects two or more blocks in the same group, then RAID repair can't help you either. If the correct checksum isn't written, then it also fails. If it affects just a single block in the group, then it likely works.
(d) In the cloned case, there is only one block in each group on the cloned disk. If the other blocks in each group (on the other disks) are intact, then the raid repair should work. If there is a read error or bitrot on another block in the group (e.g., a different disk is also ailing) then it will fail. Multiple disk failures do happen; there are plenty of posts here from users who had it happen to them.
If you needed to clone more than one disk, there are bad blocks on multiple disks. With luck, the bad blocks will be in different groups. In that case the method works. But if they aren't in different groups, then the technique fails.
In all cloned cases if a bad block holds the checksum the method fails to repair any other bad blocks covered by that checksum.
(e) Going back to the original "spontaneous" rot case (where the data appears to change on its own on the disk drive), the technique requires that
-there is no more than one bad (but readable) data block in the group, and all the other blocks (including parity) are readable and correct.
-the file checksum is correct.
The technique will fail when these conditions aren't met. For instance:
- if two blocks in the group both rotted.
- if one block and the checksum are rotted.
- if only one block is rotted, but there is a read error on another block in the group.
- if you do a raid scrub before you detect the checksum error, the scrub will "fix" the parity block using the rotted data, so the method will fail when the checksum error is detected later.
So my conclusions -
-most of the scenarios above sound pretty improbable (disk cloning aside, which is intentional). The one we are most likely to see in practice is a failure of one or more blocks to be written in the first place. If you had truly massive amounts of data (e.g., the amount Google or Facebook have), then seeing some of the other causes becomes more plausible.
-corollary: if the most probable cause is queued blocks not being written, then clean shutdowns lower the odds of bitrot substantially, and using a UPS helps ensure clean shutdowns.
-If you do need to clone disks as part of a RAID repair, you should restore as many files from other copies (e.g., backups) as you can after the volume is restored.
-Using a btrfs checksum to trigger raid repair will work sometimes, but will fail other times. It seems unlikely to make things worse in the cases where it fails. It's worth enabling, because reducing the amount of corruption in the volume is a good thing, even if you can't do it 100%.
- InteXX (Luminary)
StephenB wrote: Here's my thinking (rather long winded I'm afraid...)
Not at all! The detail is refreshing.
I've long felt that any sort of copy/move job should include a final file-level checksum taken at both the source and the target and compared before the job is marked as successful. Given the disparity between source/target OSs, however, I realize this would be impractical. It's just a wish for the wish-list.
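Something like this, sketched in Python (a hypothetical helper, not a feature of any existing copy tool): hash the source, copy, re-hash the destination, and only declare success if they match.

import hashlib, shutil

def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def verified_copy(src, dst):
    src_hash = sha256_of(src)
    shutil.copyfile(src, dst)
    if sha256_of(dst) != src_hash:        # re-read the target and compare
        raise IOError(f"checksum mismatch copying {src} -> {dst}")
    return src_hash

# verified_copy("backup.vhd", "/mnt/nas/backups/backup.vhd")  # hypothetical paths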
FWIW if I can manage to get TeraCopy working on my machine, I may have found a way to accomplish this and deal with at least the network bitrot problem. That is, assuming the promoted feature works as advertised.
Good call on the UPS. That's a must.
Cloning: maybe a separate database of CRCs for a comparison after sync? That said, I'm not sure what one would do in the event of a bad copy (assuming no other backup).
Whew! This is a tough problem, isn't it?
Thanks,
Jeff Bowman
Fairbanks, Alaska
- StephenB (Guru - Experienced User)
Teracopy can verify, and I usually have it do that when I am copying to the NAS. Though of course if there is bitrot on the source machine's hard drive, a similar scenario still exists.
Having the CRCs on another device helps detect problems, which is useful even if you can't repair. At least you know what you lost. I use SFV files for that in my media folders - there's an SFV file in every folder, which checksums the media files (but not jpg artwork, info files, etc). I have a separate copy of the SFV files in a different share, and both are backed up.
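For anyone curious, generating and checking those SFV files is simple enough to script (a sketch of how I'd do it in Python; dedicated tools like cksfv do the same job). SFV is just one "filename CRC32" pair per line:

import os, zlib

MEDIA_EXT = {".mkv", ".mp4", ".avi", ".flac", ".mp3"}   # skip artwork/info files

def crc32_of(path):
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(1 << 20):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def write_sfv(folder):
    names = sorted(n for n in os.listdir(folder)
                   if os.path.splitext(n)[1].lower() in MEDIA_EXT)
    with open(os.path.join(folder, "folder.sfv"), "w") as sfv:
        for name in names:
            sfv.write(f"{name} {crc32_of(os.path.join(folder, name)):08X}\n")

def check_sfv(folder):
    with open(os.path.join(folder, "folder.sfv")) as sfv:
        for line in sfv:
            line = line.strip()
            if not line or line.startswith(";"):        # skip SFV comments
                continue
            name, expected = line.rsplit(None, 1)
            actual = f"{crc32_of(os.path.join(folder, name)):08X}"
            print("OK " if actual == expected.upper() else "BAD", name)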
It is a tough problem, and there have to be failure modes I missed. Some might be worse than the ones I listed (like losing the ext superblock). The good news is that the tech is quite reliable, so the failures are ultimately manageable even if there is some pain involved.
BTW, there are some network coding techniques - like RAID on steroids - that apply to disk storage as well. There are some deployments, but I think it is still early days. Basically you can have as many parity blocks as you like (going far beyond RAID-6). There are a lot of potential applications (including dramatically speeding up networks that suffer packet loss). One of them is dispersing your parity blocks over multiple devices (disks or even servers), allowing you more repair options. Pretty cool stuff.
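You can get a taste of the "as many parity blocks as you like" idea in a few lines of Python. This sketch assumes the third-party reedsolo package (pip install reedsolo); its decode return type varies a little between versions:

from reedsolo import RSCodec

rsc = RSCodec(8)                      # append 8 parity bytes to the message
encoded = rsc.encode(b"dispersed across many disks or servers")

damaged = bytearray(encoded)
for pos in (0, 5, 11, 20):            # corrupt four bytes (within what 8 parity bytes can fix)
    damaged[pos] ^= 0xFF

recovered = rsc.decode(bytes(damaged))
# Newer reedsolo versions return a tuple; older ones return the message directly.
print(recovered[0] if isinstance(recovered, tuple) else recovered)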
Here's a fairly recent article: http://www.networkworld.com/article/234 ... peeds.html.
And http://www.codeontechnologies.com/
- InteXX (Luminary)
Thanks for the links, that's some interesting reading. BTW, I only became interested in all of this after spotting that (now-famous) Ars Technica article. When I saw that I started jumping up and down. A few more clicks and I just HAD to have a ReadyNAS :)
StephenB wrote: I use SFV files for that in my media folders - there's an SFV file in every folder
Ack! That sounds like a maintenance nightmare. How do you manage to keep up? Do you have some process that detects a disk write and then recomputes/creates the values?
Thanks,
Jeff Bowman
Fairbanks, Alaska