ChunkySocks
Nov 11, 2019, Guide
Confused about the scrubbing function and bit rot protection feature
From reading ReadyNAS OS 6: An overview of relevant scheduled maintenance tasks available I'm a little confused about the following section: Should I use scrubbing regularly? Scrubbing is closel...
StephenB
Nov 11, 2019, Guru - Experienced User
ChunkySocks wrote:
Should I use scrubbing regularly?
Scrubbing is closely connected to the bitrot protection feature. If bitrot protection is turned off, regular scrubbing is not required.
Is that correct?
Doesn't make sense to me. Surely it should read that if bitrot protection is turned off then regular scrubbing is required OR if bitrot protection is turned on, regular scrubbing is not required?
There's not much said about how the bitrot protection actually works, as it is a proprietary Netgear feature. Based on some hints posted here from time to time, I think it uses a combination of the RAID parity blocks and the BTRFS checksums.
Normally RAID recovery requires prior knowledge of what block needs to be recovered. But if you want to detect blocks that have silently gotten corrupted, you don't have any way to tell which block is bad. However, if you assume the BTRFS checksum is correctly stored (and if the checksum doesn't match what's on the disk), then you can attempt to recover every data block in the RAID stripe using the parity block, and see if one of those recovered blocks yields the correct checksum. Then assume that block suffered the bitrot. Note this possibility is pure speculation on my part - Netgear hasn't described what they actually do. But it is consistent with your link - since the algorithm I outlined depends on correctly written parity blocks.
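To make that guess concrete, here is a minimal Python sketch of the idea. It is not Netgear's implementation, which is unpublished. It assumes single-parity (XOR) RAID, one stored checksum per data block, and uses `zlib.crc32` purely as a stand-in for the BTRFS checksum. The names `xor_blocks` and `find_rotted_block` are made up for illustration.

```python
import zlib
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (single-parity RAID style)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def find_rotted_block(data_blocks, parity, stored_checksums):
    """Try rebuilding each data block from the others plus parity.

    Returns (index, repaired_block) for the block whose rebuilt contents
    differ from what is on disk and match the stored checksum, or None
    if no rebuilt candidate fits. Assumes the stored checksums and the
    parity block are themselves correct.
    """
    for i, on_disk in enumerate(data_blocks):
        others = data_blocks[:i] + data_blocks[i + 1:]
        candidate = xor_blocks(others + [parity])
        if candidate != on_disk and zlib.crc32(candidate) == stored_checksums[i]:
            return i, candidate
    return None


# Hypothetical three-block stripe; block 1 is then silently corrupted.
blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(blocks)
checksums = [zlib.crc32(b) for b in blocks]
damaged = [b"AAAA", b"BxBB", b"CCCC"]
result = find_rotted_block(damaged, parity, checksums)
```

In this toy stripe, `result` identifies block 1 and returns its original contents. The sketch only works if the parity block was written correctly, which is the dependency mentioned above.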
Whether the scrub is "required" or not depends on your goals. It shouldn't be needed if the RAID array is healthy and error free. And as Netgear points out, it does generate a lot of disk I/O. Personally I run it every three months - because I think it also serves as a reasonable disk exerciser/diagnostic which could give me an early warning that a disk might be failing.
Sandshark
Nov 11, 2019, Sensei
Maybe this will help: /Bit-rot-Protection-and-Copy-on-Write-COW-in-depth
Netgear has intimately linked bit rot protection and CoW. You cannot have CoW without bit rot protection. So if bit rot protection (and, consequently, CoW) is disabled, you have no checksums a BTRFS scrub can use to detect and correct errors. So, scrubs are basically useless.
- ChunkySocks, Nov 12, 2019, Guide
Thanks for your replies.
If scrubs are useless then why does it allow you to start one without a warning if you don't have bit rot protection enabled and why did the one I ran a few days ago take 15 hours to complete? What was it doing in all that time?
I will double check my shares, perhaps bit rot / CoW is enabled on one or more of them though it's not something I recall ever enabling unless it is done by default.
- StephenB, Nov 12, 2019, Guru - Experienced User
ChunkySocks wrote:
If scrubs are useless ...
They aren't useless, they just aren't "required". Said another way, Netgear is saying that you really should run them if you have the bitrot protection feature enabled. If you don't, then it's up to you.
There are two components: a RAID scrub and a BTRFS scrub.
The RAID scrub reads all the data and parity blocks in the volume. If a bad sector is uncovered, it will be re-written (and the sector will be spared by the drive). And of course that can detect a failing disk more quickly than just waiting around for the user to access the files with the bad sectors. In my own case, a lot of the files I have on the NAS aren't read very often.
If a parity block doesn't match the data, but there is no read error, then there is a mismatch. Normal RAID can't repair that because it has no way of figuring out which block is wrong. Generally it just rewrites the parity block. However, that ought to trigger Netgear's bitrot protection feature instead.
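A small Python sketch of that mismatch case, assuming single-parity (XOR) RAID. The function names are hypothetical and this is not the actual mdadm or ReadyNAS code; it only shows why a plain RAID scrub can detect a mismatch but not locate the bad block.

```python
from functools import reduce
from operator import xor


def parity_of(blocks):
    """Compute the XOR parity of equal-length data blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def raid_scrub_stripe(data_blocks, parity):
    """Check one stripe the way a plain RAID scrub would.

    Returns ("ok", parity) when parity matches the data, otherwise
    ("mismatch", recomputed_parity). The recomputed parity is what a
    plain scrub would write back, since it cannot tell which block
    is the wrong one.
    """
    expected = parity_of(data_blocks)
    if expected == parity:
        return "ok", parity
    return "mismatch", expected


# Hypothetical stripe: parity was written for the good data,
# then one data block changed silently (no read error).
good = [b"AAAA", b"BBBB", b"CCCC"]
parity = parity_of(good)
damaged = [b"AAAA", b"BxBB", b"CCCC"]
status, new_parity = raid_scrub_stripe(damaged, parity)
```

Here `status` is `"mismatch"`, and writing `new_parity` back would make the stripe self-consistent again while leaving the damaged data block in place. That is the gap a checksum-aware repair would have to fill.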
The BTRFS scrub reads all the data and metadata in the volume, and verifies the BTRFS checksums. Based on posts here, in some cases it does release some free space - though I don't know how that happens. But certainly it can find files that were corrupted somehow. And bad sectors uncovered by this scrub would also be rewritten and repaired from the RAID parity.
I'm not personally convinced that bitrot (a silent change to on-disk data that wasn't re-written) happens often enough for home NAS users to worry about it. But sectors do go bad sometimes (normally resulting in read errors). And unclean shutdowns or even bugs can cause cached writes to be lost. The BTRFS+RAID scrub can uncover bad sectors, and confirm volume integrity.
If bitrot protection is enabled, then the scrub might be able to repair some damage when writes are lost or when actual bitrot occurs. But even if it can't, it is useful to know that there are issues with the volume integrity. If you replace a disk when you have latent (not yet found) bad sectors on a different disk, then the resync will fail - and you will lose all your data.
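The resync risk can be illustrated with a toy Python model, assuming single-parity (XOR) RAID where each stripe has one block per disk and an unreadable sector is represented as `None`. `rebuild_member` is a hypothetical name, and a real resync works on the whole array rather than one stripe.

```python
from functools import reduce
from operator import xor


def rebuild_member(stripe, missing):
    """Rebuild the block at index `missing` from the surviving members.

    `stripe` holds one block per disk (data and parity alike).
    Raises ValueError if any surviving block is unreadable, because
    single parity can only cover one missing block per stripe.
    """
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    if any(blk is None for blk in survivors):
        raise ValueError("unreadable sector on a surviving disk")
    return bytes(reduce(xor, column) for column in zip(*survivors))


# Hypothetical stripe: two data blocks plus their XOR parity.
data = [b"AAAA", b"BBBB"]
parity = bytes(a ^ b for a, b in zip(*data))

# Disk 0 replaced, all other sectors readable: the rebuild succeeds.
rebuilt = rebuild_member([None, data[1], parity], missing=0)

# Disk 0 replaced while disk 1 has a latent bad sector: no way to rebuild.
try:
    rebuild_member([None, None, parity], missing=0)
    rebuild_failed = False
except ValueError:
    rebuild_failed = True
```

A scrub run before the disk swap would have hit that bad sector while the array still had full redundancy to repair it.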
So in my opinion there is value in checking the RAID parity and BTRFS checksums systematically every now and then. As I said earlier, I run them every three months on my own systems.
- Sandshark, Nov 13, 2019, Sensei
I'm a bit confused by all of it, and it would be nice for Netgear to explain things better.
According to the BTRFS documentation, the nodatacow and nodatasum options are actually file attributes, though you usually set them to be the default for a volume or subvolume. And one implies/requires the other. Netgear documentation also says they are linked.
But, then, in the Volume settings, there is a checkbox for "checksum". How that is linked to COW and bitrot protection is not explained. And nothing in the BTRFS documentation that I can find explains such a thing.
I do know that when I triggered a scrub on a non-COW share (of which I now have none), it completed virtually instantaneously, which led me to believe it did nothing of significant value.