I understand the concept, but am curious how it is actually implemented on 6.2.0

How does bitrot protection actually work?

Nhellie
Virtuoso
Dec 10, 2014
As per release notes, "Support bitrot data protection. Automatically detect and correct corruption due to media degradation.", for me this means that as soon as you enable BitRot protection, the NAS will scan-detect-fix corruptions on the data.
ScottChapman
Apprentice
Dec 10, 2014
Yea, I guess I was more curious what the mechanism is; is it a BTRFS feature? Something else?

StephenB

Guru - Experienced User

Dec 10, 2014

Nhellie wrote:
As per release notes, "Support bitrot data protection. Automatically detect and correct corruption due to media degradation.", for me this means that as soon as you enable BitRot protection, the NAS will scan-detect-fix corruptions on the data.

Yes. But the "how" hasn't been disclosed, and that is what Scott is asking. We know it's not using the btrfs experimental modes, so it appears to be a proprietary technique Netgear implemented that does something similar.

It would be useful to have more information, so people will have a better idea of what it can and can't do. It could be quite useful in some circumstances (for instance reducing data loss when disk cloning is needed). But it's hard to know without some explanation.

ScottChapman
Apprentice
Dec 10, 2014
Yea, thanks. exactly what I was getting at...
snakyjake
Tutor
Dec 21, 2014
I'd like to know too. I currently run checksums on all my files and store the results. Periodically I recalculate the checksums and compare them against the stored values. This at least tells me when there's been bit corruption. The next trick is to restore a good copy. You know you have a good copy when it matches the original checksum. What you don't want is to have backed up the rotted file, so that the rotted copy is all you have left. ReadyNAS has a versioning backup feature. So as time goes on, and bits are changed, ReadyNAS probably has multiple backups of both the good and bad file. So I'm guessing ReadyNAS either prevents that from happening in the first place, or is able to locate the good file (by comparing checksums).
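
For anyone who wants to try the same store-and-recheck routine, here is a minimal sketch in Python. It is only an illustration of the manual approach described above, not anything ReadyNAS does; the function names and the JSON manifest format are my own assumptions.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(manifest, manifest_path):
    """Store the digests; keep this file outside `root`."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))


def verify(root, manifest_path):
    """Return relative paths whose current digest differs from the stored one."""
    stored = json.loads(Path(manifest_path).read_text())
    current = build_manifest(root)
    return sorted(
        name for name, digest in stored.items() if current.get(name) != digest
    )
```

Run `save_manifest(build_manifest(share), manifest)` once, then call `verify(share, manifest)` on a schedule. Note that this cannot tell rot from a deliberate edit, so it only makes sense for files that are not supposed to change.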

This is a pretty important topic for me. I'm concerned a lot about silent errors.

It would be great if Netgear would do a good write-up demonstrating some scenarios. I don't need to understand how, but I need to see the scenario proven. For example, use a hex editor and change the file (but keep the file size/date the same). ReadyNAS should detect the change, and should restore the file. The other scenario is the reverse: what happens when the stored checksum is corrupted?
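
The hex-editor scenario can be simulated with a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, and the file must be non-empty with `offset` inside it. It flips one bit, restores the timestamps, and shows that a size/date check sees nothing while a content hash does.

```python
import hashlib
import os


def sha256_of(path):
    """Return the SHA-256 hex digest of a (small) file."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def flip_bit_in_place(path, offset=0, bit=0):
    """Flip one bit at `offset`, then restore the original timestamps."""
    before = os.stat(path)
    with open(path, "r+b") as fh:
        fh.seek(offset)
        original = fh.read(1)
        fh.seek(offset)
        fh.write(bytes([original[0] ^ (1 << bit)]))
    # Put atime/mtime back so the file looks untouched to size/date checks.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


def looks_unchanged(path, size, mtime_ns):
    """The check a size/date-based tool would make."""
    st = os.stat(path)
    return st.st_size == size and st.st_mtime_ns == mtime_ns
```

One caveat: an edit made through the filesystem is an ordinary write, so a checksumming filesystem such as btrfs would simply record a new checksum for the changed data. This kind of test exercises a file-level checker like the one above, and probably not whatever the NAS does underneath; real media degradation happens below the filesystem.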

Jake
snakyjake
Tutor
Dec 21, 2014
After looking at the different models, I don't have a lot of confidence in the cheaper non-ECC models. For a prosumer, it gets quite expensive.
StephenB
Guru - Experienced User
Dec 21, 2014
snakyjake wrote:
After looking at the different models, I don't have a lot of confidence in the cheaper non-ECC models. For a prosumer, it gets quite expensive.
Well, of course the ECC RAM costs more for Netgear to buy, and on top of that you need other system components that support the RAM. The lower end of their market is price sensitive (it always is), so I think from a business point of view it likely isn't viable for them to include ECC there.

For better or worse, most consumers simply aren't as concerned about silent bit rot as you are, so ECC is not a must-have feature for them. If enterprises started demanding it in all devices, that would likely change - but I haven't seen much that suggests that will happen anytime soon.

That said, I think Netgear should describe what they are doing in more detail.
nsne
Virtuoso
Dec 29, 2014
I'm still a bit puzzled about the basics of CoW. I understand it has the potential to be A Good Thing®, but I'm not sure what the optimal operating conditions would be.

For example, is CoW good — or even necessary — for media shares with lots of large (ca. 2GB) video files that don't get modified very often? Is it for documents? How about an iTunes share that contains a little bit of everything and is accessed and updated (with apps, podcasts) daily? How much of a performance hit will it bring to a 314?

I disabled bit-rot protection across the board long ago when I was trying to eke some decent performance out of the 314, but I'd also like to ensure data integrity if the feature set allows it.
mdgm-ntgr
NETGEAR Employee Retired
Dec 29, 2014
Features such as CoW and bitrot protection can be very useful for files that are not modified often. Bitrot protection provides good protection against media degradation.

I guess through a bit of trial and error you may find what works for you. Obviously there is a performance hit. So whether you use them does depend on how much you value extra protection for your data compared with performance.

With settings at a per share level you can choose which shares to use these features with.

It would be advisable to make use of scheduled volume maintenance.
StephenB
Guru - Experienced User
Dec 29, 2014
http://en.wikipedia.org/wiki/Copy-on-write wrote:
Copy-on-write (sometimes referred to as "COW") is an optimization strategy used in computer programming. Copy-on-write stems from the understanding that when multiple separate tasks use initially identical copies of some information (i.e., data stored in computer memory or disk storage), treating it as local data that they may occasionally need to modify, then it is not necessary to immediately create separate copies of that information for each task. Instead they can all be given pointers to the same resource, with the provision that on the first occasion where they need to modify the data, they must first create a local copy on which to perform the modification (the original resource remains unchanged).
This is the core idea. When snapshots are taken, the snapshot folder and the main folder both initially have pointers to the same data on the disk. The btrfs wiki calls this "cloning" to distinguish it from Linux hard links.

When the file is modified, it becomes fragmented: the unchanged blocks remain referenced by both folders. For the blocks that have been changed, the original block ends up referenced only by the snapshot, and the changed block is referenced by the main folder.
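
A toy model may make the pointer bookkeeping clearer. This is only a sketch of the idea in Python, not how btrfs lays anything out on disk; the `BlockStore` and `Folder` classes are invented for illustration.

```python
class BlockStore:
    """Toy block store: block id -> data."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def allocate(self, data):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id


class Folder:
    """A 'folder' here is just file name -> list of block ids."""

    def __init__(self, store, files=None):
        self.store = store
        self.files = files if files is not None else {}

    def write_file(self, name, chunks):
        self.files[name] = [self.store.allocate(c) for c in chunks]

    def snapshot(self):
        # Copy only the pointer lists; no data blocks are duplicated.
        return Folder(self.store, {n: list(ids) for n, ids in self.files.items()})

    def modify_block(self, name, index, data):
        # Copy-on-write: the new data goes to a fresh block and only this
        # folder's pointer moves. Other folders keep the original block.
        self.files[name][index] = self.store.allocate(data)

    def read_file(self, name):
        return [self.store.blocks[i] for i in self.files[name]]
```

After `snap = main.snapshot()` followed by `main.modify_block("f", 1, b"X")`, the snapshot still reads the old block 1, the main folder reads the new one, and every other block is stored once and pointed at by both.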

When you have multiple snapshots, the idea is simply extended to cover them all.

CoW is not limited to snapshots; the cp command has a --reflink option with the same properties. Initially the two copies share the same data blocks, but as the files are modified only the unmodified blocks remain shared. The result is fragmentation, but efficient use of disk space.
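
For scripting, the same kind of clone can be requested from Python on Linux through the `FICLONE` ioctl. This is a hedged sketch: the helper name `reflink_or_copy` is hypothetical, it assumes a Linux system, and on a filesystem without reflink support it falls back to an ordinary copy.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int)
FICLONE = 0x40049409


def reflink_or_copy(src, dst):
    """Try to clone src to dst sharing data blocks; fall back to a full copy.

    Returns True if the clone ioctl succeeded, False if a normal copy was made.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # The filesystem does not support reflinks (or src and dst are on
            # different filesystems): copy the bytes the ordinary way.
            shutil.copyfileobj(fsrc, fdst)
            return False
```

On a btrfs volume the call should return True almost instantly regardless of file size, since only metadata is written; elsewhere it returns False after a normal copy.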

I'm not sure why Netgear linked bit-rot protection to CoW; it is an odd admixture. From what little has been posted here, bit-rot protection depends on btrfs file checksums and RAID, not CoW.
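
To show how checksums and redundancy could combine in principle, here is a toy two-way mirror in Python. It is a guess at the general shape, not Netgear's undisclosed method: the `MirroredVolume` class is invented, and `zlib.crc32` is a stand-in (btrfs defaults to crc32c).

```python
import zlib


class MirroredVolume:
    """Toy two-way mirror that stores a CRC32 alongside each block."""

    def __init__(self):
        self.mirrors = [{}, {}]  # two copies: block number -> bytearray
        self.checksums = {}      # block number -> CRC32 of the data as written

    def write(self, block_no, data):
        for mirror in self.mirrors:
            mirror[block_no] = bytearray(data)
        self.checksums[block_no] = zlib.crc32(data)

    def read(self, block_no):
        """Return verified data, rewriting any copy that differs from the good one."""
        expected = self.checksums[block_no]
        good = None
        for mirror in self.mirrors:
            if zlib.crc32(bytes(mirror[block_no])) == expected:
                good = bytes(mirror[block_no])
                break
        if good is None:
            raise IOError(f"block {block_no}: no copy matches its checksum")
        for mirror in self.mirrors:
            if bytes(mirror[block_no]) != good:
                mirror[block_no] = bytearray(good)  # self-heal from the good copy
        return good
```

The checksum is what says which copy is wrong; the redundancy is what supplies a good copy to repair from. With only one copy, corruption can be detected but not fixed.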

The obvious use of CoW is to create snapshots, which is a space-efficient mechanism that allows you to roll back to previous versions of the files. If you have large files with a few differences between them, then CoW could be used (e.g. cp --reflink) to reduce disk space. If you have a folder structure that contains source code, CoW is one way to create a development branch - again one that is space efficient. It isn't well suited to files that are being continuously updated (for instance torrent files being downloaded, or databases that are always changing). Snapshots and cp --reflink are very fast operations; the performance hit happens later on when the files are modified.