kevinb_vr
Jul 11, 2017 (Tutor)
RN3312 BTRFS operations are completely hung
We have owned the RN3312 for a bit over 6 months, and all was seemingly fine. However, things went downhill recently and now pretty much the entire BTRFS partition is completely unusable at this poin...
jak0lantash
Jul 11, 2017 (Mentor)
kevinb_vr wrote:
What has happened is a combination of the Bit Rot Protection / COW + Compression + Snapshots being turned on, on a partition used for file backups, and image backups (Veeam) for a single, large, fileserver. BTRFS is NOT production ready for such a setup, I firmly believe this option should be removed from the UI, or a huge warning displayed.
Veeam already provides block-level versioning, so taking snapshots of Veeam backups is counterproductive. Veeam incremental images are compressed, so BTRFS Compression is most likely unnecessary as well. Bit Rot Protection (copy-on-write) leads to fragmentation, so running defrag and balance on a schedule is advisable.
If btrfs-cleaner is currently taking 100% (a load average of 115.21 is huge), I don't think it's a good idea to try to start a BTRFS balance or defrag on top of that. Power-cycling (in normal mode or read-only mode via the Boot Menu) risks BTRFS corruption, but you may have no other choice at some point. What do you see in dmesg?
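A minimal sketch of how the load and the dmesg check could be done from an SSH session, assuming a standard Linux /proc/loadavg and the kernel's usual hung-task wording ("blocked for more than N seconds"). The function names load_averages and hung_tasks are made up for illustration and are not part of ReadyNAS OS.

```shell
#!/bin/sh
# Print the 1-, 5- and 15-minute load averages from a /proc/loadavg-style line
# read on stdin.
load_averages() {
    awk '{ print "1m=" $1, "5m=" $2, "15m=" $3 }'
}

# Keep only kernel hung-task reports from a dmesg-style log read on stdin.
# "|| true" keeps the exit status at 0 when nothing matches.
hung_tasks() {
    grep -E 'blocked for more than [0-9]+ seconds' || true
}
```

Typical use would be `load_averages < /proc/loadavg` and `dmesg | hung_tasks`. Both only read information and do not touch the volume.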
Logged in as root, please run the following command and paste any output here (replace "data" with the name of your volume if different):
btrfs fi us /data
BTRFS has improved hugely over the last few years, and new ReadyNAS firmware versions have brought many "bug" fixes and implementation improvements. I doubt that half of the old threads you've read through still apply. I remember the introduction of BTRFS quotas in 6.4.0 impacting heavily fragmented volumes, with some improvements in 6.4.1. There were also some other firmware versions where the BTRFS balance took longer than expected.
In general, I tend to recommend disabling the BTRFS quotas at the volume level (in the settings of the volume) if you notice slowness in these types of tasks.
After doing some assessment on the metadata allocation, etc., you have a few options, which are mostly available after a power cycle (at the risk of BTRFS corruption). Booting the NAS in Volume Read-Only mode via the Boot Menu could allow you to perform a full backup of the data and update the F/W to 6.7.5. After that, you can try to reboot in normal mode, immediately disable the quotas in the settings of the volume, etc. But let's check dmesg and btrfs fi us /data first.
You can also do paid Support with NETGEAR, with a Per Incident contract or a yearly contract.
BTRFS is mostly misunderstood and misused.
I think that Bit Rot Protection, Compression and Snapshots are disabled by default when creating a new share (at least on recent firmware versions). Before "checking all the boxes and starting everything", one would be expected to do some research to understand what each feature is and what its implications and limitations are. The biggest issue with ReadyNAS and BTRFS is that this type of information is not very easy to access. I do think that the documentation could be much better, with the implications of each feature explained. Users shouldn't be expected to be experts in every software component used on the NAS.
https://kb.netgear.com/26091/Bit-rot-Protection-and-Copy-on-Write-COW-in-depth
I couldn't find a single NETGEAR KB article explaining huge metadata allocation, and you can't see it from the GUI either.
It's also not possible to control the level the balance runs at from the GUI (for example to run a balance on empty metadata chunks).
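For reference, a balance restricted to empty chunks can be requested from the command line with the usage filters of btrfs balance. The sketch below only prints the command by default. The balance_low_usage wrapper and its DRY_RUN switch are hypothetical names for this example, and, as noted earlier, starting a balance while btrfs-cleaner is saturating the system is probably not a good idea.

```shell
#!/bin/sh
# Build a balance command limited to data and metadata chunks whose usage is
# at or below the given percentage (0 = completely empty chunks only).
# DRY_RUN is a hypothetical switch for this sketch: it defaults to 1, which
# only prints the command; set DRY_RUN=0 to actually run it (as root).
balance_low_usage() {
    mountpoint=$1
    percent=${2:-0}
    set -- btrfs balance start "-dusage=$percent" "-musage=$percent" "$mountpoint"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling `balance_low_usage /data` prints the command for empty chunks on the /data volume. A second argument such as `5` would widen the filter to chunks that are at most 5% used.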
kevinb_vr wrote:
We have owned the RN3312 for a bit over 6 months
I struggle to think what would happen if we filled up all 12 slots...
The correct product name of your ReadyNAS is RR3312.
- kevinb_vr, Jul 12, 2017 (Tutor)
Hi, thanks for your reply!
The biggest problem I see is that there is no way to go 'back' once the volume has been created and has gotten into a pretty messed-up state. In the ReadyNAS admin screen, the options to remove compression or bit rot protection don't do anything: the admin web UI just hangs, spinning forever while it waits for something in the background.
I consider myself somewhat experienced with Linux administration (though maybe not so much with btrfs...). This was one of the reasons I liked the ReadyNAS line so much: a nice UI frontend plus solid Linux internals! But I wouldn't expect an average NAS user to be able to do this level of troubleshooting to try and get things going smoothly again. I also think it is not fair to the support team or to the users to require a support contract to fix these sorts of issues... that's why I lean towards some sort of UI or documentation fix to prevent them in the first place...
This share is used for both file-level backups (using BackupAssist, which makes rsnapshot-looking backup directories, and uses NTFS hard links?) and block/incremental backups using Veeam, both backing up a single fileserver. This is why Compression was turned on originally. I admit it would probably have been better to set up a separate share instead of just a separate directory once Veeam needed a backup target to use...
I checked dmesg and it looks pretty clean, except that I guess it might still have been attempting to drop an old snapshot at boot? There are only some hung-task notifications, which seem to have resolved themselves... See the full dmesg log here: https://pastebin.com/9mbkZe67
btrfs-transacti did keep going and going; however, around 8 hours later it looks like it finally finished whatever it was doing, after the load average had gone over 500. But then readynasd got unblocked and is now at 100%, locking up the UI and not allowing web access (sigh...), so I am going to try rebooting and see if I can get a clean startup.
root@archive:/data# btrfs fi us /data
Overall:
    Device size:                  36.36TiB
    Device allocated:             15.24TiB
    Device unallocated:           21.13TiB
    Device missing:                  0.00B
    Used:                         14.52TiB
    Free (estimated):             21.71TiB      (min: 11.15TiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 232.53MiB)

Data,single: Size:14.95TiB, Used:14.37TiB
   /dev/md127     14.95TiB

Metadata,DUP: Size:145.50GiB, Used:80.66GiB
   /dev/md127    291.00GiB

System,DUP: Size:8.00MiB, Used:1.97MiB
   /dev/md127     16.00MiB

Unallocated:
   /dev/md127     21.13TiB
root@archive:/data# btrfs fi df /data
Data, single: total=14.95TiB, used=14.37TiB
System, DUP: total=8.00MiB, used=1.97MiB
Metadata, DUP: total=145.50GiB, used=80.66GiB    ** maybe 80 GB is OK for 14 TB of space used? **
GlobalReserve, single: total=512.00MiB, used=232.53MiB
root@archive:/data# btrfs fi usage /data
Overall:
    Device size:                  36.36TiB
    Device allocated:             15.24TiB
    Device unallocated:           21.13TiB
    Device missing:                  0.00B
    Used:                         14.52TiB
    Free (estimated):             21.71TiB      (min: 11.15TiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 3.00MiB)

Data,single: Size:14.95TiB, Used:14.37TiB
   /dev/md127     14.95TiB

Metadata,DUP: Size:145.50GiB, Used:80.66GiB
   /dev/md127    291.00GiB

System,DUP: Size:8.00MiB, Used:1.97MiB
   /dev/md127     16.00MiB

Unallocated:
   /dev/md127     21.13TiB
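To put the allocation figures above in perspective, the "total" and "used" values that btrfs fi df prints can be turned into a fill percentage per chunk type. This is only a rough sketch: chunk_fill is a made-up helper name, it assumes the "total=..., used=..." line layout shown above, and it says nothing about what ratio is healthy.

```shell
#!/bin/sh
# Read "btrfs fi df"-style lines on stdin and print, for each chunk type,
# how much of the allocated space is actually in use.
chunk_fill() {
    awk '
    # Convert a size such as "145.50GiB" to bytes.
    function to_bytes(s,    n, u) {
        n = s + 0                      # numeric prefix
        u = s; gsub(/[0-9.]/, "", u)   # unit suffix
        if (u == "KiB") return n * 1024
        if (u == "MiB") return n * 1024 ^ 2
        if (u == "GiB") return n * 1024 ^ 3
        if (u == "TiB") return n * 1024 ^ 4
        return n                       # plain bytes ("B")
    }
    /total=.*used=/ {
        split($0, head, ":")           # head[1] is e.g. "Metadata, DUP"
        match($0, /total=[^,]+/); total = substr($0, RSTART + 6, RLENGTH - 6)
        match($0, /used=[^ ,]+/);  used = substr($0, RSTART + 5, RLENGTH - 5)
        t = to_bytes(total)
        if (t > 0)
            printf "%s: %.1f%% of allocated space in use\n", head[1], 100 * to_bytes(used) / t
    }'
}
```

It would be used as `btrfs fi df /data | chunk_fill`. With the numbers posted above, the metadata chunks come out at roughly 55% full (80.66GiB of 145.50GiB) and the data chunks at roughly 96%.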
The other thing is, I don't know exactly what readynasd itself is looking for. Is it OK to disable quotas by just going ahead and running 'btrfs quota disable /data', without messing up the ReadyNAS UI? Or should I try harder to go through the UI?
Is a reboot performed with the command 'reboot' from SSH or over serial? Would you happen to know if there is any output after [ OK ] Reached target Shutdown. ? Has it unmounted all the filesystems yet or should I just let it keep trying? This is via the serial console btw. Not sure if it is a real hang, systemd quirk, or is just busy trying to perform that last 'sync' before unmounting.
Thanks for your help, I really appreciate it!