Matthias1111
Jan 10, 2021 - Aspirant
Volume dead or inactive after balancing - works with readonly
Hello,
my ReadyNAS 104 reports that the volume is inactive or dead after I performed "balancing disks". The data volume is accessible in read-only mode (boot menu), but I am really concerned about the root cause, since all disks (3 HDDs as RAID 5) are reported healthy.
Is there anybody who can take a look at the logs (which I would send via PM)? I would really appreciate that.
Thanks in advance
Matthias
Thanks for the logs, Matthias1111
The NAS experienced several out-of-memory conditions, which likely caused the crash in the end.
It also seems to be induced by quota calculation. Example below; this happened over and over, by the way.
Jan 10 00:12:18 NAS kernel: Hardware name: Marvell Armada 370/XP (Device Tree)
Jan 10 00:12:18 NAS kernel: [<c0015270>] (unwind_backtrace) from [<c001173c>] (show_stack+0x10/0x18)
Jan 10 00:12:18 NAS kernel: [<c001173c>] (show_stack) from [<c03849d0>] (dump_stack+0x78/0x9c)
Jan 10 00:12:18 NAS kernel: [<c03849d0>] (dump_stack) from [<c00d5e20>] (dump_header+0x4c/0x1b4)
Jan 10 00:12:18 NAS kernel: [<c00d5e20>] (dump_header) from [<c00a09a0>] (oom_kill_process+0xd0/0x45c)
Jan 10 00:12:18 NAS kernel: [<c00a09a0>] (oom_kill_process) from [<c00a10b0>] (out_of_memory+0x310/0x374)
Jan 10 00:12:18 NAS kernel: [<c00a10b0>] (out_of_memory) from [<c00a49d4>] (__alloc_pages_nodemask+0x6e0/0x7dc)
Jan 10 00:12:18 NAS kernel: [<c00a49d4>] (__alloc_pages_nodemask) from [<c00cb4c0>] (__read_swap_cache_async+0x70/0x1a0)
Jan 10 00:12:18 NAS kernel: [<c00cb4c0>] (__read_swap_cache_async) from [<c00cb600>] (read_swap_cache_async+0x10/0x34)
Jan 10 00:12:18 NAS kernel: [<c00cb600>] (read_swap_cache_async) from [<c00cb788>] (swapin_readahead+0x164/0x17c)
Jan 10 00:12:18 NAS kernel: [<c00cb788>] (swapin_readahead) from [<c00bd4fc>] (handle_mm_fault+0x83c/0xc04)
Jan 10 00:12:18 NAS kernel: [<c00bd4fc>] (handle_mm_fault) from [<c0017cb8>] (do_page_fault+0x134/0x2b0)
Jan 10 00:12:18 NAS kernel: [<c0017cb8>] (do_page_fault) from [<c00092b0>] (do_DataAbort+0x34/0xb8)
Jan 10 00:12:18 NAS kernel: [<c00092b0>] (do_DataAbort) from [<c00123fc>] (__dabt_usr+0x3c/0x40)
Jan 10 00:12:18 NAS kernel: Out of memory: Kill process 1113 (mount) score 1 or sacrifice child
Jan 10 00:12:18 NAS kernel: Killed process 1113 (mount) total-vm:5400kB, anon-rss:0kB, file-rss:1764kB
Jan 10 00:12:18 NAS kernel: mount: page allocation failure: order:0, mode:0x2600040
Jan 10 00:12:18 NAS kernel: CPU: 0 PID: 1113 Comm: mount Tainted: P O 4.4.190.armada.1 #1
Jan 10 00:12:18 NAS kernel: Hardware name: Marvell Armada 370/XP (Device Tree)
Jan 10 00:12:18 NAS kernel: [<c0015270>] (unwind_backtrace) from [<c001173c>] (show_stack+0x10/0x18)
Jan 10 00:12:18 NAS kernel: [<c001173c>] (show_stack) from [<c03849d0>] (dump_stack+0x78/0x9c)
Jan 10 00:12:18 NAS kernel: [<c03849d0>] (dump_stack) from [<c00a2570>] (warn_alloc_failed+0xec/0x118)
Jan 10 00:12:18 NAS kernel: [<c00a2570>] (warn_alloc_failed) from [<c00a4a44>] (__alloc_pages_nodemask+0x750/0x7dc)
Jan 10 00:12:18 NAS kernel: [<c00a4a44>] (__alloc_pages_nodemask) from [<c00d0d58>] (allocate_slab+0x88/0x280)
Jan 10 00:12:18 NAS kernel: [<c00d0d58>] (allocate_slab) from [<c00d253c>] (___slab_alloc.constprop.13+0x250/0x35c)
Jan 10 00:12:18 NAS kernel: [<c00d253c>] (___slab_alloc.constprop.13) from [<c00d2828>] (kmem_cache_alloc+0xac/0x168)
Jan 10 00:12:18 NAS kernel: [<c00d2828>] (kmem_cache_alloc) from [<c0306114>] (ulist_alloc+0x1c/0x54)
Jan 10 00:12:18 NAS kernel: [<c0306114>] (ulist_alloc) from [<c03040d0>] (resolve_indirect_refs+0x1c/0x6d4)
Jan 10 00:12:18 NAS kernel: [<c03040d0>] (resolve_indirect_refs) from [<c0304b54>] (find_parent_nodes+0x3cc/0x6b0)
Jan 10 00:12:18 NAS kernel: [<c0304b54>] (find_parent_nodes) from [<c0304eb8>] (btrfs_find_all_roots_safe+0x80/0xfc)
Jan 10 00:12:18 NAS kernel: [<c0304eb8>] (btrfs_find_all_roots_safe) from [<c0304f7c>] (btrfs_find_all_roots+0x48/0x6c)
Jan 10 00:12:18 NAS kernel: [<c0304f7c>] (btrfs_find_all_roots) from [<c03089ec>] (btrfs_qgroup_prepare_account_extents+0x58/0xa0)
Jan 10 00:12:18 NAS kernel: [<c03089ec>] (btrfs_qgroup_prepare_account_extents) from [<c029b714>] (btrfs_commit_transaction+0x49c/0x9b4)
Jan 10 00:12:18 NAS kernel: [<c029b714>] (btrfs_commit_transaction) from [<c0284ac4>] (btrfs_drop_snapshot+0x420/0x6bc)
Jan 10 00:12:18 NAS kernel: [<c0284ac4>] (btrfs_drop_snapshot) from [<c02f6338>] (merge_reloc_roots+0x120/0x220)
Jan 10 00:12:18 NAS kernel: [<c02f6338>] (merge_reloc_roots) from [<c02f7138>] (btrfs_recover_relocation+0x2c8/0x370)
Jan 10 00:12:18 NAS kernel: [<c02f7138>] (btrfs_recover_relocation) from [<c0298f00>] (open_ctree+0x1df0/0x2168)
Jan 10 00:12:18 NAS kernel: [<c0298f00>] (open_ctree) from [<c026f578>] (btrfs_mount+0x458/0x690)
Jan 10 00:12:18 NAS kernel: [<c026f578>] (btrfs_mount) from [<c00dbdc0>] (mount_fs+0x6c/0x14c)
Jan 10 00:12:18 NAS kernel: [<c00dbdc0>] (mount_fs) from [<c00f4490>] (vfs_kern_mount+0x4c/0xf0)
Jan 10 00:12:18 NAS kernel: [<c00f4490>] (vfs_kern_mount) from [<c026ea68>] (mount_subvol+0xf4/0x7ac)
Jan 10 00:12:18 NAS kernel: [<c026ea68>] (mount_subvol) from [<c026f2f4>] (btrfs_mount+0x1d4/0x690)
Jan 10 00:12:18 NAS kernel: [<c026f2f4>] (btrfs_mount) from [<c00dbdc0>] (mount_fs+0x6c/0x14c)
Jan 10 00:12:18 NAS kernel: [<c00dbdc0>] (mount_fs) from [<c00f4490>] (vfs_kern_mount+0x4c/0xf0)
Jan 10 00:12:18 NAS kernel: [<c00f4490>] (vfs_kern_mount) from [<c00f71a4>] (do_mount+0xa30/0xb60)
Jan 10 00:12:18 NAS kernel: [<c00f71a4>] (do_mount) from [<c00f74fc>] (SyS_mount+0x70/0xa0)
Jan 10 00:12:18 NAS kernel: [<c00f74fc>] (SyS_mount) from [<c000ec40>] (ret_fast_syscall+0x0/0x40)
It is a well-established fact that quotas carry a lot of calculation overhead when deleting snapshots. The RN104 has only 512 MB of RAM, so it is already resource-starved, and deleting many snapshots in a row (or a few big ones) can tip the unit over the edge. You can run into a race condition where the filesystem needs to commit btrfs transactions but the quota module has hogged all the resources, and then you end up in limbo. The OS re-install disabled quotas (which I didn't know it would), and that likely allowed the filesystem to actually finish the clean-up. It explains why the OS re-install "fixed" it... it didn't really fix anything; it just disabled quotas, and that left room for the other filesystem transactions to complete.
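If you want to check for the same pattern on another unit, the OOM-killer lines are easy to spot with standard Linux tooling. This is just a sketch; the log file name is an assumption (in the downloaded ReadyNAS log zip the same entries show up in kernel.log):
# OOM events still in the kernel ring buffer (over SSH)
dmesg | grep -i -E "out of memory|oom_kill_process"
# Or search the kernel.log from the extracted log zip
grep -i -E "out of memory|oom_kill_process|page allocation failure" kernel.log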
In general I would advise keeping a lower number of snapshots on these units. Disabling quotas before you delete snapshots, or when running things like a balance, will help keep the unit afloat; just re-enable quotas afterwards. A minimal sketch of that sequence is below.
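The sketch assumes SSH access is enabled and that the data volume is the btrfs filesystem mounted at /data (the mount point is an assumption; check with mount first). On ReadyNAS OS the same quota toggle is also exposed per volume in the admin UI, so you can do this from the web interface instead.
# Confirm where the btrfs data volume is mounted (assumed /data below)
mount -t btrfs
# Turn quotas off before heavy snapshot clean-up or a balance
btrfs quota disable /data
# ...delete snapshots / run the balance from the admin UI...
# Re-enable quotas afterwards and let btrfs rebuild the accounting
btrfs quota enable /data
btrfs quota rescan -w /data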
12 Replies
Replies have been turned off for this discussion
- StephenB - Guru - Experienced User
If the volume is read-only, then you should immediately make sure you have an up-to-date backup. The data is definitely at risk.
You can send a download link to the logs via PM (private message) to one of the mods ( JohnCM_S or Marc_V ). Others here might also offer to take a look. You might start by looking in system.log and kernel.log for btrfs and disk I/O errors.
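If you download and extract the log zip yourself first, a quick way to scan those two files is something like the following (plain grep; the exact error strings vary, so treat the patterns as a starting point rather than a definitive filter):
# btrfs complaints in both logs
grep -i "btrfs" system.log kernel.log | grep -i -E "error|corrupt|read.only"
# disk / ATA I/O errors in the kernel log
grep -i -E "i/o error|ata[0-9].*(error|exception|failed)" kernel.log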
- Matthias1111 - Aspirant
Hello Stephen,
thanks for the fast reply. I asked Marc for help.
Best Matthias
- Matthias1111 - Aspirant
One additional remark: since I have backups of my NAS data, I tried the boot menu option "reinstall OS". This worked fine and the volume is back online now. The only thing is that in the capacity bar the yellow part for snapshot consumption is missing (see screenshot). The snapshots are still there, accessible, and new ones are taken automatically as configured. What do you think? Is my NAS healthy, or should I do a factory reset? (I would prefer to avoid that because it takes a lot of time to reconfigure the NAS and transfer all the data back.)
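Following the explanation in the accepted answer above: the OS re-install disabled quotas, and the snapshot portion of the capacity bar likely depends on btrfs quota accounting, so the missing yellow segment is expected until quotas are turned back on. A hedged sketch over SSH, assuming the volume is mounted at /data (the admin UI may expose the same setting under the volume options):
# Check whether quotas are currently enabled (this errors out if they are not)
btrfs qgroup show /data
# Re-enable quotas and rebuild the accounting that feeds the capacity bar
btrfs quota enable /data
btrfs quota rescan -w /data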