btrfs
6 Topics
ReadyNAS RN214 corruption on BTRFS
4x4TB in X-RAID, holding about 4TB of daily data and 1TB of daily snapshots. It used to work fine, then last Saturday it suddenly crashed and the shares disappeared. To look for a solution (I'm very familiar with Linux), I enabled SSH access and logged in. dmesg was reporting corruption on the BTRFS roots on md127. It did not report any hardware problem with the hard disks, though, so it looked like a software issue (corruption) in btrfs. I googled it a lot and tried whatever I could find to solve it. After hours of working with btrfsck, I managed to "see" the filesystem on the device again, but now, after a reboot, it reports: "BTRFS error (device md127): qgroup generation mismatch, marked as inconsistent" and mounts md127 (with the snapshots) read-only. I tried to fix it with a couple of btrfsck commands, without success.
The data is there and I can read it. I can back it up (I already have the data on a USB disk). The best solution would be a way to fix the corruption on md127 as it is now. Alternatively, I was thinking of doing a factory reset and then restoring the configuration and data. But in that case I would also need a way to back up (and then restore) the snapshots as snapshots, not as data copies. Maybe something like "format a big enough external USB disk as btrfs, do this and that to take a full backup, with snapshots, onto it, and after the factory reset restore from it". But I'm not sure how to do this. Any ideas?
6.7K Views 0 likes 15 Comments
6.8.0 Kernel warning
Hello all,
There is something wrong in the way 6.8.0 has been compiled. 6.7.5 has no such issues. Since this is a problem in the BTRFS module, I believe it should be fixed ASAP to give us peace of mind that our data remains secure.
Linux netgear-nas 4.4.79.alpine.1 #1 SMP Mon Jul 31 15:19:36 PDT 2017 armv7l GNU/Linux
[20904.832521] ------------[ cut here ]------------
[20904.832536] WARNING: CPU: 0 PID: 4707 at fs/btrfs/qgroup.c:2480 btrfs_qgroup_free_refroot+0x180/0x1a0()
[20904.832542] Modules linked in: vpd(PO)
[20904.832555] CPU: 0 PID: 4707 Comm: kworker/u2:8 Tainted: P W O 4.4.79.armada.1 #1
[20904.832562] Hardware name: Marvell Armada 370/XP (Device Tree)
[20904.832572] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
[20904.832589] [<c0015f24>] (unwind_backtrace) from [<c00120dc>] (show_stack+0x10/0x18)
[20904.832603] [<c00120dc>] (show_stack) from [<c0392720>] (dump_stack+0x78/0x9c)
[20904.832618] [<c0392720>] (dump_stack) from [<c0024928>] (warn_slowpath_common+0x74/0xac)
[20904.832631] [<c0024928>] (warn_slowpath_common) from [<c002497c>] (warn_slowpath_null+0x1c/0x28)
[20904.832644] [<c002497c>] (warn_slowpath_null) from [<c0318ae0>] (btrfs_qgroup_free_refroot+0x180/0x1a0)
[20904.832658] [<c0318ae0>] (btrfs_qgroup_free_refroot) from [<c028a924>] (__btrfs_run_delayed_refs+0x328/0x113c)
[20904.832672] [<c028a924>] (__btrfs_run_delayed_refs) from [<c02915c8>] (btrfs_run_delayed_refs+0x60/0x2b0)
[20904.832685] [<c02915c8>] (btrfs_run_delayed_refs) from [<c0291c40>] (delayed_ref_async_start+0xa4/0xb0)
[20904.832698] [<c0291c40>] (delayed_ref_async_start) from [<c02dbc28>] (normal_work_helper+0x84/0x1a4)
[20904.832712] [<c02dbc28>] (normal_work_helper) from [<c0037f14>] (process_one_work+0x11c/0x334)
[20904.832724] [<c0037f14>] (process_one_work) from [<c0038198>] (worker_thread+0x34/0x474)
[20904.832736] [<c0038198>] (worker_thread) from [<c003d20c>] (kthread+0x104/0x124)
[20904.832747] [<c003d20c>] (kthread) from [<c000f568>] (ret_from_fork+0x14/0x2c)
[20904.832754] ---[ end trace b127ec432b47814a ]---
And these messages are being repeated in their thousands.
5.9K Views 0 likes 6 Comments
ReadyNAS 2120 crash every little
Hello,
We have about a dozen ReadyNAS units; half are models 2120 and 2120v2. These models give us problems that make them unusable every so often: jobs such as balance and defrag last up to 22 days with a system load above 10, so the NAS units are not operational. We need to know what configuration (job scheduling, firmware version...) is needed to avoid these problems. These NAS units worked correctly until they reached a particular firmware version: why are they no longer functional?
Greetings.
5.4K Views 0 likes 16 Comments
RN3312 BTRFS operations are completely hung
We have owned the RN3312 for a bit over 6 months, and all was seemingly fine. However, things went downhill recently and now pretty much the entire BTRFS partition is completely unusable. Even leaving the NAS offline to do whatever internal metadata cleanup it needs has not been enough for it to recover in a reasonable time.
What has happened is a combination of Bit Rot Protection / COW + Compression + Snapshots being turned on, on a partition used for file backups and image backups (Veeam) for a single, large fileserver. BTRFS is NOT production ready for such a setup. I firmly believe this option should be removed from the UI, or a huge warning displayed.
Everything was going great until the first snapshots needed to be deleted, where I ran into the problem of btrfs-cleaner taking up 100% CPU. Symptoms: the admin UI would lock up on any file operation in certain directories. Directory accesses would hang forever, even over SMB. Of course all the backups to the NAS were timing out. I was eventually able to delete the snapshots by hard rebooting the system and removing them before btrfs-cleaner got too bad.
But now I have the problem where btrfs-transacti is taking up 100% CPU. I have left the system sitting offline for a week just spinning at 100% CPU (!), and there is no visible improvement - EVERY BTRFS operation still hangs, no matter what I try. There is little disk activity, it is not thrashing - which makes me think there is something wrong in the internals of BTRFS, or that the CPU is too underpowered to handle the amount of storage metadata operations.
top - 12:30:29 up 1:39, 2 users, load average: 115.21, 112.48, 99.52
Tasks: 334 total, 2 running, 332 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.6 us, 23.0 sy, 0.0 ni, 72.1 id, 1.6 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem: 8113792 total, 2673896 used, 5439896 free, 4404 buffers
KiB Swap: 2093052 total, 0 used, 2093052 free. 1980036 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3740 root 20 0 0 0 0 R 100.0 0.0 93:52.62 btrfs-tran+
1 root 20 0 136632 6868 5144 S 0.0 0.1 0:02.45 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
admin@archive:/data$ iostat
Linux 4.4.68.x86_64.1 (archive) 07/11/2017 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.55 0.00 25.73 1.56 0.00 72.15
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 3.26 55.82 57.19 337578 345828
sdb 3.25 55.07 57.08 333044 345196
sdc 3.45 74.32 57.02 449448 344808
sdd 3.28 53.95 57.21 326224 345936
sde 3.26 57.73 57.09 349104 345252
sdf 3.28 54.89 56.87 331951 343908
md0 1.68 27.57 39.28 166740 237520
md1 0.02 0.19 0.00 1172 0
md127 9.38 243.50 69.42 1472516 419788 < not a lot of activity...
I have tried starting a balance to fix fragmentation. I believe there are operations blocking it inside the kernel, but even at -dusage=0 I gave up after giving it the weekend to do its thing. Trying to look for evidence of fragmented files is horrendously slow. But it is very bad now:
admin@archive:/data$ ls
*** hangs forever ***
My hope at this point is to mount the system read-only and recover the data onto a USB drive. The share with data is around 8 TB, which might just fit after a couple of days/weeks(?) of copying... Then figure out some way to drop the share(?) and rebuild it without selecting the 'Bit Rot Protection' or 'Compression' options. Hopefully I don't have to resort to copying the NAS to something else and wiping it - there is about 14 TB of data on it currently, and I don't have that much capacity available anywhere else...
After going through this and after lots of research, I see lots of horror stories showing that BTRFS is extremely fragile and not ready for prime time. I believe it is reckless for Netgear to base a NAS on such an unproven FS. The features are not worth it if they explode in spectacular fashion after a couple of months.
Symptoms include btrfs-transacti and btrfs-endio-wri taking up a lot of CPU time (in spikes, possibly triggered by syncs). You can use filefrag to locate heavily fragmented files (may not work correctly with compression).
...
"a balance on 2TB of data that was heavily snapshotted - it took 3 months"
"when I have to do balances ... I delete all the snapshots and allow a few months for the balance to finish"
https://btrfs.wiki.kernel.org/index.php/Gotchas
We are running version 6.7.4. We currently have 6 x 8 TB in X-RAID (certified drives). I struggle to think what would happen if we filled up all 12 slots...
Are there any other operations anyone from support wants to try before I start wiping it? Unfortunately our 90-day free support expired before any of this happened, so I am left venting in public...
4.8K Views 0 likes 4 Comments
RN214 btrfs corruption forced readonly
Hi
I have an RN 214 that I cannot write to any more. I have seen others have similar issues on the forum. I need suggestions on how to proceed.
The problem started earlier today. I updated the ReadyNAS OS to version 6.8.1 just a few days ago. Not sure if it is related. I have enabled Bit-rot-protection on the share; I read that this may be a bad idea?
Log where the problem started:
Oct 10 06:01:23 nas32-2017 kernel: ------------[ cut here ]------------
Oct 10 06:01:23 nas32-2017 kernel: WARNING: CPU: 1 PID: 32658 at fs/btrfs/disk-io.c:541 btree_csum_one_bio+0x108/0x10c()
Oct 10 06:01:23 nas32-2017 kernel: Modules linked in: vpd(PO)
Oct 10 06:01:23 nas32-2017 kernel: CPU: 1 PID: 32658 Comm: kworker/u8:1 Tainted: P O 4.4.88.alpine.1 #1
Oct 10 06:01:23 nas32-2017 kernel: Hardware name: Annapurna Labs Alpine
Oct 10 06:01:23 nas32-2017 kernel: Workqueue: btrfs-worker btrfs_worker_helper
Oct 10 06:01:23 nas32-2017 kernel: [<c0015980>] (unwind_backtrace) from [<c0012388>] (show_stack+0x10/0x14)
Oct 10 06:01:23 nas32-2017 kernel: ------------[ cut here ]------------
Oct 10 06:01:23 nas32-2017 kernel: WARNING: CPU: 3 PID: 32525 at fs/btrfs/disk-io.c:541 btree_csum_one_bio+0x108/0x10c()
Oct 10 06:01:23 nas32-2017 kernel: Modules linked in: vpd(PO)
Oct 10 06:01:23 nas32-2017 kernel: [<c0012388>] (show_stack) from [<c039bcd0>] (dump_stack+0x94/0xa8)
Oct 10 06:01:23 nas32-2017 kernel: [<c039bcd0>] (dump_stack) from [<c0020074>] (warn_slowpath_common+0x84/0xb4)
Oct 10 06:01:23 nas32-2017 kernel: [<c0020074>] (warn_slowpath_common) from [<c0020140>] (warn_slowpath_null+0x1c/0x24)
Oct 10 06:01:23 nas32-2017 kernel: [<c0020140>] (warn_slowpath_null) from [<c02a82b4>] (btree_csum_one_bio+0x108/0x10c)
Oct 10 06:01:23 nas32-2017 kernel: [<c02a82b4>] (btree_csum_one_bio) from [<c02a7170>] (run_one_async_start+0x34/0x44)
Oct 10 06:01:23 nas32-2017 kernel: [<c02a7170>] (run_one_async_start) from [<c02e90fc>] (normal_work_helper+0x108/0x1f4)
Oct 10 06:01:23 nas32-2017 kernel: [<c02e90fc>] (normal_work_helper) from [<c0035338>] (process_one_work+0x134/0x344)
Oct 10 06:01:23 nas32-2017 kernel: [<c0035338>] (process_one_work) from [<c0035594>] (worker_thread+0x4c/0x4bc)
Oct 10 06:01:23 nas32-2017 kernel: [<c0035594>] (worker_thread) from [<c003a858>] (kthread+0xfc/0x114)
Oct 10 06:01:23 nas32-2017 kernel: [<c003a858>] (kthread) from [<c000f368>] (ret_from_fork+0x14/0x2c)
Oct 10 06:01:23 nas32-2017 kernel: ---[ end trace 5d7482ac13d21d10 ]---
Oct 10 06:01:23 nas32-2017 kernel: CPU: 3 PID: 32525 Comm: kworker/u8:6 Tainted: P O 4.4.88.alpine.1 #1
Oct 10 06:01:23 nas32-2017 kernel: Hardware name: Annapurna Labs Alpine
Oct 10 06:01:23 nas32-2017 kernel: Workqueue: btrfs-worker btrfs_worker_helper
Oct 10 06:01:23 nas32-2017 kernel: [<c0015980>] (unwind_backtrace) from [<c0012388>] (show_stack+0x10/0x14)
Oct 10 06:01:23 nas32-2017 kernel: [<c0012388>] (show_stack) from [<c039bcd0>] (dump_stack+0x94/0xa8)
Oct 10 06:01:23 nas32-2017 kernel: [<c039bcd0>] (dump_stack) from [<c0020074>] (warn_slowpath_common+0x84/0xb4)
Oct 10 06:01:23 nas32-2017 kernel: [<c0020074>] (warn_slowpath_common) from [<c0020140>] (warn_slowpath_null+0x1c/0x24)
Oct 10 06:01:23 nas32-2017 kernel: [<c0020140>] (warn_slowpath_null) from [<c02a82b4>] (btree_csum_one_bio+0x108/0x10c)
Oct 10 06:01:23 nas32-2017 kernel: [<c02a82b4>] (btree_csum_one_bio) from [<c02a7170>] (run_one_async_start+0x34/0x44)
Oct 10 06:01:23 nas32-2017 kernel: [<c02a7170>] (run_one_async_start) from [<c02e90fc>] (normal_work_helper+0x108/0x1f4)
Oct 10 06:01:23 nas32-2017 kernel: [<c02e90fc>] (normal_work_helper) from [<c0035338>] (process_one_work+0x134/0x344)
Oct 10 06:01:23 nas32-2017 kernel: [<c0035338>] (process_one_work) from [<c0035594>] (worker_thread+0x4c/0x4bc)
Oct 10 06:01:23 nas32-2017 kernel: [<c0035594>] (worker_thread) from [<c003a858>] (kthread+0xfc/0x114)
Oct 10 06:01:23 nas32-2017 kernel: [<c003a858>] (kthread) from [<c000f368>] (ret_from_fork+0x14/0x2c)
Oct 10 06:01:23 nas32-2017 kernel: ---[ end trace 5d7482ac13d21d11 ]---
....
Oct 10 06:01:24 nas32-2017 kernel: ------------[ cut here ]------------
Oct 10 06:01:24 nas32-2017 kernel: WARNING: CPU: 1 PID: 32660 at fs/btrfs/disk-io.c:541 btree_csum_one_bio+0x108/0x10c()
Oct 10 06:01:24 nas32-2017 kernel: Modules linked in: vpd(PO)
Oct 10 06:01:24 nas32-2017 kernel: CPU: 1 PID: 32660 Comm: kworker/u8:4 Tainted: P W O 4.4.88.alpine.1 #1
Oct 10 06:01:24 nas32-2017 kernel: Hardware name: Annapurna Labs Alpine
Oct 10 06:01:24 nas32-2017 kernel: Workqueue: btrfs-worker btrfs_worker_helper
Oct 10 06:01:24 nas32-2017 kernel: [<c0015980>] (unwind_backtrace) from [<c0012388>] (show_stack+0x10/0x14)
Oct 10 06:01:24 nas32-2017 kernel: [<c0012388>] (show_stack) from [<c039bcd0>] (dump_stack+0x94/0xa8)
Oct 10 06:01:24 nas32-2017 kernel: [<c039bcd0>] (dump_stack) from [<c0020074>] (warn_slowpath_common+0x84/0xb4)
Oct 10 06:01:24 nas32-2017 kernel: [<c0020074>] (warn_slowpath_common) from [<c0020140>] (warn_slowpath_null+0x1c/0x24)
Oct 10 06:01:24 nas32-2017 kernel: [<c0020140>] (warn_slowpath_null) from [<c02a82b4>] (btree_csum_one_bio+0x108/0x10c)
Oct 10 06:01:24 nas32-2017 kernel: [<c02a82b4>] (btree_csum_one_bio) from [<c02a7170>] (run_one_async_start+0x34/0x44)
Oct 10 06:01:24 nas32-2017 kernel: [<c02a7170>] (run_one_async_start) from [<c02e90fc>] (normal_work_helper+0x108/0x1f4)
Oct 10 06:01:24 nas32-2017 kernel: [<c02e90fc>] (normal_work_helper) from [<c0035338>] (process_one_work+0x134/0x344)
Oct 10 06:01:24 nas32-2017 kernel: [<c0035338>] (process_one_work) from [<c0035594>] (worker_thread+0x4c/0x4bc)
Oct 10 06:01:24 nas32-2017 kernel: [<c0035594>] (worker_thread) from [<c003a858>] (kthread+0xfc/0x114)
Oct 10 06:01:24 nas32-2017 kernel: [<c003a858>] (kthread) from [<c000f368>] (ret_from_fork+0x14/0x2c)
Oct 10 06:01:24 nas32-2017 kernel: ---[ end trace 5d7482ac13d21d36 ]---
Oct 10 06:01:47 nas32-2017 kernel: BTRFS error (device md127): bad tree block start 11065534089998688569 17840787259392
Oct 10 06:01:47 nas32-2017 kernel: BTRFS error (device md127): bad tree block start 11065534089998688569 17840787259392
Oct 10 06:01:47 nas32-2017 kernel: BTRFS warning (device md127): Skipping commit of aborted transaction.
Oct 10 06:01:47 nas32-2017 kernel: BTRFS: error (device md127) in cleanup_transaction:1856: errno=-5 IO failure
Oct 10 06:01:47 nas32-2017 kernel: BTRFS info (device md127): forced readonly
Oct 10 06:01:48 nas32-2017 kernel: BTRFS info (device md127): delayed_refs has NO entry
I have tried rebooting the device and it is still read-only.
$ mount
udev on /dev type devtmpfs (rw,noatime,nodiratime,size=10240k,nr_inodes=187826,mode=755)
devpts on /dev/pts type devpts (rw,noatime,nodiratime,mode=600,ptmxmode=000)
/dev/md0 on / type ext4 (rw,noatime,nodiratime,data=ordered)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime,nodiratime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime,nodiratime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,noatime,nodiratime,size=516372k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,noatime,nodiratime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,noatime,nodiratime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,noatime,nodiratime,cpuset)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,noatime,nodiratime,blkio)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,noatime,nodiratime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,noatime,nodiratime,freezer)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,noatime,nodiratime)
sunrpc on /proc/fs/nfsd type nfsd (rw,noatime,nodiratime)
mqueue on /dev/mqueue type mqueue (rw,noatime,nodiratime)
configfs on /sys/kernel/config type configfs (rw,noatime,nodiratime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,noatime,nodiratime)
/dev/md127 on /data type btrfs (ro,noatime,nodiratime,nodatasum,nospace_cache,clear_cache,subvolid=5,subvol=/)
/dev/md127 on /apps type btrfs (ro,noatime,nodiratime,nodatasum,nospace_cache,clear_cache,subvolid=5,subvol=/.apps)
/dev/md127 on /home type btrfs (ro,noatime,nodiratime,nodatasum,nospace_cache,clear_cache,subvolid=5,subvol=/home)
/dev/md127 on /run/nfs4/data/Data type btrfs (ro,noatime,nodiratime,nodatasum,nospace_cache,clear_cache,subvolid=268,subvol=/Data)
/dev/md127 on /run/nfs4/home type btrfs (ro,noatime,nodiratime,nodatasum,nospace_cache,clear_cache,subvolid=5,subvol=/home)
btrfs filesystem show:
# btrfs filesystem show
Label: '0e8aece8:data' uuid: 1749e135-07f2-4052-a14a-44e5b2d0d3fc
Total devices 1 FS bytes used 16.07TiB
devid 1 size 21.82TiB used 16.10TiB path /dev/md127
btrfs check:
# btrfs check -p /dev/md127
checksum verify failed on 17840787390464 found 7A43DE2D wanted 39312E37
checksum verify failed on 17840787390464 found 7A43DE2D wanted 39312E37
checksum verify failed on 17840787390464 found 2A8F23F6 wanted 38333831
checksum verify failed on 17840787390464 found 7A43DE2D wanted 39312E37
bytenr mismatch, want=17840787390464, have=3544948844530774573
Couldn't setup extent tree
ERROR: cannot open file system
Any suggestions?
Thanks /b
Solved
4.1K Views 0 likes 6 Comments
What btrfs data compression format is used in NAS OS 6?
I was looking at the options for btrfs, and I can't remember Netgear mentioning what type of data compression is used in their boxes. I've heard of both zlib (more compression) and LZO (faster compression) being used in btrfs systems. I'm not as interested in the space-saving end as I am in the transfer speed of otherwise non-compressed data (not precompressed AV media, for example). What does OS 6 use?
3.2K Views 0 likes 3 Comments
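One way to start answering the compression question on a given box, assuming SSH access, is to check whether the btrfs mounts carry a `compress=` or `compress-force=` option. The following is only a minimal Python sketch: `btrfs_compression` is a hypothetical helper, it parses text shaped like the `mount` output pasted in the RN214 thread, and the `compress=lzo` sample line is made up for illustration. It does not claim to show what ReadyNAS OS 6 actually uses. Compression can also be enabled per file or directory (for example with `chattr +c`), which mount options will not show, so an empty result is not conclusive.

```python
import re

# Matches lines such as: /dev/md127 on /data type btrfs (ro,noatime,...)
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type (\S+) \(([^)]*)\)$")


def btrfs_compression(mount_output):
    """Hypothetical helper: map each btrfs mount point to its compression
    algorithm as named in the mount options, or None if no option is set."""
    result = {}
    for line in mount_output.splitlines():
        match = MOUNT_RE.match(line.strip())
        if not match or match.group(3) != "btrfs":
            continue  # skip blank lines and non-btrfs filesystems
        algorithm = None
        for option in match.group(4).split(","):
            key, _, value = option.partition("=")
            if key in ("compress", "compress-force"):
                # A bare "compress" flag names no algorithm explicitly.
                algorithm = value or "default"
        result[match.group(2)] = algorithm
    return result


# Sample input. The /data line is an invented example with compress=lzo;
# the /home line is copied from the mount output quoted on this page.
sample = """\
/dev/md0 on / type ext4 (rw,noatime,nodiratime,data=ordered)
/dev/md127 on /data type btrfs (rw,noatime,nodiratime,compress=lzo,subvolid=5,subvol=/)
/dev/md127 on /home type btrfs (ro,noatime,nodiratime,nodatasum,nospace_cache,clear_cache,subvolid=5,subvol=/home)
"""

print(btrfs_compression(sample))
```

To try it against a real system, the output of `mount` could be pasted into a string and passed to the function in place of `sample`.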