× NETGEAR will be terminating ReadyCLOUD service by July 1st, 2023. For more details click here.
Orbi WiFi 7 RBE973
Reply

Re: btrfs locking at 74% on 6.1.6

LogicX
Aspirant

btrfs locking at 74% on 6.1.6

I'm running 6.1.6 final - I STARTED with a fresh install and rebuild on a 6.1.6RC
I had all my data offloaded; I started copying it back.
I've since updated to 6.1.6 final.
I am now copying it back.
As I got into the 70% range, I started to have performance problems.

I would see some btrfs processes spiking CPU; and top reports most CPU usage going to 'wait' times.
Load increases to 30+ and the system becomes unresponsive. Data copies to the system crawl to a halt.
I stop transfers and the system eventually recovers (sometimes I have to reboot it).
So long as I don't add more data; it works great -- but all efforts to add additional data seems to create problems.

I'm quite sure my btrfs block size should be the new; larger size -- but is there a command I can run to confirm?
If I'm running the larger block size; it would appear there's still a major problem here!

Open to suggestions of what to do!
I'd really like to use the last 3TB of my NAS!
/dev/md127 11T 8.1T 2.9T 74% /data
Message 1 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs locking at 74% on 6.1.6

Download the logs.

In btrfs.log what do you see for the nodesize and the leafsize for /dev/md127 ?

Can you post the contents of your initrd.log ?
Message 2 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

nodesize 16384
leafsize 16384

Full content of initird.log is 3 lines:
[2014/02/02 22:37:04] Factory default initiated by Frontview!
[2014/02/02 22:37:35] Defaulting to X-RAID2 mode, RAID level 5
[2014/02/02 22:37:58] Factory default initiated on ReadyNASOS 6.1.6.
Message 3 of 18
Piglet
Luminary

Re: btrfs locking at 74% on 6.1.6

Interesting. I did a factory reset on my 104 with 6.1.6 the other day and my nodesize and leafsize are both 32768 ...
Message 4 of 18
vandermerwe
Master

Re: btrfs locking at 74% on 6.1.6

LogicX
Are you running 6.1.6 on an Ultra (this is in your profile) ?
Message 5 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

True. Don't hate me for running 6.1.6 on a RNDU6000 :oops:

vandermerwe wrote:
LogicX
Are you running 6.1.6 on an Ultra (this is in your profile) ?
Message 6 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs locking at 74% on 6.1.6

Are you sure that's the leafsize for /dev/md127 not /dev/md0. Can you PM the full contents of btrfs.log

In the logs what does os_version.log contain?

From your initrd.log it looks like you are still running a RC?

What services are you using? Are you using iSCSI?

Do you have any apps installed?
Message 7 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

My apologies! Yes, I quoted /dev/md0 by mistake.

/dev/md127 says:
nodesize 32768
leafsize 32768

os_version.log: ReadyNASOS!!version=6.1.6,time=1390873255,arch=x86_64

I'm quite sure I upgraded to 6.1.6 final... but open to trying again if that does not appear to be the case.

no iSCSI

I have plex app installed

I use SMB, FTP, AFP, DLNA

I've managed to get to 80% usage; but my CPU usage spiked so high and never returned for hours -- I couldn't execute any commands and had to hard restart.

/dev/md127 11T 8.7T 2.3T 80% /data

mdgm wrote:
Are you sure that's the leafsize for /dev/md127 not /dev/md0. Can you PM the full contents of btrfs.log

In the logs what does os_version.log contain?

From your initrd.log it looks like you are still running a RC?

What services are you using? Are you using iSCSI?

Do you have any apps installed?
Message 8 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs locking at 74% on 6.1.6

You are running RC13.

If you were running the final release os_version.log would look like this:

ReadyNASOS!!version=6.1.6,time=1391111179,arch=x86_64


When you run top do any of your services in particular (other than btrfs ones) stand out as using lots of CPU?
Message 9 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

I can upgrade to final (but I'll wait to hear from you that I should since we're in the middle of troubleshooting)

Plex is often running and scanning -- but nothing too crazy.

I was just able to get it to almost lock up with a huge load due to lftp mirroring a directory to the server --
transfer started at 60MB/sec -- after a few seconds it slowed to kbs (halted, really) -- and I was SSH'd in -- wouldn't even execute my 'w' command to see the load. 😕

After about 10 minutes the load finally dropped below 20 and it executed
I ran top to see a process md127_raid5 using a lot of CPU

Repeating the transfer with top running shows the following competing for CPU:
proftpd
md127_raid5
flush-btrfs-2
kswapd0

Load is holding around 5

Transfer going 30-35MB/sec

And now after a few minutes it locked up again.
Here's what top looks like: https://www.dropbox.com/s/08bus9zeiss1lds/Screenshot%202014-02-17%2001.30.32.png
Note the 99.5% wait
Then it finally refreshed 1 time a few minutes later:
https://www.dropbox.com/s/30r4ubjlxp9abzf/Screenshot%202014-02-17%2001.32.14.png

Many more minutes later as it finally recovered, it looked like:
https://www.dropbox.com/s/k9u5xsgy65rr637/Screenshot%202014-02-17%2001.35.14.png
(note the drastically reduced wait time, sky-high load which is dropping)
Message 10 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs locking at 74% on 6.1.6

I would suggest updating to the final release. Shouldn't be many changes but better to troubleshoot on that than an old beta.
Message 11 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

Well that's embarrassing.... I updated again to 6.1.6 final, and then when I went to look at the logs, realized I'd been reading out of an old download of the logs before.

I definitely was already on the latest:
ReadyNASOS!!version=6.1.6,time=1391111179,arch=x86_64

[2014/02/02 22:37:04] Factory default initiated by Frontview!
[2014/02/02 22:37:35] Defaulting to X-RAID2 mode, RAID level 5
[2014/02/02 22:37:58] Factory default initiated on ReadyNASOS 6.1.6.
[2014/02/09 00:03:34] Updated from ReadyNASOS 6.1.6 to 6.1.6.
[2014/02/17 00:23:18] Updated from ReadyNASOS 6.1.6 to 6.1.6.

Was able to immediately get it to do it again:
https://www.dropbox.com/s/ztvaeatfmrhngvk/Screenshot%202014-02-17%2002.38.10.png
Message 12 of 18
JMehring
Apprentice

Re: btrfs locking at 74% on 6.1.6

I have been noticing similiar behavious myself where my load will spike up to 30 on 6.1.6, but since I never did a factory reset yet I have just begun the process of debugging what is wrong although I think my problem is related to installed apps though. I am impressed that my NAS is not crashing though.

Have you tried turning off all addons while copying?

One thing I noticed in regards to your top imagaes you provided is that Plex Media Server is running when you have these problems. I have also noticed my Plex Media Server seems to be high in the top list when I get these spikes. When I turn off addons like plexmediaserver and nzbget my load immediately goes down.

I switched to nzbget since I thought sabnzb was taking too much processing time when idle but it may also be causing problems. I also switched from using deluged to utorrent and utorrent is way better on my system load. I also created custom init.d scripts for alot of my apps that set ionice, nice and io scheduler class. For me I have narrowed down my misbehaving apps to either plexmediaserver, nzbget or bitcasa.
Message 13 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs locking at 74% on 6.1.6

What version of Plex Media Server do you have installed?
Message 14 of 18
JMehring
Apprentice

Re: btrfs locking at 74% on 6.1.6

I have version 0.9.9.2.374-aa23a69.

I think my problem is with nzbget now though. 98-99% of the time my load is fine. It just seems to be shooting up to a load of 20-30 every once in a while. Last time I just shut off nzbget and all went back to normal.

I wish there was a was to run iotop on this kernel. Maybe I will have to compile a version of the kernel myself 🙂


# iotop
Could not run iotop as some of the requirements are not met:
- Linux >= 2.6.20 with
- I/O accounting support (CONFIG_TASKSTATS, CONFIG_TASK_DELAY_ACCT, CONFIG_TASK_IO_ACCOUNTING)
Message 15 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

I'm running plex version="0.9.7.28.31-d1f35c1"

Interesting -- I shut down plex and have been transferring data for 20 minutes or so without a lockup.

Makes no sense to me -- especially as I'm not even transferring into a folder that plex monitors.

I'll continue transferring for a while, and then try turning plex back on and see if it locks up right away.
Message 16 of 18
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs locking at 74% on 6.1.6

LogicX your Plex version is quite old. Might want to try a newer version.
Message 17 of 18
LogicX
Aspirant

Re: btrfs locking at 74% on 6.1.6

Working on upgrading to a new version -- does look like mine is 9 months old or so.

I noticed the following in dmesg output:

INFO: task btrfs-transacti:2021 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D 0000000100098c98 0 2021 2 0x00000000
ffff88001ba9dd20 0000000000000046 ffff880000000000 ffff88001ba9dfd8
ffff88001ba9dfd8 0000000000004000 ffff88003d46d910 ffff880039112990
0000000100000000 0000000000000001 00000000ffffffff ffff88003d402300
Call Trace:
[<ffffffff880a3e35>] ? __free_pages+0x25/0x30
[<ffffffff880d0052>] ? __free_slab+0xc2/0x190
[<ffffffff880ce984>] ? add_partial+0x24/0x80
[<ffffffff88879c6a>] schedule+0x3a/0x50
[<ffffffff8887a0b9>] schedule_timeout+0x1a9/0x210
[<ffffffff88063929>] ? prepare_to_wait+0x59/0x90
[<ffffffff882ddabb>] btrfs_commit_transaction+0x20b/0xa90
[<ffffffff88063b30>] ? abort_exclusive_wait+0xb0/0xb0
[<ffffffff882de9a5>] ? start_transaction+0x95/0x3f0
[<ffffffff882d5afd>] transaction_kthread+0x23d/0x260
[<ffffffff882d58c0>] ? btrfs_alloc_root+0x40/0x40
[<ffffffff88063339>] kthread+0x89/0x90
[<ffffffff8887cf34>] kernel_thread_helper+0x4/0x10
[<ffffffff880632b0>] ? kthread_worker_fn+0x140/0x140
[<ffffffff8887cf30>] ? gs_change+0xb/0xb

and I have a number of them:

root@Cabana:/var/log# dmesg |grep hung
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
root@Cabana:/var/log#
Message 18 of 18
Top Contributors
Discussion stats
  • 17 replies
  • 2350 views
  • 0 kudos
  • 5 in conversation
Announcements