Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

joki
Tutor

btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Since I upgraded my RN102 to 6.4.0 last night I frequently have trouble reaching the admin interface, and my laptop has reported not being able to reach Time Machine.

 

When I log in via ssh and run "top", I can see that btrfs-cleaner is stuck at 100% CPU. This can last up to 10 minutes at a time. I installed the update only about 12 hours ago and the btrfs-cleaner process already has over 85 minutes of CPU time. That is more than 20 times as much as the process with the second-highest CPU time, readynasd, which is at 4 minutes.

 

After a while, btrfs-cleaner drops back to 0% CPU and I can access the NAS normally.

 

My question: is it normal for btrfs-cleaner to cause such high load so frequently? Is there any configuration setting that influences this behaviour? I never noticed this problem under the previous version, 6.2.5.

 

Cheers,

Joachim

 

 

In case it matters, my NAS is equipped with 2x 4TB drives in RAID0 with about 40% in use.

 

Message 1 of 32

travmed
Aspirant

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

I'm seeing the same exact problem. The admin interface is worthless and times out when that service is running. I was copying files via the admin interface and they froze for 30 minutes before I cancelled the copy. This all started with the 6.4.0 update.

Message 2 of 32
AlexPe
NETGEAR Moderator

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Hi Joachim,

 

Did you recently delete a very large file or a lot of files? Has there been any delete request like this or similar sent to the volume? 

 

btrfs-cleaner runs when a large file, or a large number of files, is deleted.

 

Alex

 

 

Message 3 of 32
kohdee
NETGEAR Moderator

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Did it remove a bunch of your old snapshots? If you have snapshots and Smart Snapshot Management cleans them up, btrfs-cleaner will be busy wiping away those snapshots. In 6.4.0, we upgraded to the latest version of snapshots (which is why you can't downgrade from 6.4.0), so this could also be related.

Message 4 of 32
Wheeldog
Aspirant

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

This has started happening with my RN316, also after the 6.4.0 upgrade. In my case, the activity seems to be triggered by any attempt to use the Plex server app, although that could be coincidental. Logging in with ssh, top shows btrfs-cleaner maxed out at 99-100%, starving other processes of CPU time. The admin page becomes unresponsive, and attempts to contact the Plex app stall. A few times it has died down by itself after 5 to 10 minutes or so; twice it has gone on for much longer and I've given up and rebooted. It has happened at various times of day: late in the evening, and right now before lunch.

 

Even if the process is going through and cleaning up after snapshots, should it really be able to max out the CPU, and prevent other processes from running?

 

Cheers,

Andrew

Message 5 of 32
UnitService
Tutor

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Hi,

 

I disabled Dropbox backup and now btrfs-cleaner uses 10-15% of CPU. I do not know if there is a relationship between the two, but the NAS is working now.

Message 6 of 32
spotcatbug
Apprentice

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

 

I'm in the same boat. Just want to chime in, lest anybody think this is a rare problem with 6.4.0.

 

Just now, I had a browser window open, trying to connect to the admin pages (all I saw for the past 90 minutes was the splash screen, after authenticating). Simultaneously, I had an ssh terminal window open, running top on the NAS. At the exact moment that the btrfs-cleaner process went to 0% CPU, the admin page finally popped up.

 

Message 7 of 32
joki
Tutor

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Hi Alex,

 

Most of the data traffic on my RN102 is from TimeMachine on a couple of Macs, but the load hasn't changed since before the update.

 

I'm still getting very high load from btrfs-cleaner several times a day, so much so that TimeMachine gives an error message saying the backup disk isn't available. The CPU time after 6 days is now up to a record 1270 minutes for btrfs-cleaner with readynasd at a mere 36 minutes, according to "top".

 

Joachim

 

Message 8 of 32
joki
Tutor

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Is there any way to turn on logging for btrfs-cleaner to find out what it's up to? As it is, I don't know why it seems to be taking longer than before since my usage profile hasn't changed that much since upgrading...
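
One low-tech way to at least get a timeline, sketched here as an assumption rather than anything Netgear has documented: sample the kernel thread's accumulated CPU ticks from /proc at a fixed interval. This does not make btrfs-cleaner log what it is working on; it only records when it is busy. The function names `stat_cpu_ticks` and `log_cleaner_ticks` and the 60-second default are made up for this sketch, and it assumes `pgrep` is available on the NAS.

```shell
#!/bin/sh
# Sum utime + stime (fields 14 and 15 of a /proc/PID/stat line, in clock ticks).
# Reads one stat line on stdin.
stat_cpu_ticks() {
    # The comm field is wrapped in parentheses and may contain spaces, so drop
    # everything up to the last ") " first; utime and stime are then fields 12 and 13.
    sed 's/^.*) //' | awk '{ print $12 + $13 }'
}

# Hypothetical logger: print a timestamp and the tick count for btrfs-cleaner
# every INTERVAL seconds (default 60) until the thread disappears.
log_cleaner_ticks() {
    interval="${1:-60}"
    pid=$(pgrep -x btrfs-cleaner | head -n 1)
    [ -n "$pid" ] || { echo "btrfs-cleaner not found" >&2; return 1; }
    while [ -r "/proc/$pid/stat" ]; do
        echo "$(date '+%F %T') $(stat_cpu_ticks < "/proc/$pid/stat")"
        sleep "$interval"
    done
}
```

Something like `log_cleaner_ticks 60 >> /tmp/cleaner.log &` would then show at which times the tick count jumps. On kernels that expose it, reading /proc/PID/stack as root may additionally hint at what the thread is doing at that moment.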

 

Message 9 of 32
MobRules
Aspirant

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Same problem here, actually the device froze and I had to pull the plug. Nothing done lately that could cause this, just the upgrade, so perhaps listen to the guys in the posts and see what has changed in this area in your 6.4.0 release? 

Message 10 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Same here.

System: ReadyNAS 102. It might be the first reboot after the update.

Uptime 2:08 hrs, load average: 8.45, 9.00, 9.97

Top shows "120:18.59 btrfs-cleaner" - quite a long time for a single process. Hoping it's not "cleaning" all my 2x3TB HDDs ;-)

* Admin page keeps reloading (constantly posting to "ddbroker")

* Copying to the NAS ("cp" over NFS4) does not return (cancelled)

* Directory listings in several directories do not return either (neither over NFS nor locally)

* The syslog quite often shows "readynasd[3359]: thread create fail, errno=12"

I will leave it running and check tomorrow whether btrfs-cleaner has calmed down.
Message 11 of 32
joki
Tutor

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Just wanted to post an update to emphasise that this problem is in fact still occurring. Many times a day my NAS becomes unresponsive due to extremely high CPU usage of btrfs-cleaner.

As shown by top, btrfs-cleaner has been running for 2669 minutes (nearly 2 entire days) in a total uptime of 11 days, while readynasd has only used 75 minutes of CPU time:

top - 17:04:15 up 11 days, 18:01,  1 user,  load average: 0.67, 0.65, 1.02
Tasks: 140 total,   1 running, 139 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  4.6 sy,  0.0 ni, 51.1 id, 42.6 wa,  0.0 hi,  1.0 si,  0.0 st
KiB Mem:    508804 total,   483508 used,    25296 free,     3424 buffers
KiB Swap:   523708 total,       16 used,   523692 free,   369660 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND           
 2583 root      30  10     0    0    0 S   0.0  0.0   2669:52 btrfs-cleaner     
 2905 root      19  -1  186m  29m 8640 S   0.3  6.0  75:54.44 readynasd         
 2624 messageb  20   0  4020 1356  900 S   0.0  0.3  19:35.37 dbus-daemon       
  404 root      20   0     0    0    0 S   0.3  0.0   6:48.03 kswapd0           
    7 root      20   0     0    0    0 S   0.0  0.0   5:51.99 rcu_sched
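
To put these figures in proportion, here is a small sketch that converts the uptime into minutes and expresses a process's CPU time as a share of it. The helper name `cpu_share` is made up for illustration; the inputs are read off the top output above (TIME+ is minutes:seconds, so only the whole minutes are used).

```shell
#!/bin/sh
# Hypothetical helper: percentage of wall-clock uptime a process spent on the CPU.
# $1 = CPU minutes, $2 = uptime days, $3 = uptime hours, $4 = uptime minutes
cpu_share() {
    awk -v cpu="$1" -v d="$2" -v h="$3" -v m="$4" \
        'BEGIN { printf "%.1f\n", 100 * cpu / (d * 1440 + h * 60 + m) }'
}

# Uptime "11 days, 18:01" is 11*1440 + 18*60 + 1 = 16921 minutes.
cpu_share 2669 11 18 1   # btrfs-cleaner: prints 15.8
cpu_share 75 11 18 1     # readynasd:     prints 0.4
```

So btrfs-cleaner accounts for roughly 16% of the total uptime, against well under 1% for readynasd.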

This problem did not exist before upgrading to 6.4.0 from 6.2.5. Since the upgrade, there have been no unusual activities such as deleting large amounts of files or data or removing snapshots.

There has been no response from Netgear so far regarding how to enable logging to find out why btrfs-cleaner is taking so much time or which features to disable in order to work around the problem.

Message 12 of 32
jameswalmsley1
Aspirant

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

I have noticed this, it happens if I manually delete a very large snapshot.

 

James

 

Message 13 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Shortly after the last reboot (the NAS was not even responding to ssh), btrfs-cleaner started again, taking up to 97% of CPU.

Access to the files (via NFS) is not possible any more; even an ls never returns.

It might be related to the deletion of a snapshot (the ReadyNAS logs showed a corresponding message after the reboot), but that should not leave the NAS unusable.

This never happened before firmware 6.4.0, and usage of the NAS has not changed in any way.

 

Unfortunately there seems to be no documentation for btrfs-cleaner and no way to find out what it's really doing ...

 

Hello, Netgear *knock* *knock* *knock* - any comments on this?

 

Regards, HansW

Message 14 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Well - I'm not too sure, but it seems that any file access hangs and "deadlocks" with btrfs-cleaner.

The NAS did not even shut down completely: after shutdown -r it stopped responding to ping very quickly but never switched off, and it no longer responded to the power button.

The only way to reboot was to disconnect it from power ... :(

Message 15 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

It seems I found a way to get rid of the hanging btrfs-cleaner. After switching off quota with

btrfs quota disable /data

btrfs-cleaner never showed up again (in top).

(After booting I usually had about a minute to issue the command before btrfs-cleaner spawned and kept the command from responding.)
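
A slightly more defensive variant of the command above, as a sketch only. It assumes that `btrfs qgroup show` exits with an error when quotas are not enabled, which matches common btrfs-progs behaviour but is not confirmed for the version shipped with 6.4.0. The function name and the `BTRFS` override variable are made up for this sketch; /data is the volume path used above.

```shell
#!/bin/sh
# Hypothetical helper: disable quotas on a volume only if they appear to be enabled.
# Set BTRFS=echo to preview the commands instead of running them.
disable_quota_if_enabled() {
    vol="${1:-/data}"
    btrfs_bin="${BTRFS:-btrfs}"
    # Assumption: "qgroup show" fails when quotas are not enabled on the volume.
    if "$btrfs_bin" qgroup show "$vol" >/dev/null 2>&1; then
        "$btrfs_bin" quota disable "$vol"
    else
        echo "quotas are not enabled on $vol"
    fi
}
```

Running `BTRFS=echo disable_quota_if_enabled /data` first prints what would be executed without touching the volume.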

 

I did not intentionally enable quota - I guess it was enabled by default.

 

Good luck,

   HansW

 

PS: Maybe the problem is known by Netgear - in their "FAQs on recently released firmware 6.4.0" (see link on top of page) they talk about "... this is likely due to the ReadyNAS performing a quota check".

Some websites also describe problems with quota and btrfs (Rockstor forum), other sites cite the quota support in btrfs as experimental (but these might be out of date).

I did not do extensive testing after disabling quota that way - I'm happy that I can access my files again ...

Message 16 of 32
joki
Tutor

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade


@HansWeihnacht wrote:

It seems like I found a way to get rid of the hanging btrfs-cleaner: After switching off quota

btrfs quota disable /data

 

 

Thanks a lot Hans, this sounds like the first sensible, workable solution in this thread. I don't have access to my NAS until November, but this is the first thing I'll try when I do.

 

Joachim

 

Message 17 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

I have no idea what side effects it may bring - I'm not too deep into the details of BTRFS.

What I have noticed so far:

* Read and write access work as before

* Snapshot creation and deletion work as before (according to the ReadyNAS logs)

* btrfs-cleaner has not used noticeable CPU since (logging at 15-minute intervals)

* The ReadyNAS admin page shows only "Snapshots" and "Free" under data usage (no "Data" any more, but I'm not sure whether that's related to switching off quota - it might be, because the data on disk tends to be "old")

 

Message 18 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

@HansWeihnacht You sir, are a legend.

 

For the first time in 2 weeks, something of use to my conundrum...

Found btrfs-cleaner running at 97% CPU usage and disabled the quota as per your post.

The process stopped, and slowly things seem to have normalised - LAN access, web interface etc.

 

I also noticed the data usage as "Free" and "Snapshots" - this worries me, because if quota somehow becomes enabled again and it sees the data volume as snapshots, who knows what it may do?

As it stands I am going to run with this for now, and hope that Netgear releases a patch for all of this mess soon, so that we don't have to dabble around in SSH, blindly trying to fix issues which should not be present in a final firmware release...

 

Thank you again!

Message 19 of 32
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Quotas are used to determine how much space is used by snapshots. The system does not need quotas to know what is a snapshot and what isn't.

 

If you haven't been running volume maintenance I would suggest you run it and once it is complete turn quotas back on.

 

As you have seen some code is not designed to handle btrfs quota being disabled as we don't support doing that.

Message 20 of 32
AlexPe
NETGEAR Moderator

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

A couple of things to consider:

1. If you do not care about share quotas, you can leave them disabled. If you care about the metrics, you can re-enable quotas; qgroups will then begin rescanning and correct the metrics in the interface.

 

2. Please, Please, Please, while you have the resources on your systems run the Scheduled Maintenance. This will benefit the FS in the future.
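
If quotas are turned back on from the command line, the steps might look like the sketch below. `btrfs quota enable` and `btrfs quota rescan -s` (print the rescan status) are standard btrfs-progs subcommands; whether ReadyNAS OS expects this to be done through its own interface instead is not confirmed anywhere in this thread. /data is the volume path used earlier in the thread, and the function name and `BTRFS` override are made up for the sketch.

```shell
#!/bin/sh
# Hypothetical helper: re-enable quotas, then report the qgroup rescan status.
# Set BTRFS=echo to preview the commands instead of running them.
enable_quota_and_check() {
    vol="${1:-/data}"
    btrfs_bin="${BTRFS:-btrfs}"
    "$btrfs_bin" quota enable "$vol" || return 1
    # -s only prints whether a rescan is in progress; it does not start one.
    "$btrfs_bin" quota rescan -s "$vol"
}
```

Given the reports above, it seems prudent to do this at a time when the NAS can afford to be slow, since the rescan itself may load the CPU.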

 

Alex

Message 21 of 32
mdgm-ntgr
NETGEAR Employee Retired

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

An issue has been identified that may explain the slowdown.

 

So I would probably leave quotas disabled for now (if you have disabled them) and then provide feedback if the next beta build (once available) resolves the problem.

If you have disabled quotas please try to avoid adding shares or changing the quota setting on existing shares.

Message 22 of 32
metapaso
Apprentice

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Hi,

 

I have this problem on a 316 with 3x6TB drives. Every once in a while the shares and admin panel are inaccessible, and when I SSH to the machine I can see btrfs-cleaner pegged near 100%. I renice'd the process to a niceness of 10, but it doesn't really seem to help.

 

The thing is, I don't have any quotas enabled at all.  I just have a handful of shares and some of them have the hourly snapshot feature turned on.  This seems to be in relation to cleaning up snapshots?

 

 I'm not adding or deleting many files.  One snapshotted subvolume (share) is about 6TB in total, but changes very little, maybe 30-50GB a week.

 

Damon

Message 23 of 32
StephenB
Guru

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Try 6.4.1 beta, it has a fix for your issue.

Message 24 of 32

Re: btrfs-cleaner frequently stuck at 100% CPU after 6.4.0 upgrade

Well, I'm not too keen on testing beta software on my data ...

I uploaded the img file, rebooted ... and did not look at top :-(

After enabling quota I had a look at top and noticed up to three kworker processes using 60-90% CPU, with wait at about 5 and load between 1.3 and 3.5.

I just wanted to check what happens when I disable quota again - which was not such a good idea.

Since then, the NAS has been unreachable again (no ssh, no admin page).

I'll leave it like that for the moment, hoping it cools down again (not being too optimistic).

PS: Pings are answered - but I imagine that's the NIC on its own.

Message 25 of 32