BTRFS scrub speed is insanely slow.
So I have an RN104, with about 10TB of disk space. When I set it to do a BTRFS scrub, the speed is horrific. I'm getting something like 30GB/hour of scrub speed. At the present rate, it would take almost a month of continuous disk grinding to scrub all my data. The BTRFS forums routinely show people having scrub speeds that are slow, but nothing like this. People are advised to scrub roughly monthly, which would never work if the scrub itself takes a whole month.
The thing is, it didn't use to be this bad. I previously scrubbed my data with an earlier version of the firmware (6.4.x? not sure, don't remember), and its performance was fine. It would finish in a day or two, which is totally reasonable. Somehow, recent firmware totally broke this functionality.
Defrag and balance are still about the same, completing in 1-2 days, but scrub has just putrid performance.
Also, I'll point out that I can manually scrub the whole thing, by simply reading every byte of every file using something like "find ./ -exec sha1sum {} \;", run from a recent snapshot. This should be doing (almost) everything the scrub is doing, but it runs at approximately 20-100x the speed, finishing in a day or less.
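For anyone who wants to time the same kind of full read in a repeatable way, here is a minimal Python sketch of the "read and hash every file" approach. The function names `hash_tree` and `timed_hash` are made up for illustration, and the sketch only reads file data through the normal file layer, so it does not check metadata or redundant copies the way a real scrub does.

```python
import hashlib
import os
import time


def hash_tree(root, chunk_size=1 << 20):
    """Read every regular file under root; return (digests, total_bytes)."""
    digests = {}
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, device nodes and other non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            sha1 = hashlib.sha1()
            with open(path, "rb") as fh:
                # Read in fixed-size chunks so large files do not fill memory.
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    sha1.update(chunk)
                    total_bytes += len(chunk)
            digests[path] = sha1.hexdigest()
    return digests, total_bytes


def timed_hash(root):
    """Hash the tree and return (total_bytes, seconds, MB_per_second)."""
    start = time.monotonic()
    _digests, total_bytes = hash_tree(root)
    elapsed = time.monotonic() - start
    rate = total_bytes / 1e6 / elapsed if elapsed > 0 else float("inf")
    return total_bytes, elapsed, rate
```

Calling `timed_hash()` on a snapshot directory gives a MB/s figure for the plain read path that can be set next to the scrub's own rate.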
There is a serious bug here, have others seen it?
Re: BTRFS scrub speed is insanely slow.
Replying to my own message: it's about 4GB/hour, not 30GB/hour. I let it run for roughly 3 days (~75 hours), and it finished a bit over 300 GB total.
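The corrected figure can be checked with a couple of lines of arithmetic. This is only a sketch: the helper names are invented, and the 4000 GB total is an assumption taken from the roughly 4 TB of data mentioned elsewhere in the thread.

```python
def scrub_rate_gb_per_hour(gb_done, hours_elapsed):
    """Average scrub rate from observed progress."""
    return gb_done / hours_elapsed


def projected_days(total_gb, rate_gb_per_hour):
    """Days needed to scrub total_gb at a constant rate."""
    return total_gb / rate_gb_per_hour / 24


rate = scrub_rate_gb_per_hour(300, 75)  # 4.0 GB/hour
days = projected_days(4000, rate)       # about 41.7 days for an assumed 4 TB
```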
Re: BTRFS scrub speed is insanely slow.
The last one I did took about 30 hours - that was on an RN526 with 4x6TB installed. That was using 6.6.1 beta.
Re: BTRFS scrub speed is insanely slow.
That seems like a roughly plausible speed for this to be.
So, given that I have half as much data as you do (so your 30 hours would scale to about 15 for my data), but my scrub is projected to take roughly 2,000 hours, is it really plausible that the CPU in the RN104 is 2,000 / 15 ~ 130x slower than the one in the RN526? Doesn't seem right to me.
There's definitely a bug here.
Re: BTRFS scrub speed is insanely slow.
Well one would need more info such as what services you are running, if you have any apps installed, disk SMART stats, logs zip from your system etc. before rushing to any judgments.
Re: BTRFS scrub speed is insanely slow.
Nothing is running, and there's nothing wrong with the disks. If you have some ideas for logs that would be useful, I'm perfectly willing to post them up here. But all the standard culprits (AV, disk spin down, other things running, etc...) are not a factor. They are either totally off (AV, spindown), or really minimal (1 connected user, approx 0.0 kbps of activity throughout).
Also, this same thing happens if I run btrfs scrub start /data, straight from the commandline. So it's not completely limited to the readyNAS software itself, probably just part of a larger pathology within the Linux kernel. But it wasn't always this bad, it got about 10-100x worse about a year ago, presumably with one of the OS updates.
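To put a number on a scrub started from the command line, one option is to sample `btrfs scrub status -R /data` twice and compute the rate from the difference. The sketch below assumes the raw output contains `data_bytes_scrubbed: N` and `tree_bytes_scrubbed: N` lines, which may differ between btrfs-progs versions; the helper names are hypothetical and the `/data` mount point is taken from the post above.

```python
import re
import subprocess

# Matches lines of the form "<field_name>: <integer>" in the raw status output.
_FIELD = re.compile(r"^[ \t]*(\w+):[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def parse_raw_status(text):
    """Return {field: value}; repeated fields (e.g. per-device) are summed."""
    stats = {}
    for key, value in _FIELD.findall(text):
        stats[key] = stats.get(key, 0) + int(value)
    return stats


def read_scrubbed_bytes(mount_point="/data"):
    """Run 'btrfs scrub status -R' and return data + tree bytes scrubbed."""
    out = subprocess.run(
        ["btrfs", "scrub", "status", "-R", mount_point],
        check=True, capture_output=True, text=True,
    ).stdout
    stats = parse_raw_status(out)
    # Assumed field names; adjust if your btrfs-progs prints different ones.
    return stats["data_bytes_scrubbed"] + stats["tree_bytes_scrubbed"]


def rate_mb_per_s(bytes_before, bytes_after, seconds):
    """Average scrub throughput between two samples, in MB/s."""
    return (bytes_after - bytes_before) / 1e6 / seconds
```

Taking two `read_scrubbed_bytes()` samples a few minutes apart and passing them to `rate_mb_per_s()` gives a figure that can be compared directly with the speeds quoted in this thread.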
Re: BTRFS scrub speed is insanely slow.
Send in the full set of logs anyway (see the Sending Logs link in my sig)
Re: BTRFS scrub speed is insanely slow.
Logs are sent in.
Just a quick recap: a while back I was able to sha1sum the entire contents of the file system in about a day or two (forget exactly how long), and my data hasn't grown much since then (perhaps ~30% or so). So I don't think the issue is anything related to hardware issues (otherwise, why would this manual scan through the data be so much faster) or deep-seated software issues. I'm suspecting a minor performance bug; perhaps the scrub is splitting the IOs into batches that are too small?
It's been through a lot of defrags and balances, typically doing one of each every month, and I've tried to scrub it several times, but always gave up after a few days showed minimal progress. So probably not something as simple as fragmented files.
Anyway, if you do get to the bottom of this, that would be fantastic.
Re: BTRFS scrub speed is insanely slow.
The firmware update history is in initrd.log.
Do you recall when your scrubs started getting slow?
You do have a multi-layer array, so the RAID scrub will do one layer and then move on to the next.
Re: BTRFS scrub speed is insanely slow.
I do not have this issue, but it is slow!
4TB scrubbed in 12 hours.
Re: BTRFS scrub speed is insanely slow.
Scrubbing 4TB in 12 hours is about what I would expect to see. That's about 100 MB/sec, which is reasonably close to the rated performance of a typical hard-drive. I'm getting a much slower speed, by more than an order of magnitude, 4TB in 10-20 days. Something like 2-5 MB/sec, which is way too slow.
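The MB/sec figures above can be reproduced with a short calculation. This is a sketch that assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the helper name is made up for illustration.

```python
def mb_per_second(terabytes, hours):
    """Average throughput in MB/s for scrubbing `terabytes` in `hours`."""
    return terabytes * 1e12 / 1e6 / (hours * 3600)


fast = mb_per_second(4, 12)            # about 92.6 MB/s
slow_10_days = mb_per_second(4, 240)   # about 4.6 MB/s
slow_20_days = mb_per_second(4, 480)   # about 2.3 MB/s
```

On these assumptions the gap between the two cases works out to a factor of 20 to 40.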
Re: BTRFS scrub speed is insanely slow.
How did you configure your device?
What RAID type and how many arrays?
Re: BTRFS scrub speed is insanely slow.
Sorry for the delay. I did some background research.
My drive setup was unchanged from the time I initially set up the readyNAS. Also, right from the start, I put several TB of data on to the readyNAS. It has grown slightly, but I would estimate that at the very beginning this thing had about 2TB of data on it, and now it has more like 4TB, so maybe 2x growth since inception.
2015-07-04: Running firmware 6.2.4
Scrub completed in 24 hours (this is expected performance level, result is good...)
2015-12-20: Upgrade to 6.4.1
ReadyNAS started having kernel panics, which would result in a disk check on startup, then another kernel panic, etc...
Prior to 6.4.1, never had an issue. Opened a support case, was eventually resolved with...
2015-12-29: Upgraded to 6.4.2-T59
This was a beta release that fixed the kernel panic, I think it was actually the RC for 6.4.2 and eventually became 6.4.2
Kernel Panic problem is indeed solved.
2016-01-04: Running version 6.4.2-T59
Started scrub, ran for more than 30 hours and did not complete. Eventually stopped it and rebooted the readyNAS.
Here is what I wrote to the readyNAS customer service at the time...
"Today I needed to restart my readyNAS. It started scrubbing the disk (which is normal, I have scheduled it to happen every few months), and then became very slow and unresponsive. I could still ping it, and get into the admin interface, but the drive itself was effectively hung."
2016-02-07: Running Firmware 6.4.2
Started scrub, ran for more than 2 days, made minimal progress, and I stopped it.
Here is what I wrote to tech support at the time:
"The new firmware 6.4.2 is working much better. The only thing I would note is that the disk scrubbing now seems to be causing extremely poor performance on the unit while it is underway. I have 8 TB of space on this unit, with only 2 of them used, and the scrub finished about 3% in more than 30 hours, indicating that it would take something like a full month to do the whole thing. That doesn't seem right to me. Also, while the scrub is ongoing, the unit becomes nearly unresponsive, repeatedly disconnecting, and taking ages to even bring up the management page."
So I think it's fair to say the problem started sometime between 6.2.4 and 6.4.2. Sorry I couldn't narrow it down any further.
Re: BTRFS scrub speed is insanely slow.
It's using X-RAID, with 4 drives. 3x3TB and 1x2TB.
However, there was a time when this worked (firmware 6.2.4), so I think this is definitely a software issue. I already went through the whole dog and pony show of...
Them: "Turn off the AV"
Me: "I have never used AV in my life"
Them: "Turn off the disk spindown"
Me: "I have never used disk spindown in my life"
....
It's not a hardware or configuration issue, since those didn't change between the time that the scrub worked, and the time that it stopped working. The only thing that changed is the software, going from 6.2.4 to 6.4.2
Re: BTRFS scrub speed is insanely slow.
At this level I feel like a complete jackass, and the best way to solve that issue is to ask the NG nerds!!!
If it is possible to move your data elsewhere and completely rebuild your arrays and config, I think that might help a bit with the issue.
Re: BTRFS scrub speed is insanely slow.
What type of HDDs do you have?
Re: BTRFS scrub speed is insanely slow.
I already sent in all the logs, so you can find that information there.
Short answer, I'm not running anything, have no apps installed, etc... so it doesn't matter.
Try it yourself, get a RN104, put in 3x3TB drives, 1x2TB drive in X-RAID, put 2TB of data on it, run version 6.2.4 and do a scrub. Then, upgrade to 6.4.2 and do another scrub. See if your numbers differ by at least 10x from before and after. I'm betting that they will.
In any case, if it were the case that some minor activity on the readyNAS inflated the scrub time by 40x, that would in and of itself be a bug anyway.
Re: BTRFS scrub speed is insanely slow.
Disk1: WD20EARX (2TB)
Disk2-4: ST3000DM001 (3TB)
However, note that these are unchanged from the time when scrub worked to the time when scrub did not. If the disks were the problem, I would have expected them to also be a problem when running version 6.2.4.
Re: BTRFS scrub speed is insanely slow.
This isn't really possible. I don't have an extra 4-5 TB NAS lying around to use for such things.
I actually doubt that this would help all that much anyway. This problem has persisted through multiple defrags and rebalances. If it was anything that could be fixed by just rearranging data on the disks, it probably would have been fixed by now.
Re: BTRFS scrub speed is insanely slow.
I have a RN316 with 14TB :)!
I do have 3 Raid 1 arrays!
I am sorry I am reaching my limits. Try a wiser guy or NG nerds upstairs! :(!
Re: BTRFS scrub speed is insanely slow.
Quite a collection.
Thanks for your help. I think this is going to need someone from Netgear to debug it. I'm a fairly astute engineer myself, and I'm fairly convinced it's a software problem. Nothing else makes any sense. For instance, I can actually read the 2TB contents of the disks off, and hash all the data, in much less time (~5%) than it takes to scrub. So any sort of hardware issue, or just generally bad performance of the unit, would have shown up there. Similarly, defrags and rebalances work fine, and scrubs did work fine in the past, before a firmware upgrade, but now they don't. If the difference was 2x or so, I'd just write it off, but it's more like 40x, way too much to be explained by anything other than a bug.
Everything points to some sort of horrible performance bug introduced between 6.2.4 and 6.4.2 that badly broke scrubs on the RN104, at least in some configurations. Presumably this would have also gotten the RN204 and RN214, since they aren't that different in terms of hardware. But again, it may need a specific layout in order to trigger...
Re: BTRFS scrub speed is insanely slow.
Good luck with your issue!
Re: BTRFS scrub speed is insanely slow.
The scrub itself is done by btrfs. There's been a lot of changes since 6.4.2, but I'm not seeing anything specific on scrubs. Still, it's possible that 6.6.1 might have better performance.
Did you make any changes to the volume settings (in particular checksums)? Similarly, did you change the bitrot or compression settings for any shares?
If you have ssh enabled, you could also try running top (w/o the scrub running) to see if something has loaded down the system.
Re: BTRFS scrub speed is insanely slow.
I've done all the obvious things.
1) I am running on 6.6.1, it is not any better than 6.4.2 in this regard. Nor has any version since 6.4.2 been any better, I check each one when it comes out.
2) I made no changes.
3) I have checked: top says nothing is running, and when I do the scrub, top says it's using essentially 100% CPU, mostly system time.
4) I've tried starting a scrub manually with "btrfs scrub start /data", and the result is the same as when I start it using the readyNAS tools.
5) I've gone through the various Linux forums, nothing quite like this appears there.
Now, I know that version 6.4.1 had a lot of issues with the hardware RAID controller, and that this was resolved in 6.4.2. I'm suspecting that some part of this resolution (or associated work) made the scrub logic really, really not play nicely with the RAID acceleration hardware that these systems have. This is just a guess, but it's way more likely than anything else I've heard so far.
Re: BTRFS scrub speed is insanely slow.
Try disabling some services, such as SMB, and re-run the scrub and balance to ease the load on the CPU a bit!