Forum Discussion
joey123
Jan 29, 2017 (Tutor)
BTRFS scrub speed is insanely slow.
So I have a RN104, with about 10TB of disk space. When I set it to do a BTRFS scrub, the speed is horrific. I'm getting something like 30GB/hour of scrub speed. At the present rate, it would take ...
Leia
Feb 10, 2017 (NETGEAR Employee Retired)
StephenB wrote:
TeknoJnky wrote:
Also, please understand that scrubbing checks through every sector of data every time, while defrag and balance only touch the files/extents that need to be, which is why they can be much faster.
The RAID scrub does that for sure.
I don't think BTRFS scrubs check the free space.
kohdee is correct that "When you schedule/run SCRUB from the UI, it runs both a scrub on the RAID as well as a btrfs scrub", provided the box has data checksums enabled (admin UI: System -> Volumes -> Settings page). If checksums are disabled, it only runs a scrub on the RAID.
A BTRFS scrub only checks the data that has not been checked since the last scrub. This means that if no data has changed since the last scrub finished, the BTRFS scrub completes in just seconds.
A RAID scrub always resyncs the whole data volume again.
They start at the same time and run at the same time.
joey123 enabled the checksum. So if you don't want to replace the WD drive right away, you can instead disable the checksum, run the scrub again, and check the resync speed with the command below. The output here is also from an RN104 running 6.6.1, with 2x 6TB WD Red drives and 2x 3TB Toshiba drives.
root@nas-35-66-48:~# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md126 : active raid1 sdb4[1] sdd4[0]
2930126912 blocks super 1.2 [2/2] [UU]
md127 : active raid5 sda3[0] sdd3[3] sdc3[2] sdb3[1]
8776244352 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
[===>.................] resync = 18.4% (541083168/2925414784) finish=2471.5min speed=16078K/sec
md1 : active raid6 sdd2[3] sdc2[2] sdb2[1] sda2[0]
1047424 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
4190208 blocks super 1.2 [4/4] [UUUU]
unused devices: <none>
But I still recommend that you replace the WD Green drive as soon as possible.
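For anyone who wants to track the resync speed over time rather than eyeballing it, here is a minimal Python sketch that pulls the numbers out of an mdstat progress line like the one above. The function name `parse_mdstat_progress` and the returned field names are made up for illustration, and the sketch assumes that the block counts are in 1 KiB units and the speed is in KiB/sec, which is consistent with the `finish=` value md prints here.

```python
import re

# Matches the progress line md prints during a resync/check, for example:
#   [===>....]  resync = 18.4% (541083168/2925414784) finish=2471.5min speed=16078K/sec
PROGRESS_RE = re.compile(
    r"(?P<action>resync|check|recovery|reshape)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)


def parse_mdstat_progress(line):
    """Return progress details for one mdstat line, or None if it has none."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done = int(match.group("done"))
    total = int(match.group("total"))
    speed = int(match.group("speed"))
    # Assumption: blocks are 1 KiB and speed is KiB/sec, so remaining/speed is seconds.
    if speed > 0:
        estimated_min = (total - done) / speed / 60
    else:
        estimated_min = float("inf")
    return {
        "action": match.group("action"),
        "percent": float(match.group("pct")),
        "done_kib": done,
        "total_kib": total,
        "speed_kib_s": speed,
        "reported_finish_min": float(match.group("finish")),
        "estimated_finish_min": estimated_min,
    }
```

Feeding it each line of `cat /proc/mdstat` every few minutes and logging `speed_kib_s` would show whether the speed really drifts, and the recomputed estimate should land close to the `finish=` value md reports.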
joey123
Feb 10, 2017 (Tutor)
OK, I started these RAID scrubs manually using the usual mdadm tool...
echo check > /sys/block/md0/md/sync_action
Looks like this will take a few days to finish, so I guess I'll just let it run and see if it turns up any issues.
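As a side note for anyone repeating this: the same sysfs directory that accepts the `check` request also exposes the state of the check. The following is a rough Python sketch that reads it back. The helper name `read_md_sync_state` is hypothetical, and the set of attributes read here (`sync_action`, `sync_completed`, `mismatch_cnt`) is my assumption about what is useful; whether every attribute is present may depend on the kernel in the firmware, so missing files are reported as `None`.

```python
from pathlib import Path


def read_md_sync_state(md_name, sys_block="/sys/block"):
    """Read a few md sysfs attributes for one array, e.g. md_name='md0'.

    Returns a dict mapping attribute name to its stripped text content,
    or None for attributes that do not exist on this system.
    """
    md_dir = Path(sys_block) / md_name / "md"
    state = {}
    for attr in ("sync_action", "sync_completed", "mismatch_cnt"):
        path = md_dir / attr
        # Missing attributes are reported as None rather than raising.
        state[attr] = path.read_text().strip() if path.exists() else None
    return state
```

Calling `read_md_sync_state("md0")` as root once a check has finished should show `sync_action` back at `idle`, and a non-zero `mismatch_cnt` would be the thing to look into.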
The two small partitions finished quickly and look fine. Doing the two huge data partitions now. The interesting thing is that a raw btrfs scrub:
btrfs scrub start /data
Is what really takes a long time. The RAID scrub looks like it will finish about as quickly as yours will. I guess if the RAID scrub turns up an issue, we'll know what the problem is. If your theory is correct, only the 4-disk partition should have issues, as the 3-disk partition is striped only across the 3TB NAS disks.
Let's see what happens, I'll report back in about 2 days when this has finished.
root@readyNAS:/data/Documents/readyNAS/archive# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md126 : active raid5 sda3[0] sdd3[3] sdc3[2] sdb3[1]
5845988352 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
[>....................] check = 0.3% (6412848/1948662784) finish=2593.8min speed=12479K/sec
md127 : active raid5 sdc4[0] sda4[2] sdb4[1]
1953245824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md1 : active raid6 sda2[0] sdd2[3] sdc2[2] sdb2[1]
1047424 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
4190208 blocks super 1.2 [4/4] [UUUU]
unused devices: <none>
- Leia, Feb 10, 2017 (NETGEAR Employee Retired)
The resync speed won't be consistent, so please check the speed again in a few hours.
- joey123, Feb 12, 2017 (Tutor)
So the RAID scrub has now completed. It finished sometime yesterday, so it took approximately 2-3 days to run. This is about what I would expect, maybe a little slow, but nothing extreme. It regularly reported speeds between 15-20 MB/sec, which is, again, roughly reasonable.
When I run the BTRFS scrub, I'm getting far worse, about 3 MB/sec. Here's what that looks like.
root@readyNAS:/data/Documents/readyNAS/archive# btrfs scrub status /data/
scrub status for bfb437e8-16ee-444c-b4c8-48cee4f845e1
scrub resumed at Sun Feb 12 03:29:12 2017, running for 07:43:52
total bytes scrubbed: 88.23GiB with 0 errors
So we have an almost 10x difference in speed between the RAID scrub and the BTRFS scrub. This scrub will apparently take something like a month to complete. This really just illustrates the problem. The hardware is clearly capable of scanning the disks at around 15-20 MB/sec; everything I've ever seen indicates this. For instance, when I read files off of the readyNAS, they move at speeds around that level, and when I sha1sum them, that proceeds at around that speed. RAID scrubs proceed at around that speed, as do defrags and rebalances, and everything else. Everything, that is, except a BTRFS scrub, which proceeds 10x slower. And it wasn't always like this. Under software version 6.2.4, BTRFS scrub was about this speed too, roughly 15-20 MB/sec.
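To put a number on the scrub rate without doing the division by hand, a small Python sketch like the one below can turn the `btrfs scrub status` text into MiB/sec. It assumes the older output format shown in this thread (a `running for HH:MM:SS` field and a `total bytes scrubbed:` field); newer btrfs-progs print a different layout, so the regular expressions would need adjusting. The function name `scrub_rate_mib_s` is made up for this example. For a resumed scrub, the result is only meaningful if the reported running time covers the same period as the byte count.

```python
import re

# Conversion factors from the unit suffix btrfs prints to MiB.
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1, "GiB": 1024, "TiB": 1024 * 1024}


def scrub_rate_mib_s(status_text):
    """Average scrub rate in MiB/sec from old-style 'btrfs scrub status' text."""
    elapsed = re.search(r"running for (\d+):(\d+):(\d+)", status_text)
    scrubbed = re.search(
        r"total bytes scrubbed:\s*([\d.]+)\s*(KiB|MiB|GiB|TiB)", status_text
    )
    if elapsed is None or scrubbed is None:
        raise ValueError("unrecognized scrub status format")
    hours, minutes, seconds = (int(part) for part in elapsed.groups())
    total_seconds = hours * 3600 + minutes * 60 + seconds
    if total_seconds == 0:
        raise ValueError("scrub has not been running long enough to measure")
    mib = float(scrubbed.group(1)) * UNIT_TO_MIB[scrubbed.group(2)]
    return mib / total_seconds
```

Passing in the three status lines above gives a bit over 3 MiB/sec, which matches the figure quoted in the post.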
Leia, can you run a BTRFS scrub for a few hours on your unit and see if it is proceeding quickly or slowly? I would love to see if you are hit by the same bug.
Thanks.
- Leia, Feb 13, 2017 (NETGEAR Employee Retired)
Yes, let me reproduce this issue locally.
And could you please answer a few questions?
1. Did you run only a BTRFS scrub? How did you do that? Or did you run the RAID scrub and the BTRFS scrub at the same time?
2. I checked your logs. There was 7.3TB of capacity in all, with 3.9TB used. Was this the same situation as when you last ran a scrub?
3. What kind of files are on your box? Large movie files; documents and music files between 1MB and 20MB; or a large number of small files under 1MB?
Thank you,
-Leia
- joey123, Feb 14, 2017 (Tutor)
1) From the commandline, run "btrfs scrub start /<pathToBtrfsVolume>", and this will start a scrub. In my case, it's "btrfs scrub start /data/", you can check on the status of it with "btrfs scrub status /data/". This will run a BTRFS scrub only, no RAID scrub.
2) This is roughly the same as the situation last time. Previously it was perhaps slightly less used space (~3TB), but as the time difference is approx 10x, I don't think the minor difference in data size is a factor.
3) Roughly 1TB is .sparsebundle files, where each physical underlying file is roughly 8MB in size, but this was always the case, even when the scrub was fast. Other than that, it's largely a handful of very large zip files and disk images, many of them several GB each. So I'd say the files are tending towards larger sizes, probably averaging 10-100 MB in size.
Try to run the btrfs scrub from the command line, and let me know what you see. It would be very interesting to see what sort of performance you get out of it.
- aalexandrebeta, Feb 14, 2017 (Master)
- joey123, Feb 17, 2017 (Tutor)
Hi Leia,
Did you ever get a chance to run this command ("btrfs scrub start /data")? Probably wouldn't take more than a few hours for it to be clear whether or not you were hitting the same issue I am.
-Tyler
- Leia, Feb 17, 2017 (NETGEAR Employee Retired)
I haven't reproduced your issue on my RN104 yet. The speed is 138MB/s. I want to fill my box with more data and try again. Would you mind replacing your WD drive and trying again?
scrub status for 4814067b-f1a9-4dc2-afb2-99879e706058
scrub started at Fri Feb 17 04:05:24 2017, running for 01:01:51
total bytes scrubbed: 500.25GiB with 0 errors
scrub status for 4814067b-f1a9-4dc2-afb2-99879e706058
scrub started at Fri Feb 17 04:05:24 2017, running for 01:01:56
total bytes scrubbed: 500.86GiB with 0 errors
scrub status for 4814067b-f1a9-4dc2-afb2-99879e706058
scrub started at Fri Feb 17 04:05:24 2017, running for 02:22:16
total bytes scrubbed: 1.13TiB with 0 errors
scrub status for 4814067b-f1a9-4dc2-afb2-99879e706058
scrub started at Fri Feb 17 04:05:24 2017, running for 02:22:21
total bytes scrubbed: 1.13TiB with 0 errors
- joey123, Feb 18, 2017 (Tutor)
Those numbers look very much in line with what I would expect, in fact a bit better than I would expect.
I'm ordering a new drive to replace the WD Green, and then we'll see. It will likely take me almost a week to get the new drive in, get the RAID rebuilt, and then do the test, so stay tuned.
- mdgm-ntgr, Feb 18, 2017 (NETGEAR Employee Retired)
Before you replace the disk I would suggest that you make sure that your backup is up to date.
- joey123, Feb 23, 2017 (Tutor)
Definitely wise. I've got everything backed up, pulled the Green today, and replaced it with a helium-filled WD Red 8TB, which should be considerably better than the 3TB Seagates I have alongside it.
The RAID is rebuilding, should take about a day, then I'll redo the scrub and see if it behaves any better. The green is still checking out as healthy, so not sure what I expect. Would be a bit surprised if this solved the problem. Probably still a bug either way. If the green is so bad as to slow things to a crawl, the system should be detecting that something is wrong, and I also would have expected it to affect (for instance) the RAID scrub as well.
Anyway, we'll see soon.
- aalexandrebeta, Feb 23, 2017 (Master)
Glad to hear that the situation is evolving!!
- joey123, Feb 25, 2017 (Tutor)
Disk is swapped, rebuild and reshape are done. Then I ran the scrub, and here's what I've got.
root@readyNAS:~# btrfs scrub status /data
scrub status for bfb437e8-16ee-444c-b4c8-48cee4f845e1
scrub started at Sat Feb 25 01:37:29 2017, running for 07:34:32
total bytes scrubbed: 60.08GiB with 0 errors
That's about 10 GB/hour. To do the whole ~4TB of data would therefore take roughly 400 hours, or about 20 days. This is pretty much the same as before.
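For reference, the estimate above is easy to redo with exact figures. Here is a small Python sketch; the helper name `scrub_eta` is invented for this example, and treating "~4TB" as 4096 GiB is my assumption, not a measured value.

```python
def scrub_eta(scrubbed_gib, elapsed_seconds, total_gib):
    """Return (rate in GiB/hour, total hours, total days) for a full scrub.

    Assumes the scrub keeps running at the average rate observed so far.
    """
    elapsed_hours = elapsed_seconds / 3600
    rate_gib_per_hour = scrubbed_gib / elapsed_hours
    total_hours = total_gib / rate_gib_per_hour
    return rate_gib_per_hour, total_hours, total_hours / 24


# 60.08 GiB scrubbed after 07:34:32; "~4TB" taken as 4096 GiB (an assumption).
elapsed = 7 * 3600 + 34 * 60 + 32
rate, hours, days = scrub_eta(60.08, elapsed, 4096)
print(f"{rate:.1f} GiB/hour, about {hours:.0f} hours ({days:.1f} days)")
```

With these inputs the rate comes out a little under 8 GiB/hour and the full pass at roughly three weeks, so the rounded figures in the post are in the right range.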
This result isn't surprising, since we know it's not a hardware problem, because it was introduced by a software update.
Is there anyone at Netgear who can attempt to fix this bug?
- aalexandrebeta, Feb 25, 2017 (Master)
It's a hell of a mess you are into!!!!
Good luck and patience!!
Cheers!
- joey123, Feb 25, 2017 (Tutor)
Yeah, thanks.
The problem isn't devastating, because the NAS works just fine in every other way. I can even literally read every byte off it, and thus accomplish the scrub, in a reasonable amount of time. It's just the scrub functionality itself that is totally broken.
I guess let's see if Netgear tries to figure out a fix.
- aalexandrebeta, Feb 25, 2017 (Master)
joey123, you are learning the hard way and learning fast!!!!
Faced with this kind of mess, I feel like a complete moron (excuse my French, and I am a Frog, as you can tell from the way I slaughter Shakespeare's language!) :)!!!
- StephenB, Feb 26, 2017 (Guru - Experienced User)
joey123 wrote:
root@readyNAS:~# btrfs scrub status /data
scrub status for bfb437e8-16ee-444c-b4c8-48cee4f845e1
scrub started at Sat Feb 25 01:37:29 2017, running for 07:34:32
total bytes scrubbed: 60.08GiB with 0 errors
That's about 10 GB/hour. To do the whole ~4TB of data would therefore take roughly 400 hours, or about 20 days. This is pretty much the same as before.
FWIW I just did a scrub on an RN52x NAS with 3x8TB disks in RAID-5. The volume has about 8.5 TiB of data on it. It was run from the web ui, so it was the btrfs scrub + raid scrub combination. The disks are all WD80EFZX.
The scrub took 31 hours. I didn't check CPU utilization while it was running, though.