
BTRFS scrub speed is insanely slow.

joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

CPU usage is roughly 0.0% when I'm not running a scrub, and it's about 100.0% when I am. 

 

It's not SMB that's causing these problems. When the thing is scrubbing, smbd is using less than 1% CPU. Essentially all the CPU time is going to the kernel tasks that are doing the actual scrub. 

Message 26 of 77

Re: BTRFS scrub speed is insanely slow.

Try disabling everything else that runs on the NAS, then defrag, balance and scrub.

Message 27 of 77
StephenB
Guru

Re: BTRFS scrub speed is insanely slow.


@joey123 wrote:

 

I'm suspecting that some part of this resolution (or associated work) made the scrub logic really, really not play nicely with the RAID acceleration hardware that these systems have.

  


That suggests that all RN104 users would be seeing this.  That is possible, but would be news to me.

 

Though I suspect you've already gotten past this - is there anything in the smart stats that suggests a growing disk problem?

 

 

Message 28 of 77
joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

 

I checked what I could find here, and it came up clean. Do you know a good way to check the SMART status of the disks? I guess it's possible that I missed something. 

 

Also, regarding other RN104 users seeing this: remember that the unit comes with no scheduled scrubs set up out of the box, and it's an old model that hasn't been made for a while. If the problem also requires (for instance) mismatched disk sizes in X-RAID5, it's plausible that very few people would have noticed. I suspect that the sorts of power users who (for instance) frequent this message board are almost exclusively focused on the higher-end models.

Message 29 of 77
StephenB
Guru

Re: BTRFS scrub speed is insanely slow.

You can check the smart stats by downloading the log zip file.  Disk_info.log is a good place to look, as is smart_history.log.

 

There are other RN104 users who post here.  I have an RN102 that I'm not using right now, but that might behave differently than an RN104 running XRAID with mismatched disks.

Message 30 of 77
joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

Thanks, this is helpful to know. The tables look pretty clean to me...

 

time                model                serial               realloc_sect realloc_evnt spin_retry_cnt ioedc      cmd_timeouts pending_sect uncorrectable_err ata_errors
------------------- -------------------- -------------------- ------------ ------------ -------------- ---------- ------------ ------------ ----------------- ----------
2015-02-21 19:03:53 WDC WD20EARX-32PASB0 WD-WCAZAF733419      -1           -1           -1             -1         -1           -1           -1                0
2015-02-21 19:03:53 ST3000DM001-1ER166   W500F7RK             -1           -1           -1             -1         -1           -1           -1                0
2015-02-21 19:03:53 ST3000DM001-1ER166   W500F8PV             -1           -1           -1             -1         -1           -1           -1                0
2015-02-21 19:03:53 ST3000DM001-1ER166   W500F7XV             -1           -1           -1             -1         -1           -1           -1                0
2015-02-21 19:05:41 WDC WD20EARX-32PASB0 WD-WCAZAF733419      0            0            0              -1         -1           0            0                 0
2015-02-21 19:05:41 ST3000DM001-1ER166   W500F7RK             0            0            0              0          0            0            0                 0
2015-02-21 19:05:41 ST3000DM001-1ER166   W500F8PV             0            0            0              0          0            0            0                 0
2015-02-21 19:05:41 ST3000DM001-1ER166   W500F7XV             0            0            0              0          0            0            0                 0

 

And disk_info.log doesn't look like it has anything worrying. The first HD (the 2TB one) is a bit older than the others, and it spent a lot of its life in a computer where it spun up and down a lot (It's also a WD green, known for this sort of thing), but it doesn't look like it is reporting any issues. I think the disks are no better or worse off now than they were a year or so ago when I first started having these issues. 

 

Device: sdd
Controller: 0
Channel: 0
Model: WDC WD20EARX-32PASB0
Serial: WD-WCAZAF733419
Firmware: 51.0AB51
Class: SATA
Sectors: 3907029168
Pool: data
PoolType: RAID 5
PoolState: 1
PoolHostId: 2fe4ed8e
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 43
Start/Stop Count: 264
Power-On Hours: 31597
Power Cycle Count: 76
Load Cycle Count: 1470451

Device: sdc
Controller: 0
Channel: 1
Model: ST3000DM001-1ER166
Serial: W500F7RK
Firmware: CC25
Class: SATA
RPM: 7200
Sectors: 5860533168
Pool: data
PoolType: RAID 5
PoolState: 1
PoolHostId: 2fe4ed8e
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
End-to-End Errors: 0
Command Timeouts: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 49
Start/Stop Count: 44
Power-On Hours: 15656
Power Cycle Count: 43
Load Cycle Count: 85

Device: sdb
Controller: 0
Channel: 2
Model: ST3000DM001-1ER166
Serial: W500F8PV
Firmware: CC25
Class: SATA
RPM: 7200
Sectors: 5860533168
Pool: data
PoolType: RAID 5
PoolState: 1
PoolHostId: 2fe4ed8e
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
End-to-End Errors: 0
Command Timeouts: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 50
Start/Stop Count: 42
Power-On Hours: 15657
Power Cycle Count: 42
Load Cycle Count: 82

Device: sda
Controller: 0
Channel: 3
Model: ST3000DM001-1ER166
Serial: W500F7XV
Firmware: CC25
Class: SATA
RPM: 7200
Sectors: 5860533168
Pool: data
PoolType: RAID 5
PoolState: 1
PoolHostId: 2fe4ed8e
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
End-to-End Errors: 0
Command Timeouts: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 46
Start/Stop Count: 42
Power-On Hours: 15657
Power Cycle Count: 42
Load Cycle Count: 82
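
For anyone who would rather not eyeball these blocks, the check can be scripted. The following is only a rough sketch, assuming disk_info.log always uses the "Key: Value" layout pasted above. The function names and the list of counters treated as error indicators are my own choices for illustration, not anything defined by the ReadyNAS firmware.

```python
# Excerpt copied from the disk_info.log dump above (first disk only).
SAMPLE = """\
Device: sdd
Model: WDC WD20EARX-32PASB0
Serial: WD-WCAZAF733419
Health data
ATA Error Count: 0
Reallocated Sectors: 0
Reallocation Events: 0
Spin Retry Count: 0
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 43
Power-On Hours: 31597
Load Cycle Count: 1470451
"""

# Counters that should normally stay at zero on a healthy disk
# (an illustrative selection, assumed for this sketch).
ERROR_COUNTERS = (
    "ATA Error Count",
    "Reallocated Sectors",
    "Reallocation Events",
    "Spin Retry Count",
    "End-to-End Errors",
    "Command Timeouts",
    "Current Pending Sector Count",
    "Uncorrectable Sector Count",
)


def parse_disk_info(text):
    """Split the log text into one dict per 'Device:' block."""
    disks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # blank lines and the bare "Health data" heading
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Device":
            current = {}
            disks.append(current)
        if current is not None:
            current[key] = value
    return disks


def nonzero_counters(disk):
    """Return the error counters of one disk that are present and not zero."""
    found = {}
    for name in ERROR_COUNTERS:
        value = disk.get(name)
        if value is not None and value.lstrip("-").isdigit() and int(value) != 0:
            found[name] = int(value)
    return found


for disk in parse_disk_info(SAMPLE):
    print(disk["Device"], disk["Model"], nonzero_counters(disk) or "no error counters set")
```

Load Cycle Count is deliberately left out of the error list, since it is a wear counter rather than an error counter.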

 

Message 31 of 77
StephenB
Guru

Re: BTRFS scrub speed is insanely slow.

The disk health looks fine to me too.

 

FWIW, the 3 TB Seagate DM drives are known to have high failure rates with RAID.  But there's no evidence of problems with your particular drives.

 

When the time comes to replace them (and the WD green), I recommend using NAS-purposed drives - WDC Red or Seagate Ironwolf models.  I use Reds myself.

 

So it's not the disk health, and it's not something loading down the CPU (other than the scrub itself).  Have you tried measuring NAS throughput when scrubs aren't running?  For instance using NAStester on a PC?  http://www.808.dk/?code-csharp-nas-performance

Message 32 of 77
joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

Yes, I have. In particular, to take the network out of the picture, I've tried to just sha1sum the files on the disk. This is a worst case, since the sha1 itself should take quite a lot of CPU. It runs much faster than the scrubs, around 1.5 GB/minute. At that rate, I would go through the full 4TB in about 1.5-2 days, which is just what I see if I run an sha1sum on every file using find. This is what I would expect to see from the scrub, or better. So the NAS has no trouble reading all the data off these disks in some reasonable amount of time. 
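
For reference, the same kind of read-and-hash measurement can be sketched in Python. This is not the exact command used above (that was sha1sum driven by find); it is a minimal stand-in that assumes Python is available on the box, and the helper names are made up for this example.

```python
import hashlib
import os
import time


def sha1_file(path, chunk_size=1 << 20):
    """Return (hex digest, bytes read) for one file, reading it in chunks."""
    digest = hashlib.sha1()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            total += len(chunk)
    return digest.hexdigest(), total


def hash_tree(root):
    """Hash every regular file under root; return (bytes read, seconds elapsed)."""
    start = time.monotonic()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                _digest, size = sha1_file(path)
                total += size
    return total, time.monotonic() - start


def gb_per_minute(total_bytes, seconds):
    """Convert a byte count and a duration into decimal GB per minute."""
    if seconds <= 0:
        return float("inf")
    return (total_bytes / 1e9) / (seconds / 60)
```

Something like `total, secs = hash_tree("/data")` followed by `gb_per_minute(total, secs)` would give a figure comparable to the 1.5 GB/minute quoted above. The "/data" path is an assumption based on the pool name in the logs; substitute wherever the volume is actually mounted.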

 

This is totally a software bug. 

 

a) Same hardware

b) Only the OS version changed

c) Nothing running

d) No hardware problems

e) Hardware has no problem reading and even sha1 hashing the data in a reasonable amount of time

f) defrags and rebalances work in a reasonable amount of time (~1 day)

g) Scrubs are horrifically slow (~20+ days), at least 10x slower than anything else that runs on this thing. 

 

There's really nothing else it could be. 
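
The arithmetic behind those figures, written out as a small sketch. It takes "4TB" as 4 decimal terabytes and "20+ days" as exactly 20 days, both of which are simplifying assumptions; the function names are only for illustration.

```python
SECONDS_PER_DAY = 86_400


def days_for_full_pass(data_bytes, bytes_per_second):
    """Days needed to read data_bytes once at a steady rate."""
    return data_bytes / bytes_per_second / SECONDS_PER_DAY


def implied_rate(data_bytes, days):
    """Average bytes/second implied by finishing data_bytes in the given days."""
    return data_bytes / (days * SECONDS_PER_DAY)


data_bytes = 4e12              # "the full 4TB", assumed to be decimal TB
sha1_rate = 1.5e9 / 60         # 1.5 GB/minute expressed in bytes/second
scrub_days = 20                # "~20+ days", lower bound

sha1_days = days_for_full_pass(data_bytes, sha1_rate)
scrub_rate = implied_rate(data_bytes, scrub_days)
slowdown = sha1_rate / scrub_rate

print(f"sha1sum pass: {sha1_days:.2f} days at {sha1_rate / 1e6:.0f} MB/s")
print(f"scrub: {scrub_rate / 1e6:.2f} MB/s implied over {scrub_days} days")
print(f"scrub is about {slowdown:.1f}x slower than the sha1sum pass")
```

That works out to roughly 1.85 days for the sha1sum pass against an implied scrub rate of about 2.3 MB/s, or a bit under 11x slower, which is consistent with the "1.5-2 days" and "at least 10x" figures above.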

Message 33 of 77
StephenB
Guru

Re: BTRFS scrub speed is insanely slow.

It would have to be a bug/performance bottleneck in BTRFS itself.

 

BTRFS checksums use CRC32c, and they are block-based, not file-based (done on 4K blocks). CRC32c should be significantly faster than SHA-1 (perhaps 60% fewer cycles).  Since checksum verification is always done on reads, your sha1sum test is actually computing both the BTRFS checksums and the SHA-1 hashes.
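
To make "block-based" concrete, here is a small pure-Python sketch of CRC32c computed per 4K block. It only illustrates the idea; it is not BTRFS code, it says nothing about how BTRFS stores or seeds its checksums on disk, and a table-driven Python loop is far slower than the kernel implementation. The function names are hypothetical.

```python
BLOCK_SIZE = 4096
_POLY = 0x82F63B78  # reflected CRC-32C (Castagnoli) polynomial


def _make_table():
    """Build the 256-entry lookup table for byte-at-a-time CRC-32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data):
    """CRC-32C of a bytes-like object (init and final XOR of 0xFFFFFFFF)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(data, block_size=BLOCK_SIZE):
    """One checksum per block; the last block may be shorter."""
    return [crc32c(data[i:i + block_size]) for i in range(0, len(data), block_size)]


def mismatched_blocks(data, expected, block_size=BLOCK_SIZE):
    """Indexes of blocks whose checksum no longer matches the stored value."""
    actual = block_checksums(data, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

Because each block has its own checksum, a single flipped byte shows up as exactly one mismatched block, whereas a whole-file hash such as SHA-1 can only say that the file as a whole changed.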

 

FWIW, my last scrub took about 31 hours on my RN526.  Disk configuration is 4x6TB RAID-5, with about 8.5 TiB of data (including snapshots).  Roughly 75 MB/s.

 

Message 34 of 77

Re: BTRFS scrub speed is insanely slow.

For a 4.2 TB RAID 1 volume it took me around 15 hours!

On my RN316.

Message 35 of 77
StephenB
Guru

Re: BTRFS scrub speed is insanely slow.


@aalexandrebeta wrote:

For a 4.2 TB RAID 1 volume it took me around 15 hours!

On my RN316.


That sounds about right actually.

Message 36 of 77

Re: BTRFS scrub speed is insanely slow.

Compared to @StephenB and @mdgm-ntgr, I feel like a complete jackass!!!

Apart from commenting on the performance and other cosmetic stuff, I'll leave @joey123 with the big boys 🙂 🙂 !

Message 37 of 77
kohdee
NETGEAR Expert

Re: BTRFS scrub speed is insanely slow.

When you schedule/run SCRUB from the UI, it runs both a scrub on the RAID as well as a btrfs scrub. 2x the load. 

RN100 series devices are single-core, low-powered CPU devices. They take quite some time to accomplish tasks compared to other units.

Message 38 of 77

Re: BTRFS scrub speed is insanely slow.

Thank you for educating us!! I appreciate @kohdee's remarks. Could you explain how the scrub process runs?

Message 39 of 77
joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

All true, but this doesn't come close to explaining the phenomenon. 

 

a) It worked well in the past (ran in 1-2 days)

b) I can read and verify all the data on the disk using commandline tools on the NAS itself in 1-2 days. 

 

So the issue isn't related to CPU speed, as that hasn't changed, and in any case a slow CPU would slow down other operations on the NAS as well. Also, running two scrubs at once can explain maybe a 2x difference in speed. If the difference were only 2x, I would never have even mentioned it, but the difference is more like 20x, and it appeared once an updated version of the OS was installed.

 

It's definitely a software bug that needs to be fixed. 

 

Message 40 of 77
Leia
NETGEAR Employee Retired

Re: BTRFS scrub speed is insanely slow.

Hi joey123,

 

Your WD disk's Load Cycle Count has reached 1,462,660, while Western Digital rates its Green drives for 300,000 cycles. Please check this page: S.M.A.R.T. So please replace this disk. WD Green drives are designed to unload their heads often to conserve power, so the LCC increases very quickly when they are used for storage. Because of this, we advise against using WD Green drives in our NAS.

 

I also saw that you enabled the data checksum option (on the Volumes page). We disable this option by default on all ARM-based NAS boxes because it affects performance.

 

Thanks,

-Leia

Message 41 of 77

Re: BTRFS scrub speed is insanely slow.

"1,462,660": isn't that a glitch in SMART?

Message 42 of 77

Re: BTRFS scrub speed is insanely slow.

That would mean a restart every minute and a half over 526 days of running! Doing some maths!

To me it looks like a glitch or an error!

Message 43 of 77
joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

The number is right. The greens park after 8 seconds of inactivity, so this is about what is expected. 

 

However, it's still not the cause of the issues. I agree with the parent that eventually I need to get rid of that drive and replace it with another. However, for now, it's not reporting any errors, and when I read data off it, it's perfectly fast. It's only the scrub that's slow. 

 

Message 44 of 77

Re: BTRFS scrub speed is insanely slow.

By the way, you can ease up on the green feature and allow more time before the HDD idles!

Message 45 of 77
joey123
Tutor

Re: BTRFS scrub speed is insanely slow.

 

Yes, but it requires reflashing the firmware on the drive. I'll probably actually do that this weekend, but it's not a minor task as it requires yanking the thing out of the NAS and connecting it to another computer.

Message 46 of 77

Re: BTRFS scrub speed is insanely slow.

No, just tweak the setting on your device, not the HDD!

Message 47 of 77
StephenB
Guru

Re: BTRFS scrub speed is insanely slow.


@aalexandrebeta wrote:

"1,462,660": isn't that a glitch in SMART?


The number isn't out of the ballpark.  The heads park themselves after 8 seconds when not in use.  He's seeing about 1 head-park every 75 seconds, which is high but not impossible.
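
The estimate is easy to reproduce from the Power-On Hours and Load Cycle Count that disk_info.log reported for the WD drive earlier in the thread. A minimal sketch follows; the function name is made up for illustration, and the 300,000 figure is the rating quoted above.

```python
def seconds_per_load_cycle(power_on_hours, load_cycle_count):
    """Average time between head parks over the drive's powered-on life."""
    return power_on_hours * 3600 / load_cycle_count


POWER_ON_HOURS = 31597       # from disk_info.log, device sdd
LOAD_CYCLE_COUNT = 1470451   # from disk_info.log, device sdd
RATED_CYCLES = 300_000       # WD Green rating quoted earlier in the thread

interval = seconds_per_load_cycle(POWER_ON_HOURS, LOAD_CYCLE_COUNT)
ratio = LOAD_CYCLE_COUNT / RATED_CYCLES

print(f"one head park every {interval:.0f} seconds on average")
print(f"load cycle count is {ratio:.1f}x the rated figure")
```

With those inputs the average comes out at about 77 seconds between parks, and the count is nearly five times the rated figure.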

 

I've been using WD20EARS for years in my NV+, they have counts around 800K (over 50K hours of service).  I have healthy disks in a PC which do have 1.5M or more.  So although @Leia is right on the 300K spec, in my experience there's no need to replace a disk if the only concern is the load cycle count.

 

While you can change the threshold with wdidle3, it's a bit late to worry about that now.  I'd just let it run to failure, and then replace it with a NAS-purposed drive.

 

 

 

Message 48 of 77
TeknoJnky
Hero

Re: BTRFS scrub speed is insanely slow.

In my experience, whenever a NAS starts running slow or doing odd things, it's the precursor to a failed drive.

 

If I were you, I would definitely acquire a replacement drive as soon as possible.

 

Even if it doesn't fail right away, having a spare drive on hand will minimize the amount of time your device spends non-redundant.

 

Also, please understand that scrubbing checks through every sector of data every time, while defrag and balance only touch the files/extents that need to be, which is why they can be much faster.

 

Message 49 of 77

Re: BTRFS scrub speed is insanely slow.

@TeknoJnky  I agree totally!

Message 50 of 77