Forum Discussion
btaroli
Jan 15, 2017 · Prodigy
6.6.1 Scrub Hammers CPU
And when I say hammers, I mean it runs with -1 prio and causes 6-8 kernel worker threads each vying to consume 100% of CPU. So rabid is this consumption that all other background processes, including...
Skywalker
Jul 12, 2017 · NETGEAR Expert
It's worth noting that the resources consumed by a btrfs scrub can vary widely depending on the files it's scrubbing at the time. Very fragmented files will result in a lot more I/O wait time and more CPU consumption. Also, a ReadyNAS volume scrub includes an MD RAID scrub, which adds to the I/O load. MD is generally pretty good at yielding to other I/O consumers, but it still uses a minimum of 30MB/sec for that, plus the CPU usage for parity calculation.
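For reference, that MD floor is just the standard md sysctl; a minimal sketch for checking or temporarily lowering it (the 30MB/sec figure presumably corresponds to speed_limit_min as shipped, and the value below is only illustrative; it resets on reboot unless made persistent):

cat /proc/sys/dev/raid/speed_limit_min            # guaranteed resync/scrub rate, KB/s per device
cat /proc/sys/dev/raid/speed_limit_max            # ceiling for the same, KB/s per device
echo 10000 > /proc/sys/dev/raid/speed_limit_min   # e.g. drop the floor to ~10MB/s while other I/O is contending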
Michael_Oz
Jul 13, 2017 · Luminary
Still on 6.7.4.
Started scrub.
Three threads as described above: the main process and two threads (shown with the H command in htop).
 PPID   PID USER    IO  IORR IOWR IO PRI NI  VIRT RES SHR S CPU% MEM%   TIME+ Command
    1 29003 root 41462 41462    0 B3  19 -1 32180 200  16 S  5.6  0.0 0:26.28 `- btrfs scrub start /N316AR6
    1 29005 root     0     0    0 B3  19 -1 32180 200  16 S  0.0  0.0 0:00.03    | `- btrfs
    1 29004 root 41462 41462    0 id  19 -1 32180 200  16 D  5.6  0.0 0:26.26    | `- btrfs
I read that as the IO/IORR/IOWR and CPU usage being accounted against the main process as well, rather than both of them actually doing I/O and CPU work.
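One way to check that reading is the per-thread I/O accounting under /proc (a sketch; 29003/29004 are the PIDs from the listing above, and reading the counters needs root):

cat /proc/29003/io                 # read_bytes / write_bytes summed for the whole process
cat /proc/29003/task/29004/io      # the same counters for just the worker thread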
Note the active thread is at idle I/O priority, nice -1 (most things spawned by ReadyNAS seem to inherit -1).
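That scheduling can be checked (and adjusted) with the stock util-linux tools, and btrfs-progs can set the I/O class itself when a scrub is started by hand; a sketch using the PIDs above (the -c 3 example just illustrates the flag, which is presumably what the OS already does):

ionice -p 29004                    # reports the I/O class; "idle" here
renice -n -1 -p 29003              # (re)set the nice value of the scrub process
btrfs scrub start -c 3 /N316AR6    # -c 3 = start the scrub in the idle I/O class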
However, sorting by CPU (the K command shows kernel threads), here is everything with CPU > 0.
 PPID   PID USER    IO  IORR IOWR IO PRI  NI  VIRT  RES  SHR S CPU% MEM%    TIME+ Command
    2 29066 root     0     0    0 B4  20   0     0    0    0 R 39.9  0.0  3:52.37 kworker/u8:10
    2 29043 root     0     0    0 B4  20   0     0    0    0 R 38.2  0.0  4:47.15 kworker/u8:9
    2 28656 root     0     0    0 B4  20   0     0    0    0 R 37.0  0.0  4:37.22 kworker/u8:2
    2 28793 root     0     0    0 B4  20   0     0    0    0 R 37.0  0.0  4:42.18 kworker/u8:3
    2 30086 root     0     0    0 B4  20   0     0    0    0 R 35.8  0.0  1:05.82 kworker/u8:14
    2 28903 root  57.2  57.2    0 B4  20   0     0    0    0 S 33.4  0.0  4:37.18 kworker/u8:8
    2 28071 root     0     0    0 B4  20   0     0    0    0 R 33.4  0.0  4:27.62 kworker/u8:0
    2 29095 root  19.2  19.2    0 B4  20   0     0    0    0 R 31.6  0.0  1:44.06 kworker/u8:13
    2 28902 root     0     0    0 B4  20   0     0    0    0 R 31.0  0.0  4:56.74 kworker/u8:7
    2  1515 root     0     0    0 B4  17  -1     0    0    0 R 25.6  0.0 35h04:21 md127_raid6
    2 27507 root     0     0    0 B4  20   0     0    0    0 R 17.9  0.0  4:31.39 kworker/u8:1
    2 29067 root     0     0    0 B4  20   0     0    0    0 R 17.9  0.0  4:30.13 kworker/u8:11
    1 29003 root 45490 45490    0 B3  19  -1 32180  200   16 S  6.0  0.0  1:12.96 btrfs scrub start /N316AR6
    1 29004 root 45517 45517    0 id  19  -1 32180  200   16 D  6.0  0.0  1:12.87 btrfs
    2 29001 root     0     0    0 ??  39  19     0    0    0 R  3.0  0.0  2:22.40 md127_resync
28517 28522 root     0     0    0 B4  20   0 29460 3688 3004 R  3.0  0.2  0:56.68 htop
    2  1402 root     0     0    0 B0   0 -20     0    0    0 S  1.8  0.0  0:54.53 kworker/1:1H
    2  1401 root     0     0    0 B0   0 -20     0    0    0 S  1.2  0.0  0:50.00 kworker/0:1H
    2  1359 root     0     0    0 B0   0 -20     0    0    0 S  0.6  0.0  3:29.12 kworker/2:1H
    2  1389 root     0     0    0 B0   0 -20     0    0    0 S  0.6  0.0  3:34.20 kworker/3:1H
    2     7 root     0     0    0 B4  20   0     0    0    0 R  0.6  0.0  4:10.65 rcu_sched
    1  3703 nut      0     0    0 B4  20   0 17240 1296  932 S  0.6  0.1  1:31.42 /lib/nut/usbhid-ups -a UPS
Note all the CPU chewed up by kworker threads at nice 0, doing little IO. (kworker IO increased later, with similar numbers to the above, but more threads doing IO.)
btaroli's media processes (inferring from the above iopriority B7/B6 - I don't know what they are doing) will have lesser nice values, and be CPU constrained. Those at B4 will have nice 0, round-robin with all those kworkers, so ~1/12th CPU (?5/17th?).
How changing the iopriority of the other two threads can change this I can't fathom ATM.
Perhaps media threads with an 'interactive' workload should run at nice -1 for now.
Longer term, there should be a gap between the current -1 processes (readynas etc.) and the default worker thread priority (currently 0), so intermediate-priority things can fit in the middle?
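In the meantime, lifting an interactive service by hand is easy enough (a sketch only; <plex_pid> is a placeholder for whatever the media server's PID actually is):

renice -n -1 -p <plex_pid>         # run the interactive service at nice -1, alongside the readynas-spawned stuff
ionice -c 2 -n 0 -p <plex_pid>     # best-effort I/O class, highest priority within it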
I'll repeat this on 6.7.5 when I get around to it; I'm currently juggling 10TB disks for an upgrade... nothing happens fast...
- Michael_Oz · Jul 13, 2017 · Luminary
p.s. I run encryption, so CPU may be higher in my case. (?)
- Skywalker · Jul 13, 2017 · NETGEAR Expert
Yes, the crypto functions are executed by kworkers.
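For anyone wanting to confirm what a busy kworker is actually grinding on, a couple of stock options (a sketch; 29066 is one of the PIDs from the listing above, and both need root):

cat /proc/29066/stack              # snapshot of that thread's kernel call stack (crypto/checksum/raid6 frames show up here)
perf top -g                        # if perf happens to be installed, a live view of the hot kernel symbols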
- btaroli · Jul 13, 2017 · Prodigy
Well, this really repeats other information already shared, most importantly that the kworker threads are monopolizing ALL the CPUs with "kernel" usage during this process. It was explained more than once -- not our first time at this rodeo -- that this wasn't supposed to be an issue because the btrfs process was nice'd. If, however, the kernel overhead pushes the CPUs so hard that nothing else of lower priority can get time, that's an issue.
As for the behavior of the processes I've been watching, CPU load for PLEX tends to scale with the resolution and bitrate of the media; so television is generally more efficient than movies. Indeed, I noticed more (but brief) pauses with movies. DVBLink is much heavier on the CPU because it relies upon ffmpeg for transcoding. Consequently, it demonstrates impact from the CPU bottleneck much more quickly. Time Machine (netatalk) seems to have shorter spurts of CPU, across three or four processes.
From an IO perspective, PLEX and DVBLink are very light. Time Machine (netatalk) has longer periods where I observed it getting to 20-30 MB/sec during a backup I triggered manually during the scrub. Interestingly, the scrub process seemed also to range between 20-35 MB/sec peak... even when the netatalk was pushing the same. Of course, the scrub is doing nearly 100% read activity whereas the netatalk was doing mostly writes (for the backup).
Honestly, though, I think the issue is CPU. My DVBLink has had a background transcode running (to "save" a recording off for keeping) for over 24 hours, at nice 5 and B5 ioprio. Its IO rate is nominal, <5000, but its CPU is high... 350-370%. This process didn't die during the scrub, but it was clearly starved.
So I'll poke some more, but I honestly believe it's CPU at issue here. Perhaps shifting the scrub to a lower ioprio would constrain it from pulling data as rapidly, but I find that hard to imagine given that there generally weren't other heavy IO sources.
So, ultimately, I have to wonder... is there a way to protect other services on the NAS from the kworkers doing their level best to hammer the CPU? From Skywalker's notes, the kernel processes are likely busy computing checksums to verify and perhaps there is also mdadm overhead. But is there a way to either (a) pin them to certain CPU threads or (b) constrain their level of activity to ensure other processes can get time? Clearly, you don't want to nice kernel threads... but I'm not sure what other levers are available there.
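Two partial levers, offered only as sketches and not as anything ReadyNAS-specific: kernels 4.2 and later expose a cpumask for unbound workqueues (which is what the kworker/u8:* threads service), and userland services can always be pinned the other way with taskset. Whether the ReadyNAS kernel build exposes the first knob would need checking.

echo 3 > /sys/devices/virtual/workqueue/cpumask   # hex mask: keep unbound kworkers on CPUs 0-1 only
taskset -pc 2,3 <plex_pid>                        # pin a service to the other CPUs (<plex_pid> is a placeholder)

Neither touches the per-CPU kworker/N:NH threads, and as noted, renicing kernel threads isn't really on the table.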