EDA500 on RN516 - Scrub very slow
This problem has been previously reported by another user on an earlier version of the OS: ReadyNAS-516-2x-EDA500-Scrub-on-EDA500-very-slow. It persists in OS6.7.5.
As scheduled, my main data volume (6x6TB XRAID = 21.88TB) started a scrub and completed it in about 30 hours. My eda-1 volume (5x4TB XRAID = 12.5TB) then began its scrub, which initially locked up the NAS. The log shows the eda scrub starting before the data volume scrub was finished, which may be the reason for the lockup and likely extended the data volume scrub time (I need to space them out more in the schedule). I had to reboot and then manually re-start the eda scrub. It's going to take a lifetime (0.01% complete in 5 hours). Top shows CPU usage of around 4% for md125_raid5 with a priority of 20 and 2% for md125_resync with a priority of 39.
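For anyone wanting to sanity-check that "lifetime" estimate, a simple linear extrapolation from the fraction complete and the elapsed time makes the point (the helper below is my own sketch, not a ReadyNAS tool):

```python
def estimated_total_hours(percent_complete: float, elapsed_hours: float) -> float:
    """Linearly extrapolate total scrub time from progress so far."""
    if percent_complete <= 0:
        raise ValueError("progress must be positive to extrapolate")
    return elapsed_hours * 100.0 / percent_complete

# 0.01% complete after 5 hours extrapolates to about 50,000 hours -- roughly 5.7 years
print(round(estimated_total_hours(0.01, 5.0)))  # 50000
```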
The really off thing is that I swear the eda volume scrub was in the 20% range at lock-up even though it had only been running a bit over a day. Could it have been running faster when the data volume scrub was ongoing?
Is there anything I can do to safely speed this up?
When will Netgear address this issue?
Re: EDA500 on RN516 - Scrub very slow
Interesting that there is no response from Netgear on this one.
An update: I stopped the scrub and re-scheduled the start. It did re-start (from scratch, not where it left off) and was going a little faster -- it was going to take days instead of weeks. Once again, I lost access to shares and the admin interface, but SSH worked this time. top showed fwbroker processes taking up a huge chunk of the CPU. I stopped the scrub via SSH (btrfs scrub cancel) to see if I could then access my data. Sure enough, the fwbroker tasks stopped or settled to near-zero CPU, and share and admin access were restored. I then resumed the scrub (btrfs scrub resume) and it is going MUCH faster, though share and admin access are still badly affected. It had taken 56 hours to scrub a little over 2TiB, and in the last 12 hours it has done another 4 -- now only visible via SSH (btrfs scrub status), since the GUI is not aware of the scrub restart.
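To put a number on that speed-up (the figures are from the post above; the helper function is just my sketch):

```python
def tib_per_hour(tib: float, hours: float) -> float:
    """Average scrub throughput in TiB per hour."""
    return tib / hours

before = tib_per_hour(2.0, 56.0)   # ~0.036 TiB/h before the cancel/resume
after = tib_per_hour(4.0, 12.0)    # ~0.333 TiB/h after resuming via SSH
print(f"speed-up after resume: {after / before:.1f}x")  # speed-up after resume: 9.3x
```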
- Why does scrub on the EDA500 take so much longer than the main array?
- Why does scrub on the EDA500 take up so much CPU time that the NAS becomes inaccessible, yet it's not helping the scrub duration? That never happened on the array of my older Pro's running 6.x or the main array of my RN516.
- Why can't we resume a scrub via the GUI?
- Why does the resumed (GUI not in the loop) scrub work so much faster (times more like the main RN516 array)?
Re: EDA500 on RN516 - Scrub very slow
Thank you so much for this post, as I was thinking I was the only one with this issue. Apparently you and I were the only ones who purchased the EDA500 🙂 I too am having this issue... I was likely the one you referenced as another user, but I have had this issue across multiple versions and across multiple factory defaults.
In fact, after hearing that a factory default may help, I purchased an RN316 + 6x8TB drives (yep, quite a bit of money). This purchase was solely to back up the RN516 volumes and factory default it. I did this and built the volume on the EDA500 with 5x WD60EFRX drives, effectively 16TB in size. The volume was created on the RN516 in less than 2 days' time. Since then I have copied all my data back and have been running successfully, all the time thinking I had fixed the issue.
It has been a few months, so I figured I should run the defrag, balance and, yes, the scrub on the volumes. I started with the data volume, which is 6x WD80EFRX drives (nearly 30TB) in RAID6, and the scrub completed in less than 36 hours (very fast). So I proceeded to start the scrub on the EDA500 volume (eda1). I started that scrub over 48 hours ago and it is under 4% complete, and FrontView is basically useless during this process. Keep in mind this is a RAID6 volume with 6TB drives, with a total volume size of about 16TB, and it is running about 100 times slower than the data volume.
This is a problem with the EDA500, and I am not sure why I was unable to get a response from @mdgm-ntgr, @StephenB or someone else within Netgear. I still have enough space on the 316 to rsync the data, AGAIN, and remove the eda1 volume to rebuild it, but it is a terrible solution. Netgear, please comment on this issue with a workaround or a solution to make the EDA500 useful. It is unfortunate that I keep mostly inconsequential data on the EDAs (yep, I have 2 and regret both of them), because I do not trust either of them. The other thread I opened and got no response from is the following:
Re: EDA500 on RN516 - Scrub very slow
top - 00:28:48 up 180 days, 19:53, 1 user, load average: 7.23, 4.97, 4.90
Tasks: 278 total, 1 running, 277 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 3.5 sy, 0.0 ni, 2.3 id, 93.8 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem: 16298148 total, 15713032 used, 585116 free, 235312 buffers
KiB Swap: 2094844 total, 4 used, 2094840 free. 13143588 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9332 root 20 0 0 0 0 S 8.3 0.0 637:58.07 md126_raid6
22272 root 39 19 0 0 0 D 4.0 0.0 91:25.11 md126_resync
20002 root 20 0 0 0 0 S 0.7 0.0 0:04.52 kworker/u8:10
3659 root 20 0 992516 8868 4920 S 0.3 0.1 831:17.00 leafp2p
3807 root 19 -1 2385380 105452 9828 S 0.3 0.6 272:46.17 readynasd
16698 snmp 20 0 73784 21028 3168 S 0.3 0.1 106:19.94 snmpd
17715 root 20 0 0 0 0 S 0.3 0.0 0:09.30 kworker/u8:0
20001 root 20 0 0 0 0 S 0.3 0.0 0:03.74 kworker/u8:9
22275 root 19 -1 32144 196 12 S 0.3 0.0 3:05.40 btrfs
22318 root 20 0 0 0 0 S 0.3 0.0 0:02.22 kworker/u8:1
31738 root 20 0 0 0 0 D 0.3 0.0 14:33.99 nfsd
1 root 20 0 202328 5312 3384 S 0.0 0.0 0:53.92 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:04.63 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 5:45.48 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 56:41.87 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:07.15 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:41.41 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:40.48 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:07.94 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 3:45.60 ksoftirqd/1
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
16 root rt 0 0 0 0 S 0.0 0.0 0:42.05 watchdog/2
Notice over 93% wait and almost 0% CPU consumption. The resync process is still running and has been for over 62 hours and is currently at 3.5%.
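The key figure is the "wa" (I/O wait) field in the %Cpu(s) line of top. A small parser (my own sketch, not anything in the ReadyNAS firmware) makes the point explicit:

```python
import re

def iowait_percent(cpu_line: str) -> float:
    """Extract the 'wa' (I/O wait) percentage from a top %Cpu(s) line."""
    match = re.search(r"([\d.]+)\s+wa", cpu_line)
    if match is None:
        raise ValueError("no 'wa' field found")
    return float(match.group(1))

line = "%Cpu(s): 0.2 us, 3.5 sy, 0.0 ni, 2.3 id, 93.8 wa, 0.0 hi, 0.2 si, 0.0 st"
print(iowait_percent(line))  # 93.8
```

A box spending 93.8% of its time in I/O wait is bottlenecked on the disks (or the channel to them), not the CPU.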
I have no (0) installed applications and AntiVirus is disabled. In fact, after the factory reset a few months ago I have really done nothing with it.
I am willing to leave it running and provide logs and/or remote access for Netgear to troubleshoot and determine the cause. It seems this issue is now affecting more than just me.
Re: EDA500 on RN516 - Scrub very slow
@dbreiny1 wrote:
This is a problem with the EDA500 and I am not sure why I was unable to get a response from @mdgm-ntgr, @StephenB or someone else within Netgear.
I don't work for Netgear, and I don't own an EDA500. I didn't see much I could do to help resolve your issue, so I didn't chime in.
Re: EDA500 on RN516 - Scrub very slow
And still no response from anyone at Netgear. If I run a scrub on the volume on my EDA500 from SSH (btrfs scrub start /eda1), it runs 50X faster than if I run it from the GUI. That's comparable with the speed on the main array. Note that I am not using the -B option, so the scrub runs in the background as usual. So what in the world is the GUI doing wrong when it starts a scrub, either scheduled or manually?
Re: EDA500 on RN516 - Scrub very slow
Scrubbing from the GUI involves more than just doing a scrub at the filesystem level.
Re: EDA500 on RN516 - Scrub very slow
Can you be more specific as to what it is doing if not just a BTRFS-level volume scrub? btrfs scrub status /eda1 returned data consistent with the progress bar in the GUI when I initiated the scrub via the GUI. In fact, the scrub I started with SSH also shows up in the GUI, progressing at the much accelerated pace. It's going a tad slower than the main volume -- I estimate it'll take about 18 to 20 hours -- but that's acceptable. The speed of the GUI-initiated scrub is not.
And even if it's doing something more, why does the scrub of my 14.5TB EDA500 volume take 54 days compared to 18 hours for my 27.3TB main volume? I assume that scrubbing the main volume is also doing that something more. Scrubbing the main volume uses about 50% of the CPU time -- enough to slow other processes some. Scrubbing the EDA500 via the GUI takes almost none. Scrubbing via the SSH command takes about the same amount as the main volume scrub. Whatever causes that lack of CPU usage certainly seems like the probable root cause to me.
Re: EDA500 on RN516 - Scrub very slow
@Sandshark I've escalated this and requested a more detailed response for you. Feel free to PM me if you don't receive a response within the next couple of days and I'll follow up.
Re: EDA500 on RN516 - Scrub very slow
A scrub kicks off both a btrfs scrub and an mdadm resync. 5 disks relying on 1 eSATA connection to perform extreme recalculations on 2 fronts (RAID and FS) is very intensive and is very slow. You could move your EDA500 disks to your head unit and let the operations continue, then move them back.
Re: EDA500 on RN516 - Scrub very slow
Does it do them concurrently? Maybe that's the issue, but 54 days still seems like a very long time. Why would running two processes take less CPU time than running just one via SSH? It didn't take anywhere near that long for the original sync, so why should a resync take that long? I'll have to kick off another and see what /proc/mdstat says about resync progress while this is going on. Maybe the two processes are fighting over access to the same area of the array and that slows them both down, but would that not also occur on the main array?
I have to admit I did not let it complete, but I did let it go more than two days to see if it was just the reported progress that was wrong. Maybe it would have sped up at some point. After two days, I Googled how to find the progress via SSH, and that's when I found the progress shown in SSH was identical to that shown in the GUI. I then trusted that the progress report and my resulting time-to-go estimate were accurate, and aborted it.
As far as moving the array to the main chassis for this, that's just not a real solution. I keep everything I need daily access to on the main array and computer backups and such on the EDA500 (actually, now two of them).
Re: EDA500 on RN516 - Scrub very slow
I believe it is concurrent.
In the initial sync we can do things faster as there's no existing data to sync across. If you replace a disk in your EDA500 you'll find the sync to rebuild takes longer than the initial sync when creating the volume.
The larger the disk capacity the longer things will take as there's more to check.
54 days for a scrub does still seem like a very long time even in an EDA500.
If moving the disks to the main chassis is not practical, you may find that additional main units work better for you than EDA500 units.
I would think a volume in any of our current main units would significantly outperform one in an EDA500.
Re: EDA500 on RN516 - Scrub very slow
OK, so this is what top looks like with a GUI-initiated scrub:
top - 16:48:32 up 7 days, 22:33, 1 user, load average: 4.71, 1.55, 0.64
Tasks: 314 total, 1 running, 313 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 1.1 sy, 0.0 ni, 98.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 16297764 total, 15755896 used, 541868 free, 11572 buffers
KiB Swap: 1569788 total, 0 used, 1569788 free. 14749048 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2006 root 20 0 0 0 0 S 4.0 0.0 0:05.13 md123_raid5
15024 root 20 0 0 0 0 D 1.0 0.0 0:00.57 md123_resync
4226 root 20 0 6344 1728 1600 S 0.3 0.0 10:42.07 wsdd2
4452 root 20 0 661376 14428 9676 S 0.3 0.1 2:00.16 zerotier-one
1 root 20 0 136976 7264 5144 S 0.0 0.0 0:09.02 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.18 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:12.67 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 2:40.28 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:02.24 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:01.88 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:01.86 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:01.69 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:09.43 ksoftirqd/1
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
And here it is with the scrub initiated via SSH:
top - 16:58:13 up 4 min, 1 user, load average: 1.90, 0.98, 0.42
Tasks: 316 total, 1 running, 315 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 13.8 sy, 0.0 ni, 86.1 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 16297764 total, 983956 used, 15313808 free, 11252 buffers
KiB Swap: 1569788 total, 0 used, 1569788 free. 565484 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
71 root 20 0 0 0 0 S 11.0 0.0 0:07.33 kworker/u8:3
1035 root 20 0 0 0 0 S 10.0 0.0 0:07.31 kworker/u8:7
1056 root 20 0 0 0 0 S 9.6 0.0 0:07.77 kworker/u8:10
28 root 20 0 0 0 0 S 9.0 0.0 0:07.83 kworker/u8:1
43 root 20 0 0 0 0 S 7.3 0.0 0:06.73 kworker/u8:2
1054 root 20 0 0 0 0 S 6.7 0.0 0:05.52 kworker/u8:9
5609 root 20 0 32168 204 16 S 3.0 0.0 0:03.04 btrfs
1777 root 0 -20 0 0 0 S 1.3 0.0 0:01.54 kworker/2:1H
4219 root 20 0 6344 1764 1628 S 0.7 0.0 0:00.90 wsdd2
1745 root 0 -20 0 0 0 S 0.3 0.0 0:00.24 kworker/0:1H
5673 root 20 0 28892 3068 2424 R 0.3 0.0 0:00.14 top
1 root 20 0 136976 7136 5100 S 0.0 0.0 0:01.51 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:00.23 kworker/u8:0
Here is what it looks like if I start a scrub via the GUI and cancel it (but not the resync) via SSH:
top - 17:02:18 up 8 min, 1 user, load average: 2.73, 1.85, 0.90
Tasks: 305 total, 1 running, 304 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 1.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 16297764 total, 1024016 used, 15273748 free, 11252 buffers
KiB Swap: 1569788 total, 0 used, 1569788 free. 579124 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2000 root 20 0 0 0 0 S 4.0 0.0 0:05.24 md123_raid5
4219 root 20 0 6344 1764 1628 S 0.7 0.0 0:01.72 wsdd2
6623 root 20 0 0 0 0 D 0.7 0.0 0:01.13 md123_resync
7 root 20 0 0 0 0 S 0.3 0.0 0:00.20 rcu_sched
4680 nut 20 0 17260 1508 1112 S 0.3 0.0 0:00.77 usbhid-ups
1 root 20 0 136976 7136 5100 S 0.0 0.0 0:01.54 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:00.01 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:00.01 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
So, yes, there is a resync in progress when scrub is initiated via the GUI that is not there when I do it via SSH. When I cancel the scrub via SSH, very little changes. If I resume the scrub with the resync still ongoing, it looks the same as if I never cancelled it. If I initiate just the scrub via SSH, then all of those kworker tasks are busy doing the scrub that are not even in the top ten processes when the resync is also ongoing. Clearly, something about having an ongoing resync is seriously affecting the scrub on the EDA500. It's not CPU availability -- the resync takes little CPU. So, it must be the I/O channel. My best guess is that the resync process is keeping the eSATA port multiplier "locked" to one drive and so the scrub process cannot access any others.
BTW, here is what cat /proc/mdstat reports on the sync:
md123 : active raid5 sdm3[0] sdq3[4] sdp3[3] sdo3[5] sdn3[1]
      7794659328 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
      [>....................]  resync = 3.5% (69154404/1948664832) finish=1209.6min speed=25895K/sec
So the re-sync in and of itself is also not the issue; it will complete within a reasonable time (this is for an array half the size of the other, but even double this is reasonable).
I don't know the solution -- maybe doing the processes sequentially instead of concurrently. But it is definitely an unacceptable situation that needs attention. Any excuse that "a second NAS is a better solution" is just that -- an excuse. Netgear sold the product, and the OS should play well with it. I could accept a 3x or maybe even 4x longer task. 25x or more is just insane.
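For reference, the mdstat progress line quoted above is easy to machine-parse if you want to log or graph the resync rate; this is just an illustrative sketch (the function name and regex are mine):

```python
import re

def parse_resync(mdstat_line: str) -> dict:
    """Pull percent complete, finish estimate and speed from an mdstat resync line."""
    match = re.search(
        r"resync\s*=\s*([\d.]+)%.*finish=([\d.]+)min\s+speed=(\d+)K/sec",
        mdstat_line,
    )
    if match is None:
        raise ValueError("no resync progress found")
    return {
        "percent": float(match.group(1)),
        "finish_min": float(match.group(2)),
        "speed_kps": int(match.group(3)),
    }

line = ("[>....................]  resync = 3.5% "
        "(69154404/1948664832) finish=1209.6min speed=25895K/sec")
info = parse_resync(line)
print(info)  # percent complete, finish estimate in minutes, speed in K/sec
```

At 1209.6 minutes to finish, that resync is on track to complete in about 20 hours, consistent with "a reasonable time" above.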
Re: EDA500 on RN516 - Scrub very slow
OK, so I just let it keep going to see if the scrub process would speed up after the sync completes. Looking in about an hour before the resync was predicted to complete, something else had happened. Readynasd was now taking almost all of the available CPU time (it actually claimed to be at 100.1%). It had a lower priority than the resync, so the resync time did not seem to be affected, and it did complete when originally predicted, despite readynasd misbehaving. The GUI was unavailable, but SMB access still worked. I did not test whether access speed was affected. There was a scheduled rsync backup of the shares on the main volume (a pull from another NAS) which seems to have taken the normal amount of time. This could be something independent of the scrub, since I don't know when it happened, but it seems suspect since I have done nothing special during that period. Here is top at that point:
top - 13:58:44 up 21:05, 1 user, load average: 6.80, 6.60, 6.49
Tasks: 502 total, 2 running, 500 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.7 us, 23.7 sy, 0.0 ni, 48.2 id, 24.2 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem: 16297764 total, 15686920 used, 610844 free, 4588 buffers
KiB Swap: 1569788 total, 0 used, 1569788 free. 14518584 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4446 root 19 -1 21.687g 93096 23968 R 100.1 0.6 760:36.66 readynasd
2000 root 20 0 0 0 0 D 5.3 0.0 57:42.57 md123_raid5
6623 root 39 19 0 0 0 D 1.3 0.0 18:58.47 md123_resync
7 root 20 0 0 0 0 S 0.3 0.0 2:08.55 rcu_sched
4219 root 20 0 6344 1764 1628 S 0.3 0.0 3:49.45 wsdd2
27931 root 20 0 29024 3336 2544 R 0.3 0.0 0:00.03 top
1 root 20 0 136976 7136 5100 S 0.0 0.0 0:02.29 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:01.15 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:00.16 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:00.18 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:00.25 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:00.15 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:00.75 ksoftirqd/1
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
Now the resync is complete, all the kworker tasks are busy with the scrub, and the completion percentage has started to rise at a much faster pace. Readynasd has also started behaving itself and the GUI is again available, so that pretty much shows that its CPU use was a result of the resync process. My best guess is that it was in a tight loop trying to do something that the resync process kept it from doing (probably related to the eSATA port expander) -- something that should probably have a time-out. Here is top now:
top - 15:19:47 up 22:26, 1 user, load average: 1.80, 2.89, 4.98
Tasks: 310 total, 2 running, 308 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 17.7 sy, 0.0 ni, 82.2 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 16297764 total, 15566324 used, 731440 free, 4588 buffers
KiB Swap: 1569788 total, 0 used, 1569788 free. 14627468 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31692 root 20 0 0 0 0 S 13.3 0.0 0:40.69 kworker/u8:0
29540 root 20 0 0 0 0 S 11.6 0.0 0:43.98 kworker/u8:2
32539 root 20 0 0 0 0 S 11.0 0.0 0:45.62 kworker/u8:7
31857 root 20 0 0 0 0 S 10.6 0.0 0:43.23 kworker/u8:6
563 root 20 0 0 0 0 R 10.3 0.0 0:38.67 kworker/u8:8
31853 root 20 0 0 0 0 S 10.0 0.0 0:48.57 kworker/u8:5
4446 root 19 -1 2460276 76200 23968 S 3.0 0.5 833:53.16 readynasd
6888 root 20 0 32168 204 16 S 3.0 0.0 0:17.79 btrfs
1777 root 0 -20 0 0 0 S 1.3 0.0 0:38.08 kworker/2:1H
4219 root 20 0 6344 1764 1628 S 0.3 0.0 4:04.43 wsdd2
4312 root 20 0 227556 7080 5240 S 0.3 0.0 0:36.02 nmbd
31548 root 20 0 29024 3260 2484 R 0.3 0.0 0:01.34 top
1 root 20 0 136976 7136 5100 S 0.0 0.0 0:02.37 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:01.21 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 2:20.73 rcu_sched
So, it appears to me that running a resync concurrently with a scrub causes the scrub to be effectively held off until the resync completes. The multiplexing effect of the eSATA port expander is the factor that is likely different from the main volume. This results in an initially reported very slow scrub completion rate, which is going to cause the average user to think the scrub will take an eternity. So is doing a resync concurrent with the scrub (or, in practice, doing one first) really the best thing to be doing (at least on the EDA500)?
And what about the process locking out the GUI (it comes up as "unavailable")? I can see it being slow, but locking it up keeps one from seeing the scrub is still progressing (except via SSH, which your average user will not use) and may cause the user to think the NAS needs rebooting. There are several snapshots that took place immediately after the resync finished that were likely held off, but that seems reasonable. A scheduled balance on the main volume was also attempted at that point, but failed to start, presumably because of the scrub on the other volume.
Re: EDA500 on RN516 - Scrub very slow
@Sandshark wrote:
I don't know the solution -- maybe doing the processes sequentially instead of concurrently.
I would have gone with sequential myself, if only to make the results more predictable.
Both are disk-intensive, and it seems to me doing them sequentially would also reduce disk thrashing/head motion.