RN516 slow share response, slow reboot, web UI times out - suggestions for debugging
I have a RN516. It was running 6.9.0 when things went south. It is currently running 6.10.7.
Several weeks ago there was an overnight power failure at the office where the unit is. The unit was on a UPS, but the outage lasted long enough to drain the UPS. The unit powered back up after power was restored. One odd thing was that the time was off by one hour, even though NTP was set up. I believe I toggled the timezone, rebooted, and toggled the timezone back, and the time was then correct.
After this power event we started seeing occasionally slow file response, for both NFS and SMB, and it has been getting progressively worse. In addition, shutdown and reboot take a long time: anywhere from 10 to 30 minutes. In the last two days, response to file requests got really bad. Deleting a single file took 4-5 minutes. At that point the web UI still worked and was responsive.
There are several TB of free disk space. The volume is RAID-6. Snapshots are on and have been since 2016.
A reboot did not improve share performance, so I decided to upgrade the firmware. The reboot for the upgrade took 30 minutes. When it completed, there were no shares and the web UI was not responding at all, but I did have SSH. From the command line, top showed no CPU activity above 1%, and there was a lot of free memory. I rebooted again from the command line with the reboot command. After a long reboot, the shares reappeared, although they take 30 seconds to enumerate a directory. The web UI now accepts a username and password but eventually times out, and SSH still works. Again, top shows nothing unusual.
What steps should I take to debug this?
Accepted Solutions
@TullyNYGuy wrote:
What steps should I take to debug this?
Have you looked at disk health? I suggest using smartctl -x from the command line.
Also, if you haven't downloaded the log zip file and looked in there, I suggest you do so.
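For anyone who wants to pull just the counters out of the smartctl output instead of reading the whole report, here is a rough Python sketch. It assumes Python 3 is available wherever you run it (it may not be installed on the NAS itself, in which case you can save the smartctl output to a file and parse it elsewhere), and it assumes an ATA/SATA disk whose report contains the usual attribute table starting with an `ID# ATTRIBUTE_NAME ... RAW_VALUE` header. The function names are my own, not part of smartmontools.

```python
import subprocess


def parse_smart_attributes(report):
    """Return {attribute_name: raw_value} from the SMART attribute table.

    Works on the table printed by `smartctl -A` or `smartctl -x`; the
    RAW_VALUE column is located from the header row, so both layouts are
    handled. Raw values that do not start with a plain integer are skipped.
    """
    attrs = {}
    raw_index = None
    for line in report.splitlines():
        fields = line.split()
        if raw_index is None:
            # Still looking for the header row of the attribute table.
            if fields[:2] == ["ID#", "ATTRIBUTE_NAME"] and "RAW_VALUE" in fields:
                raw_index = fields.index("RAW_VALUE")
            continue
        # The table ends at the first line that is not an attribute row.
        if not fields or not fields[0].isdigit():
            break
        if len(fields) > raw_index and fields[raw_index].isdigit():
            attrs[fields[1]] = int(fields[raw_index])
    return attrs


def read_smart_report(device):
    """Run `smartctl -x` on a device (needs root) and return its text output.

    smartctl uses its exit status as a bit mask and can return non-zero even
    when it printed a full report, so the return code is not checked here.
    """
    result = subprocess.run(
        ["smartctl", "-x", device], capture_output=True, text=True
    )
    return result.stdout
```

Something like `parse_smart_attributes(read_smart_report("/dev/sda")).get("Reallocated_Sector_Ct")` then gives the reallocated sector count for that disk, assuming the drive reports that attribute under that name.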
All Replies
Re: RN516 slow share response, slow reboot, web UI times out - suggestions for debugging
An update: after a long delay, the web UI did show up. It is extremely slow and portions of the pages don't seem to show up.
Re: RN516 slow share response, slow reboot, web UI times out - suggestions for debugging
Using the tips from @StephenB I found that /dev/sdc had about 100 reallocated sectors and the count was increasing by about 1 every two hours. It also failed a read test started from smartctl. Apparently, that one drive was sick enough to destroy the performance of the RN516, but it was not dead. Since the unit was set up as RAID 6, all I did was pull the sick drive out. There was a hot spare already in the NAS. The unit detected that a drive had failed and started to resync the array. Share performance improved hugely even as the rebuild was running. The web UI became responsive again.
FYI, full backups are run nightly on this NAS, in addition to a full mirror to an RN204 every 12 hours, so pulling the drive was low risk. For those with an RN516, the drive numbers run from top (1 or sda) to bottom (6 or sdf).
For others, moral of the story:
- Make sure you set up SSH access to the unit before you have any issues. Without it, debugging this problem would have been very difficult. Downloading logs from an unresponsive web UI was not viable.
- If the NAS share performance is slow and the web UI is very unresponsive, use smartctl to see if one of the disks is sick. It appears that a sick but not dead drive in a RAID configuration will give these symptoms.
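A sick-but-alive disk tends to show up as a count that keeps climbing, as it did here (about 100 reallocated sectors, rising by roughly 1 every two hours). A minimal Python sketch of that comparison follows. It assumes you have already noted the reallocated sector count per disk at two points in time; the function name and the dictionary layout are just illustrative choices.

```python
def growth_per_day(earlier, later, hours_between):
    """Return {device: reallocated sectors gained per day}.

    `earlier` and `later` map device names to reallocated sector counts
    taken `hours_between` hours apart. Only devices whose count rose are
    included in the result.
    """
    if hours_between <= 0:
        raise ValueError("hours_between must be positive")
    rates = {}
    for device, new_count in later.items():
        old_count = earlier.get(device)
        if old_count is not None and new_count > old_count:
            rates[device] = (new_count - old_count) * 24 / hours_between
    return rates


# Numbers taken from the post above: sdc gained 1 sector in 2 hours.
print(growth_per_day({"sda": 0, "sdc": 100}, {"sda": 0, "sdc": 101}, 2))
# prints {'sdc': 12.0}
```

Any disk that appears in the result at all is worth a closer look; a healthy disk normally holds a steady count.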
Re: RN516 slow share response, slow reboot, web UI times out - suggestions for debugging
I'd add that I find the maintenance schedule feature to be useful - you'll find it in the volume settings wheel.
There are four functions: balance, scrub, disk test, and defrag. I schedule one function each month, so each of them runs three times a year. I think you could skip defrag, though, since it has drawbacks as well as benefits.
Even with a maintenance schedule, it is useful to check the SMART info for the disks fairly regularly. Unfortunately the NAS alerts often come much later than I like (i.e., Netgear's health thresholds aren't the same as the ones I use, so Netgear often marks disks as "healthy" when I think they need replacement).
Re: RN516 slow share response, slow reboot, web UI times out - suggestions for debugging
I'd also point out that in some cases the top drive is Drive 1, but in others it is Drive 0. The first drive may also not be sda, and the other drives may not be enumerated in bay order.
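Because device letters and bay numbers may not line up, one way to be sure which physical disk you are about to pull is to compare the serial number that smartctl reports with the one printed on the drive label. Below is a small Python sketch that picks the model and serial out of `smartctl -i` output. It assumes a SATA disk, where the lines are labelled `Device Model:` and `Serial Number:` (SAS disks use different labels), and the function name is made up for this example.

```python
def drive_identity(info_text):
    """Return the model and serial number found in `smartctl -i` output.

    The result is a dict with the keys "Device Model" and "Serial Number"
    for whichever of those lines are present in the text.
    """
    wanted = ("Device Model", "Serial Number")
    found = {}
    for line in info_text.splitlines():
        # Split on the first colon only; values may contain colons too.
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            found[key.strip()] = value.strip()
    return found
```

Running `smartctl -i /dev/sdc` as root, passing its output to `drive_identity`, and matching the serial against the drive label avoids relying on the sda-to-bay ordering at all.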