TullyNYGuy
May 13, 2022
Solved

RN516 slow share response, slow reboot, web UI times out - suggestions for debugging

I have a RN516. It was running 6.9.0 when things went south. It is currently running 6.10.7.

 

Several weeks ago there was an overnight power failure at the office where the unit is located. The unit was on a UPS, but the outage lasted long enough that the UPS ran out. The unit powered back up after power was restored. One odd thing was that the time was off by one hour, even though NTP was set up. I believe I toggled the timezone, rebooted, and toggled the timezone back, after which the time was correct.

 

The problem we have seen since this power event is occasionally slow file response over both NFS and SMB, and it has been getting progressively worse. In addition, shutdown and reboot take a long time: 10 to 30 minutes, though it varies a bit. In the last two days, response to file requests got really bad; deleting a single file took 4-5 minutes. The web UI did work and was responsive.

 

There are several TB of free disk space. The volume is set up as RAID-6. Snapshots are on and have been since 2016.

 

A reboot did not improve share performance, so I decided to upgrade the firmware. The reboot for the upgrade took 30 minutes. When it completed, there were no shares and the web UI was not responding at all, but I did have SSH. From the command line, top did not show any CPU activity above 1%, and there was a lot of free memory. I rebooted again from the command line using the reboot command. After a long reboot, the shares reappeared, although they take 30 seconds to enumerate a directory. The web UI now accepts a login attempt but times out after I enter the username and password. SSH still works. Again, top shows nothing unusual.

 

What steps should I take to debug this?


  • An update: after a long delay, the web UI did show up. It is extremely slow, and portions of the pages don't seem to load.

  • StephenB
    Guru - Experienced User

    TullyNYGuy wrote:

     

    What steps should I take to debug this?


    Have you looked at disk health? I suggest using smartctl -x from the command line.

     

    Also, if you haven't downloaded the log zip file and looked in there, I suggest you do so.
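A minimal sketch of that disk health check, run over SSH as root. smartctl -x prints everything smartctl knows about one drive; the loop below uses smartctl -A to pull out just two counters per disk. The device names sda through sdf assume a six-bay unit, the attribute names are the usual ones for ATA drives and may differ on yours, and smart_raw is a helper name made up for this example.

```shell
#!/bin/sh
# Print reallocated and pending sector counts for each disk.
# Assumption: disks appear as /dev/sda .. /dev/sdf (six-bay unit).

# smart_raw NAME: read `smartctl -A` output on stdin and print the
# raw value (10th column) of the attribute called NAME.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Only touch the disks when smartctl is actually installed.
if command -v smartctl >/dev/null 2>&1; then
    for dev in /dev/sd[a-f]; do
        [ -b "$dev" ] || continue
        attrs=$(smartctl -A "$dev")
        realloc=$(printf '%s\n' "$attrs" | smart_raw Reallocated_Sector_Ct)
        pending=$(printf '%s\n' "$attrs" | smart_raw Current_Pending_Sector)
        echo "$dev reallocated=$realloc pending=$pending"
    done
fi
```

For the full report on a single drive, smartctl -x /dev/sda (or whichever device looks suspicious) is still the command to read through.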

     

     

    • TullyNYGuy
      Tutor

      Using the tips from StephenB I found that /dev/sdc had about 100 reallocated sectors and the count was increasing by about 1 every two hours. It also failed a read test started from smartctl. Apparently, that one drive was sick enough to destroy the performance of the RN516, but it was not dead. Since the unit was set up as RAID 6, all I did was pull the sick drive out. There was a hot spare already in the NAS. The unit detected that a drive had failed and started to resync the array. Share performance improved hugely even as the rebuild was running. The web UI became responsive again. 
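For anyone repeating the read test, here is a rough sketch of how a self-test can be started and its result checked with smartctl. The exact commands used in this thread are not recorded, so this is an illustration only. DEV defaults to /dev/sdc because that was the sick drive here, selftest_has_failure is a made-up helper name, and the grep pattern assumes the usual ATA log wording such as "Completed: read failure".

```shell
#!/bin/sh
# Check a drive's SMART self-test log for failed tests.
# Assumption: DEV is the suspect drive; /dev/sdc is only the example from this thread.
DEV=${DEV:-/dev/sdc}

# A short self-test is started separately and runs on the drive itself:
#   smartctl -t short "$DEV"
# Wait for it to finish before reading the log.

# Exit status 0 if the self-test log on stdin contains a failed test.
selftest_has_failure() {
    grep -q 'Completed: .*failure'
}

# Only query the drive when smartctl and the device both exist.
if command -v smartctl >/dev/null 2>&1 && [ -b "$DEV" ]; then
    if smartctl -l selftest "$DEV" | selftest_has_failure; then
        echo "$DEV: self-test log shows a failure"
    else
        echo "$DEV: no failed self-tests logged"
    fi
fi
```

A drive can log a failed self-test while the array still treats it as a working member, which matches the "sick but not dead" behavior described above.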

       

      FYI, full backups are run nightly on this NAS, in addition to a full mirror to an RN204 every 12 hours, so pulling the drive was low risk. For those with an RN516, the drive numbers run from top (1 or sda) to bottom (6 or sdf).

       

      For others, moral of the story:

      • Make sure you set up SSH access to the unit before you have any issues. Without it, debugging this problem would have been very difficult. Downloading logs from an unresponsive web UI was not viable.
      • If the NAS share performance is slow and the web UI is very unresponsive, use smartctl to see if one of the disks is sick. It appears that a sick but not dead drive in a RAID configuration will give these symptoms. 
      • StephenB
        Guru - Experienced User

        I'd add that I find the maintenance schedule feature to be useful - you'll find it in the volume settings wheel.

         

        There are four functions: balance, scrub, disk test, and defrag. I schedule one each month (cycling through all four 3x a year), though I think you could skip defrag (which has some drawbacks mixed in with its benefits).

         

        Even with a maintenance schedule, it is useful to check the SMART info for the disks fairly regularly.  Unfortunately the NAS alerts often come much later than I like (i.e., Netgear's health thresholds aren't the same as the ones I use, so Netgear often marks disks as "healthy" when I think they need replacement).
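One way to make that regular check routine is a small script that warns as soon as a counter passes a limit you choose yourself. This is only a sketch: LIMIT=0 is an illustrative value, not a threshold anyone in this thread stated, realloc_warning is a made-up helper name, and the sda through sdf device names assume a six-bay unit.

```shell
#!/bin/sh
# Warn when a disk's reallocated-sector count exceeds a personal limit.
# Assumption: LIMIT=0 is a placeholder; pick whatever threshold you trust.
LIMIT=${LIMIT:-0}

# realloc_warning DEV COUNT LIMIT: print a warning line if COUNT > LIMIT,
# print nothing otherwise.
realloc_warning() {
    if [ "$2" -gt "$3" ]; then
        echo "WARNING: $1 has $2 reallocated sectors (limit $3)"
    fi
}

# Only touch the disks when smartctl is actually installed.
if command -v smartctl >/dev/null 2>&1; then
    for dev in /dev/sd[a-f]; do
        [ -b "$dev" ] || continue
        count=$(smartctl -A "$dev" | awk '$2 == "Reallocated_Sector_Ct" { print $10 }')
        if [ -n "$count" ]; then
            realloc_warning "$dev" "$count" "$LIMIT"
        fi
    done
fi
```

Run by hand or from a scheduled job, it stays silent while every disk is under the limit, so any output is worth a closer look with smartctl -x.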
