Forum Discussion

bmomjian
Jan 22, 2019
Solved

Occasional wireless traffic hang

I am seeing occasional hangs on my WAP network. An ssh session to the WAP normally shows 95% idle and 0% sirq in top:

Tue Jan 22 23:00:01 UTC 2019
 23:00:01 up 1 day, 23:54, load average: 3.01, 3.01, 3.04
Mem: 78828K used, 176300K free, 0K shrd, 0K buff, 23848K cached
CPU:   4% usr   0% sys   0% nic  95% idle   0% io   0% irq   0% sirq
Load average: 3.01 3.02 3.05 1/66 15057
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
15057 15054 root     R     1408   1%   0   5% top -b -n1
 1405  1179 root     S    13876   5%   1   0% /usr/sbin/snmpd -f -c /tmp/snmpd.c
 1179     1 root     S    12044   5%   0   0% /usr/sbin/dman
 1172     1 root     S     4992   2%   0   0% /usr/sbin/mapd

 

Since the hang 20 minutes ago, there has been almost no WAP network traffic, but ssh to the WAP shows that top reports 50% idle and 50% sirq (ksoftirqd):

 

Tue Jan 22 23:20:45 UTC 2019
23:20:45 up 2 days, 15 min, load average: 4.02, 4.04, 3.74
Mem: 78904K used, 176224K free, 0K shrd, 0K buff, 23872K cached
CPU:   0% usr   0% sys   0% nic  50% idle   0% io   0% irq  50% sirq
Load average: 4.03 4.04 3.74 2/66 19837
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
    9     2 root     RW       0   0%   1  50% [ksoftirqd/1]
 1405  1179 root     S    13876   5%   0   0% /usr/sbin/snmpd -f -c /tmp/snmpd.c
 1179     1 root     S    12044   5%   0   0% /usr/sbin/dmand
 1172     1 root     S     4992   2%   0   0% /usr/sbin/mapdd


Any idea why this is happening?   It is happening regularly.  I am on firmware 3.9.0.3 and recently did a factory default reset.

 

  • I am happy to report that firmware 3.9.1.0 has been released, and I think it fixes the problem reported in this thread.  The release notes document, dated August 13, 2019, mentions these two fixes:

     

        1. Addressed intermittent, rarely encountered access point hang issues.
        2. Fixed various stability and connectivity issues.

     

    I think this closes the issue.  I will keep my monitoring in place for another year to verify the fix.  Thanks to all who helped.

     

36 Replies

  • As an update to this report, the 50% sirq continued for 17 hours, until I rebooted the WAP.  You can see the dramatic change at exactly 2 days of uptime in these hourly 'top' reports:

     

    CPU:   0% usr   4% sys   0% nic  95% idle   0% io   0% irq   0% sirq
    CPU:   0% usr   4% sys   0% nic  95% idle   0% io   0% irq   0% sirq
    CPU:   0% usr   4% sys   0% nic  95% idle   0% io   0% irq   0% sirq
    CPU:   0% usr   4% sys   0% nic  95% idle   0% io   0% irq   0% sirq
    CPU:   0% usr   4% sys   0% nic  95% idle   0% io   0% irq   0% sirq
    CPU:   4% usr   0% sys   0% nic  95% idle   0% io   0% irq   0% sirq
    CPU:   0% usr   0% sys   0% nic  50% idle   0% io   0% irq  50% sirq
    CPU:   0% usr   0% sys   0% nic  50% idle   0% io   0% irq  50% sirq
    CPU:   0% usr   4% sys   0% nic  45% idle   0% io   0% irq  50% sirq
    CPU:   0% usr   0% sys   0% nic  50% idle   0% io   0% irq  50% sirq
    CPU:   0% usr   0% sys   0% nic  50% idle   0% io   0% irq  50% sirq
    CPU:   4% usr   4% sys   0% nic  40% idle   0% io   0% irq  50% sirq
    

    After 17 hours of 50% sirq but before the WAP reboot, I disconnected every wifi device from the WAP, and verified there were no connected devices from the WAP dashboard, but the WAP was still showing 50% sirq.  I just rebooted the WAP and it is back to 0% sirq, and not slow.

     

    I will keep monitoring 'top' after the reboot and get an alert if sirq% gets high.  I am curious to see if it gets a high sirq% at exactly two days of uptime again. Does something special happen to the WAP at two days of uptime?  I could automatically reboot the WAP when the sirq% gets high, but that hardly seems like a clean fix.

    •
      schumaku
      Guru - Experienced User

      Looks very familiar - WAC730, 3.9.0.3, ... slow or intermittent wireless access only after some uptime

       

      WAC730-1# uptime
      00:26:48 up 3 days, 1:47, load average: 4.07, 4.08, 4.06


      WAC730-1# top

      Mem: 77560K used, 177568K free, 0K shrd, 0K buff, 20776K cached
      CPU: 0% usr 0% sys 0% nic 49% idle 0% io 0% irq 50% sirq
      Load average: 4.03 4.04 4.05 3/70 4210
      PID PPID USER STAT VSZ %MEM CPU %CPU COMMAND
      9 2 root RW 0 0% 1 47% [ksoftirqd/1]

      ...

      •
        bmomjian
        Guide

        Ah, interesting.  I have a Debian server with ssh access to the WAP so I can monitor the 'top' output and get an alert when sirq gets high.  Once it happens again, I will try reverting to a previous firmware, maybe 3.8.3.0, and see if it happens again. 

         

        My family has been complaining about wifi hangs and disconnects for about six months, and that matches the time I installed the 3.9.0.3 firmware.  It will take me perhaps another week to come to a conclusion on this.  I will keep reporting on my progress.
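
For anyone who wants to set up similar monitoring, here is a minimal sketch of the check, not the script actually used in this thread. It assumes the `top -b -n1` output has already been collected as text (for example over ssh) and only parses the `CPU:` summary line. The function names and the 25% threshold are made up for illustration.

```python
import re

# Matches pairs such as "50% sirq" in a BusyBox-style top CPU summary line.
_FIELD = re.compile(r"(\d+)%\s+(\w+)")


def parse_cpu_line(line):
    """Return e.g. {'usr': 0, 'idle': 50, 'sirq': 50} for a 'CPU:' line, else None."""
    if not line.lstrip().startswith("CPU:"):
        return None
    return {name: int(pct) for pct, name in _FIELD.findall(line)}


def sirq_is_high(top_output, threshold=25):
    """True if any CPU summary line reports sirq at or above the threshold.

    The default threshold of 25 is an arbitrary assumption; pick a value
    between the healthy (0%) and hung (50%) readings seen on your device.
    """
    for line in top_output.splitlines():
        fields = parse_cpu_line(line)
        if fields and fields.get("sirq", 0) >= threshold:
            return True
    return False


# CPU lines copied from the reports earlier in this thread.
healthy = "CPU:   4% usr   0% sys   0% nic  95% idle   0% io   0% irq   0% sirq"
hung = "CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq"

print(sirq_is_high(healthy), sirq_is_high(hung))  # prints: False True
```

Run from cron on the monitoring host, a check like this could send a mail or trigger a reboot when it returns True, although an automatic reboot only hides the problem.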
