Re: Occasional wireless traffic hang
I am seeing occasional hangs on my WAP network. Ssh to the WAP normally shows 95% idle and 0% sirq in top:
Tue Jan 22 23:00:01 UTC 2019
23:00:01 up 1 day, 23:54, load average: 3.01, 3.01, 3.04
Mem: 78828K used, 176300K free, 0K shrd, 0K buff, 23848K cached
CPU: 4% usr 0% sys 0% nic 95% idle 0% io 0% irq 0% sirq
Load average: 3.01 3.02 3.05 1/66 15057
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
15057 15054 root     R     1408   1%   0   5% top -b -n1
 1405  1179 root     S    13876   5%   1   0% /usr/sbin/snmpd -f -c /tmp/snmpd.c
 1179     1 root     S    12044   5%   0   0% /usr/sbin/dman
 1172     1 root     S     4992   2%   0   0% /usr/sbin/mapd
Since the hang 20 minutes ago, there has been almost no WAP network traffic, but ssh to the WAP shows that top reports 50% idle and 50% sirq (ksoftirqd):
Tue Jan 22 23:20:45 UTC 2019
23:20:45 up 2 days, 15 min, load average: 4.02, 4.04, 3.74
Mem: 78904K used, 176224K free, 0K shrd, 0K buff, 23872K cached
CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq
Load average: 4.03 4.04 3.74 2/66 19837
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
    9     2 root     RW       0   0%   1  50% [ksoftirqd/1]
 1405  1179 root     S    13876   5%   0   0% /usr/sbin/snmpd -f -c /tmp/snmpd.c
 1179     1 root     S    12044   5%   0   0% /usr/sbin/dman
 1172     1 root     S     4992   2%   0   0% /usr/sbin/mapd
Any idea why this is happening? It is happening regularly. I am on firmware 3.9.0.3 and recently did a factory default reset.
Accepted Solutions
I am happy to report that firmware 3.9.1.0 has been released, and I think it fixes the problem reported in this thread. The release notes, dated August 13, 2019, mention these two fixes:
1. Addressed intermittent, rarely encountered access point hang issues.
2. Fixed various stability and connectivity issues.
I think this closes the issue. I will keep my monitoring in place for another year to verify the fix. Thanks to all who helped.
All Replies
Re: Occasional wireless traffic hang
As an update to this report, the 50% sirq continued for 17 hours, until I rebooted the WAP. You can see the dramatic change at exactly 2 days of uptime in these hourly 'top' reports:
CPU: 0% usr 4% sys 0% nic 95% idle 0% io 0% irq 0% sirq
CPU: 0% usr 4% sys 0% nic 95% idle 0% io 0% irq 0% sirq
CPU: 0% usr 4% sys 0% nic 95% idle 0% io 0% irq 0% sirq
CPU: 0% usr 4% sys 0% nic 95% idle 0% io 0% irq 0% sirq
CPU: 0% usr 4% sys 0% nic 95% idle 0% io 0% irq 0% sirq
CPU: 4% usr 0% sys 0% nic 95% idle 0% io 0% irq 0% sirq
CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq
CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq
CPU: 0% usr 4% sys 0% nic 45% idle 0% io 0% irq 50% sirq
CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq
CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq
CPU: 4% usr 4% sys 0% nic 40% idle 0% io 0% irq 50% sirq
After 17 hours of 50% sirq but before the WAP reboot, I disconnected every wifi device from the WAP, and verified there were no connected devices from the WAP dashboard, but the WAP was still showing 50% sirq. I just rebooted the WAP and it is back to 0% sirq, and not slow.
I will keep monitoring 'top' after the reboot and get an alert if sirq% gets high. I am curious to see if it gets a high sirq% at exactly two days of uptime again. Does something special happen to the WAP at two days of uptime? I could automatically reboot the WAP when the sirq% gets high, but that hardly seems like a clean fix.
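For anyone who wants to set up the same kind of alert, here is a minimal sketch of the check in Python. It assumes the BusyBox-style "CPU:" summary line shown in the reports above; the function names and the 25% threshold are illustrative choices, not anything taken from the WAP firmware or from the monitoring actually used in this thread.

```python
import re

# Matches "<number>% <field>" pairs in a line such as
# "CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq"
CPU_FIELD = re.compile(r"(\d+)%\s+(\w+)")


def parse_cpu_line(top_output):
    """Return e.g. {'usr': 0, ..., 'sirq': 50} from the first 'CPU:' line, or None."""
    for line in top_output.splitlines():
        line = line.strip()
        if line.startswith("CPU:"):
            return {name: int(pct) for pct, name in CPU_FIELD.findall(line)}
    return None


def sirq_is_high(top_output, threshold=25):
    """True when the sirq share reaches the (hypothetical) alert threshold."""
    cpu = parse_cpu_line(top_output)
    return cpu is not None and cpu.get("sirq", 0) >= threshold


# Two summary lines copied from the reports in this thread.
HEALTHY = "CPU: 4% usr 0% sys 0% nic 95% idle 0% io 0% irq 0% sirq"
HUNG = "CPU: 0% usr 0% sys 0% nic 50% idle 0% io 0% irq 50% sirq"

print(sirq_is_high(HEALTHY), sirq_is_high(HUNG))  # prints: False True
```

The input would be the text of `top -b -n1` captured over ssh; how that text is fetched and how the alert is delivered is left out here.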
Re: Occasional wireless traffic hang
Looks very familiar - WAC730, 3.9.0.3, ... slow or intermittent wireless access only after some uptime
WAC730-1# uptime
00:26:48 up 3 days, 1:47, load average: 4.07, 4.08, 4.06
WAC730-1# top
Mem: 77560K used, 177568K free, 0K shrd, 0K buff, 20776K cached
CPU: 0% usr 0% sys 0% nic 49% idle 0% io 0% irq 50% sirq
Load average: 4.03 4.04 4.05 3/70 4210
PID PPID USER STAT VSZ %MEM CPU %CPU COMMAND
9 2 root RW 0 0% 1 47% [ksoftirqd/1]
...
Re: Occasional wireless traffic hang
Ah, interesting. I have a Debian server with ssh access to the WAP so I can monitor the 'top' output and get an alert when sirq gets high. Once it happens again, I will try reverting to a previous firmware, maybe 3.8.3.0, and see if it happens again.
My family has been complaining about wifi hangs and disconnects for about six months, and that matches the time I installed the 3.9.0.3 firmware. It will take me perhaps another week to come to a conclusion on this. I will keep reporting on my progress.
Re: Occasional wireless traffic hang
Another odd one - skipping a MAC address without any indication in the product logs:
WAC730-1# dmesg
processpmq: skip entry with mc/bc address 41:4e:36:84:48:6e
wl1: wlc_bmac_processpmq: skip entry with mc/bc address 41:4e:36:84:48:6e
wl1: wlc_bmac_processpmq: skip entry with mc/bc address 41:4e:36:84:48:6e
...
It's a valid device, an HTC phone.
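One observation on the driver's wording: in IEEE 802 MAC addresses, the least significant bit of the first octet is the group bit, and it is set for multicast and broadcast addresses. The first octet logged above, 0x41, has that bit set. A minimal sketch of the check (the function name is illustrative):

```python
def is_group_address(mac):
    """True if the group bit (least significant bit of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


print(is_group_address("41:4e:36:84:48:6e"))  # address from the dmesg lines above
print(is_group_address("ff:ff:ff:ff:ff:ff"))  # broadcast address
```

This only shows why the driver would label that address "mc/bc". It does not tell us whether the address in the log really belongs to the phone or was read correctly by the driver.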
Re: Occasional wireless traffic hang
Priority should be on the sirq issue started by the OP. That MAC message popped up while I was peeking and poking at the system (silly me, after a reboot). I will capture the logs and share them by PM.
Re: Occasional wireless traffic hang
@bmomjian wrote:
Ah, interesting. I have a Debian server with ssh access to the WAP so I can monitor the 'top' output and get an alert when sirq gets high. Once it happens again, ...
Before reverting, go to Monitoring -> Logs -> Save As ..., upload the .tar to any cloud share, and send the link to @RaghuHR for Netgear inspection.
Re: Occasional wireless traffic hang
OK, I will grab the logs once it happens again, though I have sent 50% sirq logs to you before and you said it looked fine.
Re: Occasional wireless traffic hang
Last time I got the 50% sirq at exactly 48 hours of uptime. I have passed 48 hours since the recent reboot and the sirq% is still zero. I will report back and grab the logs as soon as my logging informs me that sirq has increased.
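To compare failure times across reboots, the uptime string can be turned into minutes. The sketch below is in Python and only handles the formats that actually appear in this thread ("up 1 day, 23:54", "up 2 days, 15 min", "up 3 days, 1:47"); other formats that `uptime` may print are not covered, and the function name is made up for illustration.

```python
import re

# Optional "N day(s), " prefix, then either "H:MM" or "M min".
UPTIME = re.compile(
    r"up\s+(?:(?P<days>\d+)\s+days?,\s+)?"
    r"(?:(?P<hours>\d+):(?P<mins>\d+)|(?P<only_mins>\d+)\s+min)"
)


def uptime_minutes(line):
    """Return the uptime in minutes, or None if the line is not recognized."""
    m = UPTIME.search(line)
    if m is None:
        return None
    days = int(m.group("days") or 0)
    if m.group("only_mins") is not None:
        hours, mins = 0, int(m.group("only_mins"))
    else:
        hours, mins = int(m.group("hours")), int(m.group("mins"))
    return (days * 24 + hours) * 60 + mins


# Uptime lines quoted earlier in the thread.
print(uptime_minutes("23:00:01 up 1 day, 23:54, load average: 3.01, 3.01, 3.04"))
print(uptime_minutes("23:20:45 up 2 days, 15 min, load average: 4.02, 4.04, 3.74"))
```

Logging this value next to each hourly sirq reading would make it easy to see whether the jump really lines up with 48 hours (2880 minutes) each time.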
Re: Occasional wireless traffic hang
Mem: 77772K used, 177356K free, 0K shrd, 0K buff, 20872K cached
CPU: 0% usr 0% sys 0% nic 49% idle 0% io 0% irq 50% sirq
Load average: 4.00 4.01 4.05 6/71 7314
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
    9     2 root     RW       0   0%   1  50% [ksoftirqd/1]
 1111     1 root     S    11656   5%   0   0% /usr/sbin/dman
 1518  1111 root     S    13880   5%   0   0% /usr/sbin/snmpd -f -c /tmp/snmpd.conf udp:161,udp6:161 -LO 0
 1626  1111 root     R     7196   3%   0   0% /usr/sbin/cportald
 1104     1 root     R     4984   2%   0   0% /usr/sbin/mapd
 1597  1111 root     S     4496   2%   0   0% /usr/bin/mini_httpd-ssl -S -E /etc/mini_httpd.pem -D -p1 80 -p2 443 -t 5 -s 1 -u root -c ./*.cgi|*.cgi|cgi-bin/
 1593  1111 root     S     4440   2%   0   0% /usr/bin/mini_httpd-ssl -D -p1 80 -p2 443 -t 5 -s 1 -u root -c ./*.cgi|*.cgi|cgi-bin/* -n 20 -Y ALL:!aNULL:!eNU
 1541     1 root     R     4312   2%   0   0% /usr/sbin/dhcpdump brtrunk
 1415  1111 root     S     4228   2%   0   0% /usr/sbin/mapqosd
  986     1 root     S     3736   1%   0   0% /usr/sbin/tspec
 1542  1111 root     S     2244   1%   0   0% /usr/sbin/hostapd /tmp/hostapd.conf.wlan0 /tmp/hostapd.conf.wlan1
 1073     1 root     S     1696   1%   0   0% /usr/sbin/asengd
 1416  1111 root     S     1540   1%   0   0% /usr/sbin/lldpd
 8774  8770 root     S     1488   1%   0   0% -splash
 1066     1 root     S     1424   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_ae.ko
 1427  1111 root     S     1416   1%   0   0% /sbin/udhcpc -l /tmp/udhcpc.lease.brtrunk --foreground -i brtrunk -H WAC730-1 -s /usr/share/udhcpc/dhcp_client.
    1     0 root     S     1412   1%   0   0% init
15908  8774 root     R     1412   1%   0   0% top
 1133  1111 root     S     1412   1%   0   0% /sbin/getty -L 115200 ttyS0
  940     1 root     S     1400   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_wds.ko
Lucky me: I had to move the WAC730 from the Insight switch to an elderly GS510TP to check whether the latent IGMP multicast issues (I still can't get any Apple Drive Bonjour announcements, e.g. for Time Machine) are caused by the new Insight switch firmware 1.0.4.16 or by the WAC505/510 on 5.0.10.2. Sorry for going slightly off topic, but the 50% sirq is now obviously gone again for a while.
Re: Occasional wireless traffic hang
That looks just like mine, exactly 50%. How long has this WAP been up? Also, is this from a WAC730? What firmware version?
Re: Occasional wireless traffic hang
Same WAC730 as above again, up for a few days, 3.9.0.3 as before.
Re: Occasional wireless traffic hang
Well, the most recent occurrence was after 48 hours, but now I have been up for 5 days, 3:34 and still see 0% sirq. I am monitoring hourly. Can you do the dump requested earlier in the thread and email it to him? I don't know when my failure is going to happen again.
Re: Occasional wireless traffic hang
OK, please let us know if you see it again.
Re: Occasional wireless traffic hang
Uh, I made the same mistake last time too. 😞 We will get this!
Re: Occasional wireless traffic hang
@bmomjian wrote:
Uh, I made the same mistake last time too. 😞 We will get this!
We're lucky 8-)
Mem: 80920K used, 174208K free, 0K shrd, 0K buff, 22372K cached
CPU: 0% usr 0% sys 0% nic 49% idle 0% io 0% irq 50% sirq
Load average: 4.00 4.01 4.05 2/70 6976
  PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
    9     2 root     RW       0   0%   1  50% [ksoftirqd/1]
 1540     1 root     S     4312   2%   0   0% /usr/sbin/dhcpdump brtrunk
 1518  1111 root     S    13880   5%   0   0% /usr/sbin/snmpd -f -c /tmp/snmpd.conf udp:161,udp6:161 -LO 0
 1111     1 root     S    11996   5%   0   0% /usr/sbin/dman
 1625  1111 root     S     7196   3%   0   0% /usr/sbin/cportald
 1104     1 root     S     4984   2%   0   0% /usr/sbin/mapd
 1596  1111 root     S     4496   2%   0   0% /usr/bin/mini_httpd-ssl -S -E /etc/mini_httpd.pem -D -p1 80 -p2 443 -t 5 -s 1 -u root -c ./*.cgi|*.cgi|cgi-bin/
 1592  1111 root     S     4440   2%   1   0% /usr/bin/mini_httpd-ssl -D -p1 80 -p2 443 -t 5 -s 1 -u root -c ./*.cgi|*.cgi|cgi-bin/* -n 20 -Y ALL:!aNULL:!eNU
 1415  1111 root     S     4228   2%   0   0% /usr/sbin/mapqosd
  986     1 root     S     3736   1%   0   0% /usr/sbin/tspec
 5892  1111 root     S     2252   1%   0   0% /usr/sbin/hostapd /tmp/hostapd.conf.wlan0 /tmp/hostapd.conf.wlan1
 1073     1 root     S     1696   1%   0   0% /usr/sbin/asengd
 1416  1111 root     S     1540   1%   0   0% /usr/sbin/lldpd
 6893  6886 root     S     1488   1%   0   0% -splash
 1066     1 root     S     1424   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_ae.ko
 1427  1111 root     S     1416   1%   1   0% /sbin/udhcpc -l /tmp/udhcpc.lease.brtrunk --foreground -i brtrunk -H WAC730-1 -s /usr/share/udhcpc/dhcp_client.
    1     0 root     S     1412   1%   0   0% init
 6894  6893 root     R     1412   1%   0   0% top
 1133  1111 root     S     1412   1%   0   0% /sbin/getty -L 115200 ttyS0
  940     1 root     S     1400   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_wds.ko
  972     1 root     S     1400   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_rfscan.ko
  957     1 root     D     1400   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_dl2tunnel.ko
  891     1 root     D     1400   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_l2tunnel.ko
  892     1 root     D     1400   1%   0   0% insmod /lib/modules/2.6.36.4/extra/wlext_l2tunnel.ko
 6886  1342 root     S     1100   0%   0   0% /usr/sbin/dropbear -E -F -d /tmp/dss.key -r /tmp/rsa.key
 1342  1111 root     S     1044   0%   0   0% /usr/sbin/dropbear -E -F -d /tmp/dss.key -r /tmp/rsa.key
 1676  1111 root     S      952   0%   0   0% /usr/sbin/sntp -s 3600 0 time-b.netgear.com
 6952  1111 root     S      836   0%   0   0% /usr/sbin/mDNSResponderPosix -f /tmp/bonjour_services
 1512  1111 root     S      716   0%   0   0% /usr/sbin/syslogd
 1145  1111 root     S      644   0%   0   0% /usr/bin/eapd
 1045     1 root     S      628   0%   0   0% /usr/bin/ifmon
 1509     1 root     S      628   0%   0   0% ifmon
Kernel:
WAC730-1# dmesg
ndefined error)
wl0: wlc_iovar_op: BCME -1 (Undefined error)
wl1: wlc_iovar_op: BCME -1 (Undefined error)
wl0: wlc_iovar_op: BCME -1 (Undefined error)
wl0: wlc_iovar_op: BCME -1 (Undefined error)
wl1: wlc_iovar_op: BCME -1 (Undefined error)
...
Let's see ...
WAC730-1# cat /proc/interrupts
           CPU0       CPU1
 27:         36          0  GIC  mpcore_gtimer
 32:          0          0  GIC  L2C
117:       2640          0  GIC  serial
163:          0    5951497  GIC  wlan0
169:          0    3683950  GIC  wlan1
179:    3041369          0  GIC  eth0
IPI:      70538      69585
LOC:   19178805   19043112
Err:          0
WAC730-1# cat /proc/interrupts
           CPU0       CPU1
 27:         36          0  GIC  mpcore_gtimer
 32:          0          0  GIC  L2C
117:       2640          0  GIC  serial
163:          0    5951611  GIC  wlan0
169:          0    3683950  GIC  wlan1
179:    3041389          0  GIC  eth0
IPI:      70538      69585
LOC:   19179335   19043607
Err:          0
WAC730-1# cat /proc/interrupts
           CPU0       CPU1
 27:         36          0  GIC  mpcore_gtimer
 32:          0          0  GIC  L2C
117:       2640          0  GIC  serial
163:          0    5952063  GIC  wlan0
169:          0    3683950  GIC  wlan1
179:    3041456          0  GIC  eth0
IPI:      70538      69586
LOC:   19181330   19045488
Err:          0
So let's focus on 163, 169, 179, and LOC.
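The counters only mean something as differences between snapshots. Here is a small Python sketch that subtracts two `/proc/interrupts` dumps; the sample rows are the first and third snapshots above, trimmed to the four rows of interest, and the function names are illustrative.

```python
def parse_interrupts(text):
    """Map each row label (e.g. '163', 'LOC') to (per-CPU counts, description)."""
    rows = {}
    for line in text.splitlines():
        label, sep, rest = line.strip().partition(":")
        if not sep:
            continue  # header ("CPU0 CPU1") and blank lines have no colon
        fields = rest.split()
        counts = []
        while fields and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        rows[label] = (counts, " ".join(fields))
    return rows


def interrupt_deltas(before, after):
    """Per-CPU counter increase for every row present in both snapshots."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    deltas = {}
    for label, (counts, desc) in new.items():
        if label in old:
            prev = old[label][0]
            deltas[label] = ([b - a for a, b in zip(prev, counts)], desc)
    return deltas


FIRST = """\
           CPU0       CPU1
163:          0    5951497  GIC  wlan0
169:          0    3683950  GIC  wlan1
179:    3041369          0  GIC  eth0
LOC:   19178805   19043112
"""
THIRD = """\
           CPU0       CPU1
163:          0    5952063  GIC  wlan0
169:          0    3683950  GIC  wlan1
179:    3041456          0  GIC  eth0
LOC:   19181330   19045488
"""

for label, (delta, desc) in interrupt_deltas(FIRST, THIRD).items():
    print(label, delta, desc)
```

On these samples wlan0 grows by 566 on CPU1, eth0 by 87 on CPU0, and wlan1 does not move at all. The time between the snapshots was not recorded, so these are not rates.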
WAC730-1# cat /proc/softirqs
                    CPU0       CPU1
          HI:          0          0
       TIMER:    9445575    9445441
      NET_TX:         81         64
      NET_RX:    2104183      12945
       BLOCK:          0          0
BLOCK_IOPOLL:          0          0
     TASKLET:    4446053  109704001
       SCHED:    9030682    7939783
     HRTIMER:          0          0
         RCU:    1467424    1531885
WAC730-1# cat /proc/softirqs
                    CPU0       CPU1
          HI:          0          0
       TIMER:    9445734    9445600
      NET_TX:         81         64
      NET_RX:    2104196      12945
       BLOCK:          0          0
BLOCK_IOPOLL:          0          0
     TASKLET:    4446080  109735563
       SCHED:    9030841    7939791
     HRTIMER:          0          0
         RCU:    1467443    1531908
WAC730-1# cat /proc/softirqs
                    CPU0       CPU1
          HI:          0          0
       TIMER:    9445990    9445856
      NET_TX:         81         64
      NET_RX:    2104214      12945
       BLOCK:          0          0
BLOCK_IOPOLL:          0          0
     TASKLET:    4446116  109786311
       SCHED:    9031088    7939803
     HRTIMER:          0          0
         RCU:    1467498    1531957
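To see which softirq type is actually moving, the same before/after subtraction can be ranked by growth. Below is a rough Python sketch using the first and third `/proc/softirqs` snapshots (rows with all-zero or unchanged counters are left out of the sample for brevity); the function names are illustrative.

```python
def parse_softirqs(text):
    """Return {softirq name: {cpu name: count}} from /proc/softirqs text."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    cpus = lines[0]  # header row: CPU0 CPU1 ...
    table = {}
    for fields in lines[1:]:
        name = fields[0].rstrip(":")
        table[name] = dict(zip(cpus, map(int, fields[1:])))
    return table


def busiest_softirq(before, after):
    """Return (increase, softirq name, cpu) for the fastest-growing counter."""
    old, new = parse_softirqs(before), parse_softirqs(after)
    growth = [
        (new[name][cpu] - old[name][cpu], name, cpu)
        for name in new if name in old
        for cpu in new[name] if cpu in old[name]
    ]
    return max(growth)


FIRST = """\
                    CPU0       CPU1
       TIMER:    9445575    9445441
      NET_RX:    2104183      12945
     TASKLET:    4446053  109704001
       SCHED:    9030682    7939783
         RCU:    1467424    1531885
"""
THIRD = """\
                    CPU0       CPU1
       TIMER:    9445990    9445856
      NET_RX:    2104214      12945
     TASKLET:    4446116  109786311
       SCHED:    9031088    7939803
         RCU:    1467498    1531957
"""

print(busiest_softirq(FIRST, THIRD))  # prints: (82310, 'TASKLET', 'CPU1')
```

TASKLET on CPU1 grows by 82310 between these two snapshots while TIMER grows by 415 on each CPU, which would be consistent with the load seen on [ksoftirqd/1]. It does not show which driver is scheduling those tasklets.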
Hm... dropping a PM to @RaghuHR with the logs.
Re: Occasional wireless traffic hang
Great, thanks! It has not happened here since my last report.
Re: Occasional wireless traffic hang
Experiencing the high sirq load almost every other day, here at the home office as well as on your customer sites. Not nice, as connectivity, reliability and customer happiness are impacted.
Re: Occasional wireless traffic hang
I got a WAC7xx beta firmware, v3.9.0.15, under NDA for standalone usage, so I can't share it (talk to @DaneA). I was asked to provide feedback in the public community.
WAC730-1# uptime
14:52:09 up 1 day, 20:17, load average: 3.00, 3.01, 3.04
No unexpected high sirq load leading to a partial DoS situation so far. Testing and monitoring continue.
TIA,
-Kurt
Re: Occasional wireless traffic hang
Were you getting high sirqs before the update? Any idea on a cause? I am monitoring sirq and have not seen any spikes since my earlier report.