
WNDAP350's are constantly rebooting

RSchwein
Aspirant

WNDAP350's are constantly rebooting

WNDAP350 significant problems

I have 5 of these on various network segments over a large geographical area. All are using different wiring, different network switches, nothing in common with any of them except one thing. All are experiencing exactly the same problem.

Log excerpt

Mar 21 11:41:51 hostapd: Network Integrality: Ping Failed

Mar 21 11:41:51 hostapd: Network Integrality: Host: 10.110.255.254 is down.Bringing down all the vaps

Mar 21 11:41:51 kernel: brtrunk: port 2(wifi0vap0) entering disabled state

Mar 21 11:41:51 hostapd: wifi0vap0: STA 00:1b:63:cc:f5:40 IEEE 802.11: disassociated

Mar 21 11:41:51 hostapd: wifi0vap0: STA 00:26:4a:c1:51:26 IEEE 802.11: disassociated



Here is what I've learned:

When there are no associations at all (over a weekend for example) there are no problems; pings are always successful.

Once there are a couple of associations (two or so), the above occurs about once every 4 hours. As the number of associations grows, the “Ping Failed” occurs more frequently. On one with about 30 associations, the failure occurs about every 20 minutes, sometimes more often.

I am aware that these 350s ping their respective default gateways. Examination of the routers' logs shows the routers are working flawlessly. Indeed, nobody “wired” on the various network segments is having any problems. Each 350 has its own Gig/sec connection. Measured network traffic is rarely over 3% of available capacity.

So, in the 350's firmware, is it waiting a shorter and shorter amount of time for a ping reply as the number of associations increases? Is it therefore timing out when, if it waited just a little longer, it would be satisfied?

I've got to solve this problem before I can deploy any more 350s. I'm not replacing any more WAP54Gs, which are working just fine.

Thanks for any help here.

Bob
Message 1 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

This morning I replaced one of the WNDAP350s with a Cisco Aironet 3500. Been watching it all day. 36 connections and it's been working great.

So, right now my WAP54Gs are working fine, the 3500 is working fine, and the Netgear 350s are periodically turning off their radios because they think they're not connected to a network.

Hmmmm.... I wonder where the problem lies?
Message 2 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

hi,

Sorry, I don't think I can help you; just to say we have many WNDAP350s (firmware 2.0.9) and we have never had your issue.
They work great (with bugs 😄 but nothing serious)

Did you call support?

Hope you'll find the way out 😉
Message 3 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Yeah... I'm running the latest firmware also.
I just replaced another one with an Apple Airport. The Airport has been working flawlessly all day.
So now...
All the Linksys units have always been working well.
The Aironet has been working flawlessly.
The Apple Airport has been working without problems all day.
Just the 350s are periodically deciding they are not wired to the network and then turning off their radios. And the higher the number of associations, the more frequently they decide they don't have a wired connection.

I appreciate you relaying your experience with them.

Not contacted support yet. Wanted to make sure of the facts before I talked to Level 1/2 support.

Bob
Message 4 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

Since we got our WNDAP350s we have never needed to reboot them or anything like that, except when we upgraded the firmware ;).
We use a lot of SSIDs, with 2.4 GHz and 5 GHz together. They are wired to a Netgear GSM7224v2. Jumbo frames are disabled for the APs. We also have 1 old Cisco Aironet 1131AG. All work flawlessly.
Did you try Wireshark to see what's going on? A factory reset? Another switch?

😉
Message 5 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

I should point out that I believed my 350s were working fine. The users never really reported any problems other than the usual “sometimes slow printing” or “my Email client required I re-authenticate to the Email server”; things that could be solved by updating out-of-date drivers or not wandering between cells, etc. Most users attributed their problems to old operating systems, not enough memory, or “this is just the way the laptops run”. It's the OS; just reboot and everything works fine.

When looking at the 350's logs I would see nothing wrong. At that time I didn't really pay attention to the fact that the logs only kept about 10 minutes of data.

The way I found out that these units had problems is from my network management software. It checks whether all the units on my network are alive by pinging them every 10 seconds. I usually do not check its logs because I see everything in the green all the time. However, one day I did check the logs and saw that the 350s were frequently not there, then reappearing in about 30 seconds or so. When looking at the 350 logs I didn't see anything because the problem had already scrolled out of the buffer. It was a cat-and-mouse game for several days until I caught a 350 going dark. I immediately checked its log and found the “ping failure”. I saved the log. After doing a lot more legwork I picked up on the fact that the problem was more severe as the number of associations increased. And the problem was only associated with the 350s.
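
For anyone who wants to catch the same thing without a network management package, here is a rough Python sketch of that kind of check: ping each AP on an interval and record only the moments it changes between reachable and unreachable. The helper names (`ping_once`, `poll`, `transitions`) and the address in the usage comment are made up for illustration, and the `ping` flags assume a Linux-style `ping`; this is not how any particular monitoring product works.

```python
import subprocess
import time
from typing import Iterable, Iterator, Tuple


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo request with the system ping (Linux-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def poll(host: str, interval_s: float = 10.0, probe=ping_once) -> Iterator[Tuple[float, bool]]:
    """Endless stream of (timestamp, alive) samples for one host."""
    while True:
        yield time.time(), probe(host)
        time.sleep(interval_s)


def transitions(samples: Iterable[Tuple[float, bool]]) -> Iterator[Tuple[float, str]]:
    """Yield (timestamp, "DOWN" or "UP") only when reachability changes."""
    previous = True  # assumption: the AP is reachable when monitoring starts
    for timestamp, alive in samples:
        if alive != previous:
            yield timestamp, "UP" if alive else "DOWN"
            previous = alive


# Usage (runs until interrupted; 192.0.2.10 is a placeholder address):
# for ts, state in transitions(poll("192.0.2.10")):
#     print(time.strftime("%b %d %H:%M:%S", time.localtime(ts)), state)
```

Logging only the transitions keeps the record short, so a 30-second dropout stands out instead of being buried in a wall of green.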

So, like I said, my users hadn't made me aware of any problems that I could associate with anything other than normal under-provisioned hardware for the kind of heavy-duty apps they were trying to use.

It's quite possible others have this problem and never know it.

Just some thoughts.

Bob
Message 6 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

It's not a switch issue. These 350s are on different parts of the network with different switches and different routers. They are pinging their default gateways (all different) and for no apparent reason deciding they are not wired to the network. They turn off their radios. In about 30 seconds (give or take) they successfully ping their gateway and turn everything back on. Users who are browsing the Internet never know it's happening. Users who are connected to file servers never know the wireless went down and then came back up because they are reconnected to their resources in the background. Yes, I did a Wireshark capture. The default gateways are replying. The 350s are just ignoring them. Since the problem is worse when the 350 has more associations, I'm guessing it's a timing issue. I don't see jumbo frames being a factor with the 350 pinging and listening for a reply.
Message 7 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

Hmm, I see...
Do you have "detecting rogue AP" enabled?
Do not use that, it's a sh.... :eek:
We use HostMonitor here to check our network. I have added a ping test every 10 s for our 350 APs. Let's see if your issue occurs here too. :confused:

You have talked about the AP log (buffer log, 10 min, etc.) but here, the system log has never worked! Nothing appears in the window! We called support and they said that this window shows only important events like reboots, modified settings... but for us: nothing! 😄
Fortunately we use Kiwi Syslog Server and we can see all events coming from the 350 APs. 😉
Maybe you can post a screenshot of your system log here?
Message 8 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

There's a 350 log excerpt in my first post. Ping fails, the 350 disassociates everybody, then turns off its radios. A short time later the ping is successful and it turns everything back on. With everyone disassociated it appears it now has time to actually hear the echo reply.
Message 9 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

overlook237 wrote:
Hmm, I see...
Do you have "detecting rogue AP" enabled?
Do not use that, it's a sh.... :eek:
We use HostMonitor here to check our network. I have added a ping test every 10 s for our 350 APs. Let's see if your issue occurs here too. :confused:

You have talked about the AP log (buffer log, 10 min, etc.) but here, the system log has never worked! Nothing appears in the window! We called support and they said that this window shows only important events like reboots, modified settings... but for us: nothing! 😄
Fortunately we use Kiwi Syslog Server and we can see all events coming from the 350 APs. 😉
Maybe you can post a screenshot of your system log here?


And another note, all my 350 logs are showing everything: associations, disassociations, encryption key exchanges, etc. That's why I had to work so hard to capture the moment the ping failed. The buffer is so small.
Message 10 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Yes, rogue is on. Problems? I'll turn it off on some to see if it makes a difference.
Message 11 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

Yes, it can be...
Here, when rogue detection was on, users were still complaining about erratic WiFi performance and drop-offs!
When rogue detection is on, the AP continuously scans your network to find other unwanted (or not) APs. When set to off, connections work flawlessly 😉

For the 350 system log, I have found why there is nothing here! We have configured the APs to use a syslog server. If I disable the syslog feature, all logs show up in the system log window! Hmm, Netgear support: :mad: Netgear never told me that...
But, if you can, use a syslog server (Kiwi or other software of your choice) rather than the AP buffer; there is no log size limit, you can archive, email, etc.

In the 1 year we have had these APs, all logs have been archived. I have searched for the "entering disabled state" keywords and only found that each time an AP entered the disabled state it was for a technical reason and not an issue.
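
If the syslog server writes plain text files, that kind of keyword search can be sketched in a few lines of Python. The keyword list is taken from the log lines posted in this thread; the function name `interesting` and the file name in the usage comment are placeholders, not anything Kiwi or Netgear provides.

```python
from typing import Iterable, List, Sequence

# Phrases seen in the WNDAP350 log excerpts posted in this thread.
KEYWORDS = (
    "Ping Failed",
    "is down",
    "entering disabled state",
    "transmit timed out",
)


def interesting(lines: Iterable[str], keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the log lines that contain at least one of the keywords."""
    return [line.rstrip("\n") for line in lines if any(k in line for k in keywords)]


# Usage (the file name is hypothetical):
# with open("wndap350-archive.log", encoding="utf-8", errors="replace") as fh:
#     for hit in interesting(fh):
#         print(hit)
```

Running this over the archive instead of the AP's own buffer avoids the problem of the event scrolling away before anyone looks.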

Try with rogue detection set to off and tell us whether it works (better) or not 😄
Message 12 of 46
Glith
Aspirant

Re: WNDAP350's are constantly rebooting

Try disabling DFS on the 5 GHz radio....
I found a bug where the DFS got stuck in a loop and disabled all network traffic.
Message 13 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

Our ping test results: 2.4 and 5 GHz are always UP for all APs 😉
Message 14 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Glith wrote:
Try disabling DFS on the 5 GHz radio....
I found a bug where the DFS got stuck in a loop and disabled all network traffic.



Ciscos are still working fine.
Linksys units are still working fine.
Apple Airport still working fine.
Netgear 350s still turning off their radios. For an experiment I put two 350s onto the same network segment about 100' apart with low power settings and different channels. Of course both were pinging the same router. I watched them for a while. After about an hour one had a ping failure and disconnected. But the other was doing just fine. After about a minute it came back up. A little while later the second had a ping failure and disconnected. However the first was still fine. Everything I've done seems to point conclusively to something in the 350 having problems.

I've just turned off all the 5 GHz radios. We'll see.
Message 15 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

A faulty hardware batch? :confused:
To be sure, call Netgear tech support and ask for an RMA for 1 AP and test.

😉
Message 16 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

overlook237 wrote:
A faulty hardware batch? :confused:
To be sure, call Netgear tech support and ask for an RMA for 1 AP and test.

😉


Now that I know where to look I've been packet capturing with a focus. Comparing what I capture with the 350's logs shows that the 350 and reality are not collocated.

When the 350 goes offline its logs show that it is pinging and not getting a reply. However, packet capture shows that all pinging stops. Indeed, during the 350's blackout nobody on the network can ping the 350. It appears to have turned off its Ethernet port. Yet its log shows that it believes it is connected to the network and it continues to ping.
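
The comparison between what the log claims and what the wire shows can be roughed out as follows. This is only a sketch: it assumes the ping attempt times from the 350's log and the ICMP echo request times from the capture have already been pulled out as seconds on a common clock, and the function name `unmatched` and the 2-second tolerance are my own choices.

```python
from bisect import bisect_left
from typing import List, Sequence


def unmatched(
    log_times: Sequence[float],
    wire_times: Sequence[float],
    tolerance_s: float = 2.0,
) -> List[float]:
    """Return logged ping attempts with no captured echo request within tolerance_s."""
    wire = sorted(wire_times)
    missing = []
    for t in log_times:
        # First captured packet at or after the start of the tolerance window.
        i = bisect_left(wire, t - tolerance_s)
        if i == len(wire) or wire[i] > t + tolerance_s:
            missing.append(t)
    return missing


# Illustrative values only: the log shows attempts at 0, 4, 8 and 12 s,
# while the capture only saw echo requests near 0 and 4 s.
# unmatched([0, 4, 8, 12], [0.1, 4.2]) returns the attempts at 8 and 12 s.
```

A long run of unmatched attempts would be consistent with the AP logging pings that never leave its Ethernet port.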

Firmware? Hardware? I haven't called Netgear yet; would I be the first to surface this problem?
Message 17 of 46
Glith
Aspirant

Re: WNDAP350's are constantly rebooting

Try to disable Auto channel if you have it set on any radio.
Message 18 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Glith wrote:
Try to disable Auto channel if you have it set on any radio.


They are all set to Auto channel. Heck, I've pretty much disabled everything else, why not this? I'll try it.

In the meantime, below is the relevant part of the 350's log:

(everything fine)

Mar 28 14:21:56 hostapd: Network Integrality: Ping Success
Mar 28 14:22:50 hostapd: Network Integrality: Ping Failed
Mar 28 14:22:50 hostapd: Network Integrality: Host: 10.100.255.254 is down.Bringing down all the vaps
Mar 28 14:22:50 kernel: brtrunk: port 2(wifi0vap0) entering disabled state
Mar 28 14:22:50 hostapd: wifi0vap0: STA 00:1f:f3:bd:f6:c3 IEEE 802.11: disassociated
Mar 28 14:22:50 hostapd: wifi0vap0: STA 00:16:ea:c3:44:b2 IEEE 802.11: disassociated

(kicks everyone off)

Mar 28 14:22:58 hostapd: Network Integrality: Ping Failed
Mar 28 14:23:01 hostapd: Network Integrality: Ping Failed
Mar 28 14:23:01 hostapd: Network Integrality: Host: 10.100.255.254 is down.Bringing down all the vaps
Mar 28 14:23:06 hostapd: Network Integrality: Ping Failed
Mar 28 14:23:09 hostapd: Network Integrality: Ping Failed
Mar 28 14:23:14 hostapd: Network Integrality: Ping Failed



Mar 28 14:35:04 hostapd: Network Integrality: Ping Failed
Mar 28 14:35:08 hostapd: Network Integrality: Ping Failed
Mar 28 14:35:11 kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar 28 14:35:11 kernel: ag7100_tx_timeout
Mar 28 14:35:11 kernel: ag7100_ring_free Freeing at 0x803ec000
Mar 28 14:35:11 kernel: ag7100_ring_free Freeing at 0x86a6c000
Mar 28 14:35:11 kernel: ag7100_ring_alloc Allocated 4800 at 0x803ec000
Mar 28 14:35:11 kernel: ag7100_ring_alloc Allocated 3024 at 0x86a6c000
Mar 28 14:35:11 kernel: AG7100: cfg1 0xf cfg2 0x7215
Mar 28 14:35:11 kernel: VSC8601: Found 0 unit 0:0 phy_addr: 1 id: 004dd04e
Mar 28 14:35:11 kernel: VSC8601: PHY is an Atheros F1E
Mar 28 14:35:11 kernel: VSC8601: unit 0 phy_addr 1
Mar 28 14:35:11 kernel: Writing 4
Mar 28 14:35:11 kernel: brtrunk: port 1(eth0) entering disabled state
Mar 28 14:35:12 hostapd: Network Integrality: Ping Failed
Mar 28 14:35:15 kernel: VSC8601: unit 0 phy_addr 1
Mar 28 14:35:15 kernel: AG7100: unit 0 phy is up...RGMii 1000Mbps full duplex
Mar 28 14:35:15 kernel: AG7100: pll reg 0x18050010: 0x11110000 AG7100: cfg_1: 0x1ff0000
Mar 28 14:35:15 kernel: AG7100: cfg_2: 0x3ff
Mar 28 14:35:15 kernel: AG7100: cfg_3: 0x18001ff
Mar 28 14:35:15 kernel: AG7100: cfg_4: 0xffff
Mar 28 14:35:15 kernel: AG7100: cfg_5: 0xfffef
Mar 28 14:35:15 kernel: AG7100: done cfg2 0x7215 ifctl 0x0 miictrl 0x22
Mar 28 14:35:15 kernel: brtrunk: port 1(eth0) entering learning state
Mar 28 14:35:16 hostapd: Network Integrality: Ping Failed
Mar 28 14:35:20 kernel: brtrunk: topology change detected, propagating
Mar 28 14:35:20 kernel: brtrunk: port 1(eth0) entering forwarding state
Mar 28 14:35:20 hostapd: Network Integrality: Ping Failed
Mar 28 14:35:21 hostapd: Network Integrality: Ping Success
Mar 28 14:35:21 hostapd: Network Integrality: Host: 10.100.255.254 is UP. Bringing up all the vaps
Mar 28 14:35:21 kernel: brtrunk: port 2(wifi0vap0) entering learning state
Mar 28 14:35:25 hostapd: Network Integrality: Ping Success
Mar 28 14:35:25 hostapd: Network Integrality: Host: 10.100.255.254 is UP. Bringing up all the vaps
Mar 28 14:35:26 kernel: brtrunk: topology change detected, propagating
Mar 28 14:35:26 kernel: brtrunk: port 2(wifi0vap0) entering forwarding state
Mar 28 14:35:32 hostapd: wifi0vap0: STA 60:fb:42:6d:6d:3e IEEE 802.11: associated
Mar 28 14:35:43 hostapd: wifi0vap0: STA 00:1f:f3:bd:f6:c3 IEEE 802.11: associated

Message 19 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Note that the 350 was down for over 10 minutes (between 14:23 and 14:35). All the while it was pinging away about every 4 seconds even though it had disabled its own Ethernet port!
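
To put a number on outages like this one, the "Host: ... is down" and "Host: ... is UP" lines can be paired up. Below is a minimal Python sketch that does that for log text in the format posted above. The function names are mine, and because syslog timestamps carry no year the `year` argument is an arbitrary assumption that only matters for logs spanning a year boundary.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple


def parse_stamp(line: str, year: int = 2000) -> datetime:
    """Parse the leading 15-character syslog timestamp, e.g. 'Mar 28 14:22:50'."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def outages(lines: Iterable[str], year: int = 2000) -> List[Tuple[datetime, datetime, timedelta]]:
    """Pair the first 'is down' line with the next 'is UP' line for each outage."""
    result = []
    down_since: Optional[datetime] = None
    for line in lines:
        if "Network Integrality: Host:" not in line:
            continue
        if "is down" in line and down_since is None:
            down_since = parse_stamp(line, year)
        elif "is UP" in line and down_since is not None:
            up_at = parse_stamp(line, year)
            result.append((down_since, up_at, up_at - down_since))
            down_since = None
    return result
```

Repeated "is down" or "is UP" lines inside the same outage are ignored, so each blackout is counted once from the first "down" to the first "UP".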
Message 20 of 46
bserrato
Aspirant

Re: WNDAP350's are constantly rebooting

I'd like to know if you were able to resolve this. I'm having similar problems except I'm finding the log is not actively updating itself. I did see that at some point in the past the AP put all wireless connections into promiscuous mode then eth0 into promiscuous mode. I'm only using the 5GHz radio and switched from 40MHz to 20MHz today, but the painful pauses in connectivity continue.

Might any of you have any suggestions?
Message 21 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Tomorrow I'm going to disable the auto channeling. I've already turned off the 5 GHz radio, the rogue detection, ... I've not unplugged them yet, but that just may be my last step. Well, selling them on eBay may be the last step. :(
Message 22 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Well, I turned off auto channeling on all of them. I set them to the least congested frequencies in their areas. About 1 hour later one of them went down. Same problem: ping failure, then it turns off its Ethernet port but still continues to ping as if it's still connected. About 11 minutes later some kind of watchdog kicked in, cleared everything up, it turned its Ethernet port back on and guess what..... ping success.

About the only thing left to turn off is the device itself by unplugging it.

By the way, this is not happening on just one 350. I have five of them and they are all doing the same thing.
Message 23 of 46
RSchwein
Aspirant

Re: WNDAP350's are constantly rebooting

Just finished beta testing a firmware upgrade. Appears to have solved the problem.

Bob
Message 24 of 46
overlook
Tutor

Re: WNDAP350's are constantly rebooting

RSchwein wrote:
Just finished beta testing a firmware upgrade. Appears to have solved the problem.

Bob


hi RSchwein

Could you tell me which beta release you tested, please?

thank you
Message 25 of 46