superczar
Jun 23, 2020 (Apprentice)
Orbi system - Faulty LAN / switching design - Will NG ever fix it?
Issue - Orbi system with 2 satellites (R1, S1, S2) will randomly drop LAN communication between segments.
e.g. devices connected to S1 (both wired/wireless) won't be able to ping devices on S2 (or vice versa)
or sometimes devices on S2 won't be able to ping devices on R1 and vice versa
All devices will continue to have Internet access though.
Four years and counting, yet NG hasn't fixed this issue, which is so blatantly bad.
Here are threads detailing the issue:
https://community.netgear.com/t5/Orbi/Serious-Satellite-Connectivity-Bug/td-p/1303604
https://community.netgear.com/t5/Orbi/Orbi-loosing-part-of-the-network-NOT-Internet/m-p/1298723
I personally spent days trying to resolve it. Eventually I called it a day by keeping all devices on 2.1.4.16, on which the onset of this issue is relatively delayed.
After another recurrence a few days ago, I decided to upgrade to 2.5.1.16 in the hope it would be fixed by now.
Guess what, the issue started to recur within 30 mins of the upgrade.
Other points to note:
- System Restart fixes the issue temporarily
- After some time, 15 mins to a few hours, the LAN comm will break down and require a restart again
- Occurs in both AP and router mode
- 2.1.4.16 is relatively stable
My guess - this has something to do with messed up / out of sync ARP tables between R1, S1, S2
Such a shame actually - Orbi is a brilliant system otherwise.
It's just appalling though that a fundamental networking bug with LAN switching would not be resolved in production firmware even after years
31 Replies
This topic appears to expose my lack of sophistication in IPv4 networking. My understanding is that every device on a LAN maintains its own ARP table. When one device wants to ping another device, it looks in its own ARP table to find the MAC address, and the device receiving the ICMP request likewise looks in its own ARP table to find the MAC address that it should respond to. I do not see how any switches are involved in the process.
If the phenomenon exists when Orbi is in both router and access point mode, there must be another router present.
My first step would be to use the debug page to capture LAN traffic and see what it shows in terms of ARP and ICMP between these devices.
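The per-device ARP lookup described above can be sketched as a toy model. This is only an illustration of the idea that each host keeps its own IP-to-MAC cache; the `Host` class, the `lan` dictionary, and the addresses are hypothetical and do not reflect Orbi internals.

```python
from typing import Dict, Optional


class Host:
    """A LAN host that keeps its own ARP cache (IP -> MAC)."""

    def __init__(self, ip: str, mac: str) -> None:
        self.ip = ip
        self.mac = mac
        self.arp_table: Dict[str, str] = {}

    def resolve(self, target_ip: str, lan: Dict[str, "Host"]) -> Optional[str]:
        # 1. Use the cached entry if there is one.
        if target_ip in self.arp_table:
            return self.arp_table[target_ip]
        # 2. Otherwise "broadcast" a request; only the owner of the IP replies.
        owner = lan.get(target_ip)
        if owner is None:
            return None  # nobody answered
        self.arp_table[target_ip] = owner.mac
        # The target also learns the requester's mapping from the request.
        owner.arp_table[self.ip] = self.mac
        return owner.mac


# Hypothetical addresses, for illustration only.
phone = Host("192.168.1.20", "aa:aa:aa:aa:aa:01")
bulb = Host("192.168.1.30", "aa:aa:aa:aa:aa:02")
lan = {h.ip: h for h in (phone, bulb)}
bulb_mac = phone.resolve(bulb.ip, lan)
```

In a real network the broadcast request and the unicast reply still have to cross every bridge between the two hosts, which is where the satellites come in.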
- superczar (Apprentice)
I may not be a certified network engineer but I can certainly confirm that there are no other routers on the network when the Orbi is set up in router mode :)
With the Orbi in AP mode, I use an Edgerouter ER-X as the router (although I also tried a TP-Link TL-470T+ as router when troubleshooting).
Also switches do maintain either an ARP table (managed switches) or a forwarding table (unmanaged switches) so that the switch knows where to send a packet if the packet is addressed to an entity on the LAN
I don't know if the switch is a managed or unmanaged one (br0 is the internal naming convention on Orbi) but it does have a crucial role to play in routing packets correctly
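To make the forwarding-table idea concrete, here is a minimal sketch of a MAC-learning bridge. It assumes generic learning-bridge behaviour (learn the source, forward to the known port, flood when unknown); the class name, port names, and MAC labels are made up, and nothing here is taken from the actual Orbi br0 implementation.

```python
from typing import Dict, List


class LearningBridge:
    """Toy MAC-learning bridge: remembers which port each MAC was last seen on."""

    def __init__(self, ports: List[str]) -> None:
        self.ports = ports
        self.fdb: Dict[str, str] = {}  # forwarding table: MAC -> port

    def handle_frame(self, src_mac: str, dst_mac: str, in_port: str) -> List[str]:
        # Learn (or refresh) where the sender lives.
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the segment the frame came from: drop it.
            return []
        return [out_port]


# Hypothetical port names for a bridge sitting between two satellites.
bridge = LearningBridge(["sat1", "sat2", "uplink"])
first = bridge.handle_frame("bulb", "phone", "sat1")   # phone unknown: flood
reply = bridge.handle_frame("phone", "bulb", "sat2")   # bulb known: sat1 only

# If the entry for the bulb ever pointed at the wrong port, frames for it
# would be sent there and never reach it, while Internet traffic via the
# uplink would keep working.
bridge.fdb["bulb"] = "uplink"
misdirected = bridge.handle_frame("phone", "bulb", "sat2")
```

A wrong or stale entry of this kind would match the symptom (LAN-to-LAN traffic fails, Internet works), but that is only a guess about the cause.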
Let's take an example of the Orbi in AP mode with:
Satellite 1 - Sat1
Satellite 2 - Sat2
Orbi Router (in AP mode) - Orbi1
Router (upstream from Orbi1) - Router
Say I have a WiFi bulb (LIFX_1) connected to Sat1 WiFi and a phone (iP_1) connected to Sat2 WiFi.
The LIFX app will try to send packets to and receive packets from the bulb, so I think the route the packets take is as follows:
LAN route (i.e. the one prone to failure)
iP_1 -> Sat2 br0 -> Orbi1 br0 -> Sat1 br0 -> LIFX_1 (and vice versa)
while if I try to open a webpage on iP_1 then:
WAN route (works fine)
iP_1 -> Sat2 br0 -> Orbi1 br0 -> Router
The latter works fine but the former is what is prone to issues.
To extend further, if I have another phone (say iP_2) connected to Sat2, I think the path would be
LAN route (works fine)
iP_2 -> Sat2 br0 -> LIFX_1 (and vice versa)
WAN route (works fine)
iP_2 -> Sat2 br0 -> Orbi1 br0 -> Router
In this scenario, both work fine.
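The first scenario (bulb on Sat1, iP_1 on Sat2) can be written down as a small tree to show which bridges each kind of traffic has to cross. This is a sketch under the assumption that the topology is a simple tree with Orbi1 between the satellites and the upstream router; the `PARENT` map and `path` helper are illustrative only.

```python
from typing import Dict, List

# Assumed tree: each node maps to the node one hop closer to the router.
PARENT: Dict[str, str] = {
    "Orbi1": "Router",
    "Sat1": "Orbi1",
    "Sat2": "Orbi1",
    "LIFX_1": "Sat1",
    "iP_1": "Sat2",
}


def chain_to_root(node: str) -> List[str]:
    """Return the node followed by all of its ancestors up to the root."""
    out = [node]
    while node in PARENT:
        node = PARENT[node]
        out.append(node)
    return out


def path(src: str, dst: str) -> List[str]:
    """Hop-by-hop path between two nodes via their lowest common ancestor."""
    up = chain_to_root(src)
    down = chain_to_root(dst)
    common = next(n for n in up if n in down)
    return up[: up.index(common) + 1] + list(reversed(down[: down.index(common)]))


lan_route = path("iP_1", "LIFX_1")
wan_route = path("iP_1", "Router")
print(" -> ".join(lan_route))
print(" -> ".join(wan_route))
```

The only hop that the failing LAN route has and the working WAN route lacks is the leg from Orbi1 down to Sat1, which may help narrow down where frames are being lost.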
Just to add, the route in red (the failure-prone LAN route) works fine after a restart.
It's just that at some point in time, it will start to fail. The time duration could be minutes or hours (with the new firmware) or days to weeks (with 2.1.4.16), but it will fail.
What's funny is that this won't happen if I were to replace the Orbis with $20 equivalent routers set up as APs.
PS: The debug page will allow me to enable capture for WAN-LAN traffic (which works fine at all times).
The issue is with LAN-LAN traffic
Also adding another thread detailing the same issue
https://community.netgear.com/t5/Orbi/Random-ARP-Problems-w-WiFi-nodes/m-p/1799406
and the rest:
https://community.netgear.com/t5/Orbi/Serious-Satellite-Connectivity-Bug/td-p/1303604
https://community.netgear.com/t5/Orbi/Orbi-loosing-part-of-the-network-NOT-Internet/m-p/1298723
Thanks for the thorough explanation. It was the role of the other router that eluded me. ("Here" vs. "Absent")
I agree that every device with a brain maintains an ARP table, including the router and satellites, and that unmanaged switches (having no "brain") maintain only a MAC address forwarding table.
Thanks also for the tip about the LAN/WAN packet capture. Looking more closely, it appears that the Orbi captures only packets that appear at the LAN or WAN interface, so a packet that remains entirely within the Orbi LAN is not captured. Any packet that contains the Orbi MAC address (either source or destination) or a broadcast address is captured. This makes sense. If the switch module has directed a packet out a specific LAN port, the actual "Orbi LAN" will never see it.
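The capture rule as observed above can be expressed as a simple predicate. This encodes only the behaviour described in this post, not documented Orbi behaviour; the function name and the example MAC addresses are hypothetical.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def would_be_captured(src_mac: str, dst_mac: str, orbi_mac: str) -> bool:
    """True if the frame involves the Orbi itself or is a broadcast."""
    src, dst, orbi = src_mac.lower(), dst_mac.lower(), orbi_mac.lower()
    return orbi in (src, dst) or dst == BROADCAST


# Hypothetical MAC addresses, for illustration only.
ORBI = "a0:00:00:00:00:01"
PHONE = "b0:00:00:00:00:02"
BULB = "c0:00:00:00:00:03"

arp_request_seen = would_be_captured(PHONE, BROADCAST, ORBI)  # broadcast
phone_to_bulb_seen = would_be_captured(PHONE, BULB, ORBI)     # unicast LAN-LAN
```

Under this rule a broadcast ARP request would show up in the capture, but the unicast ARP reply and the ICMP traffic between two LAN devices would not.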
This seems to identify the Orbi LAN module as the critical piece in this. If the flaw is in the hardware module itself, it might not be "fixable".
My first thought was, "Aha. The MAC forwarding table is overflowing." but that just seems ridiculous. We had problems 25 years ago when our corporate network outgrew the tables in our early switches, but today? Surely whatever module this is can handle the number of MAC addresses in a typical residential installation.
Are the two satellites linked to the router over WiFi or ethernet?
What is the Mfr and model# of the ethernet switch in the configuration?
- superczar (Apprentice)
So another update on this perpetual problem:
What is mostly stable: leaving all devices on 2.1.4.16 with updates / Internet access blocked at the firewall.
It's usually weeks (or a power outage) before this situation recurs.
At each major firmware iteration, I update to see if it solves the issue but alas, it only makes it worse.
Anyway, after the latest firmware (2.5.1.32 (?)), I had to take the RESET route to do a clean revert to 2.1.4.16.
A few days later, I realized the guest network was left on its default off setting so I enabled the guest network and right after, the LAN segment on the RBS50 stopped responding (Internet working as expected on all devices)
So I disabled guest network and the RBS50 devices re-appeared.
I am going to leave the guest network off to see if it truly helps or if the above was just an aberration.
The switches are unmanaged TP-Link and D-Link.
They are too dumb to be the culprit.
In any case, I have run tests with temporary CAT6 running between them (which I so wish I could keep but unfortunately cannot).