V2.1.3.4 Bridging

Starting a new thread specific to my issue. I woke up this morning to V2.1.3.4 having been installed. Not only was my wireless network down, but so was my wired network. I started troubleshooting and noticed that my OS X network monitor was showing 25-50 Mbit/s of sustained traffic. I fired up tcpdump and saw a TON of ARP and other broadcast traffic. I looked at my switch ports and they were blowing up. I started disconnecting cables one by one until I got to my Orbi, and it stopped. Before today I had made no layer 1 changes to my network.

 

Did something change in V2.1.3.4 that would cause bridging to operate differently? Maybe something related to STP to fix the Chromecast issue? My current theory is that whatever is different in this rev is causing a bad broadcast storm which is murdering the network. Rebooting the Orbis fixes it, and as long as I don't muck around with it, it's stable. But whenever I enable or disable daisy chain, the storm seems to happen.

 

I can telnet, sniff, etc. if someone wants to ask me to look at something else. 

 

Model: RBR50| Orbi AC3000 Tri-band WiFi (Router Only), RBS50| Orbi AC3000 Tri-band WiFi (Satellite Only)
Message 1 of 25

Re: V2.1.3.4 Bridging

And @FURRYe38 if you want me to try anything let me know. 

Message 2 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

OK, just asking. Was a full factory reset performed on both router and Satellites and setup from scratch yet?

 

Any network switches between the router and satellite?

What is the mfr and model# of the ISP modem you're using?

When trying to use Daisy Chain, I presume you physically have the Satellites piggybacking off each other and not wired for backhaul?

 

I would also try this. Test with the router alone. Turn off all Satellites. Then graduate by adding one Satellite via wireless. Test for issues. Then add another.

 

My Setup (Cable 1Gbps/50Mbps)>CAX80 v2.1.2.1(LAG Disabled)>RBK853 v4.6.3.16
Additional NG HW: C7800/CM1100/CM1200CM2000, Orbi CBK40, CBR750, RBK50(v22), SXR30(v110), R7000(v34), R7800(v84), R7960P(v82), EX7500/EX7700, XR450(v120) and WNHDE111
Message 3 of 25

Re: V2.1.3.4 Bridging

Yeah I factory reset the router via the web interface first. Then reset each satellite via the pinhole. Physically resynced each. 

 

Satellite----cat5----Switch----Orbi Router----Wireless Clients
                      /|
        Satellite----/ | 
                     Switch-----VMware Server---Firewall----internet
                       |
                       | 
           Other hardwired stuff (e.g. AppleTV, etc.)

Notes: Switches are D-Link DGS-1100-16/24 with default, unmanaged configs. ISP Cat5 goes right into firewall. No modem.

 

Firewall is a VMware-based Sophos XG appliance. It does DHCP and has not been changed in a long time. Not doing any fancy IPS or any other strange things to internal traffic. It's basically a "dumb" router at this point. 


Not sure what you mean by the Satellites "piggybacking" off each other. I have both satellites connected to the same ethernet switch so that they can set up the bridge between them -- it was working in the previous firmware. Do I need to directly connect (e.g. via a coupler/barrel connector) each Orbi to make ethernet backhaul work?

 

Thanks!

 

Edit: tiny edit to better reflect network topo. 

 

Message 4 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

You might try changing the configuration to daisy chain the 2nd Satellite off the first one, instead of both connected to the same switch and see if this changes anything. Don't enable Daisy Chain on the router. Just see if adding the 2nd Sat to the back of the 1st Sat does anything different. I would also just as a quick test, take the switch out of the mix and directly connect both to the Orbi router. See if anything seen is different.

 

I have a similar network configuration. There are 2 unmanaged switches between my 1 satellite and the router. I don't have a 2nd Sat at this time so I can't test this with you.

 

Are these your switches?

http://us.dlink.com/products/business-solutions/dgs-1100-16/

If so, these are managed switches...

 

Message 5 of 25

Re: V2.1.3.4 Bridging

I'll give that a try -- although this is one of those things where (this time!) I didn't change anything. There was a change within Orbi that made bridging different. I just am not sure what changed. 

 

And yes, those are managed switches -- but like I said the configs are both factory default with no changes. 

 

Thanks for the help!

Message 6 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

Something to narrow down, go direct with the Satellites to test with the router.

 

Even with default settings, there may be differences between these managed switches and non-managed switches that could be causing problems with the newer FW. It does seem to be a change in FW though.

 

Is the Orbi running in router mode or AP mode?

 

@AmitR and @DarrenM

 

Message 7 of 25

Re: V2.1.3.4 Bridging

Just connected one satellite directly to a 2nd LAN port on the router. By the time I got to the satellite it had a blue ring. Looked at the router web UI and it still says the satellite is backhauled to the router via 5G. Didn't reboot anything, but ran a speed test via the now hard-wired satellite and saw 700+ Mbit/s, which means the web UI must be out of date, because I wouldn't see that speed from that satellite if it were backhauled via 5G. Going to do the same with the other wired satellite.

 

So this seems to indicate that something changed with the bridging where the satellites must be directly connected and not via the switch (since it will cause a storm). I do notice that spanning tree protocol is enabled on the router bridge (brctl show) but not on the satellite bridge. I'm not sure if that has something to do with it, but the fact that the firmware update introduced a loop into my network sure makes me think that's a good clue. Hope this helps. 

Message 8 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

You need to give the router time, or clear browser caches and refresh, for the backhaul to appear correctly. I rebooted the router, then cleared the browser cache, exited the browser, then reopened it and logged into the router's web page.

 

Add the 2nd Satellite direct to the router then try daisy chain behind 1st Sat.

 

I'm wondering if there is a problem with your switches... Do you have an actual non-managed switch you can try? That would help narrow this down to either your managed switches or something in FW.

 

Message 9 of 25

Re: V2.1.3.4 Bridging

I get what you mean about the cached UI but I've never seen any stale UI/web issues. I tried via an incognito window and same result. Next step is to reboot stuff but right now the family is watching streaming things so I'll bounce the Orbis tomorrow.

 

I have a small truly unmanaged switch I can try but again, this all started with the firmware update when I was sleeping. I was literally woken up by alerts about stuff being offline. 


I have a few ideas of some things I can try (e.g. putting all bridged traffic on a dedicated switch). I'll let you know if I find anything. 

Message 10 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

Ok, keep us posted. I'm curious about your unmanaged switch results...

Is the Orbi running in router mode or AP mode?

Message 11 of 25

Re: V2.1.3.4 Bridging

Orbi is running in AP mode. 

 

So to recap - connecting one satellite directly to the Orbi router's LAN port (not via the switch) solves the broadcast storm/loop issue. If I plug the 2nd satellite into a switch (in an attempt to get ethernet backhaul to work), it results in a serious broadcast storm that hoses my entire network.

 

So, I enabled telnet on the 2nd satellite and ran:

 

brctl stp br0 1

This enabled STP on the bridge interface and boom -- fixed. I am hard wired now with wifi off:

~ speedtest
Retrieving speedtest.net configuration...
Testing from Verizon Fios (x.x.x.x)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by XXXXX (XXXXXXX, XX) [12.01 km]: 7.488 ms
Testing download speed................................................................................
Download: 706.85 Mbit/s
Testing upload speed................................................................................................
Upload: 144.08 Mbit/s

Based on this I REALLY think that this is due to STP being disabled in V2.1.3.4. I'm going to look to see how to make STP persist through reboots and call it a day. Thanks for your help @FURRYe38

 

Message 12 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

I would pass this on to NG support as well to make sure they review this:

@AmitR  and @DarrenM

 

Thanks for posting your detailed info on this.

 

 

 



 


 

Message 13 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

@packetwerks

It's possible there is some kind of configuration on your D-Link DGS managed switches that could be causing this.

Were you able to test a fully non-managed switch? If so, what were the results? What is the Mfr and model# of this non-managed switch?

Message 14 of 25

Re: V2.1.3.4 Bridging

I strongly doubt it, simply because the switches have not been touched in over a year, the storm happened right after V2.1.3.4 was automatically installed, and enabling STP (which exists for exactly this reason) addresses the issue. It's not the managed switch doing anything. This is a classic bridge loop problem.

 

The problem is that the satellite is a bridge that has two paths (wired and wireless backhaul) to the network, which introduces the loop. This is a classic application for STP. STP tells the bridge to send the traffic via one path, not both. It does this by evaluating the "cost" (based on port speed) of the paths. This is what my satellite looks like without STP enabled (it's directly connected to the router, and the router has STP enabled):

 

root@RBS50:/# brctl showstp br0
br0
 bridge id              8000.9c3dcfe58e21
 designated root        8000.9c3dcfe58e21
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay             0.00                 bridge forward delay       0.00
 ageing time              60.00
 hello timer               0.84                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  37.93
 flags

Note that the "root port" is 0. That's because STP is off, so no port has been designated as the one that traffic toward the root should use.
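If anyone wants to check this on their own satellite without eyeballing the output, here is a minimal Python sketch that pulls the bridge-level fields out of `brctl showstp` text. It assumes the output layout matches what is pasted above; the function names are my own and not part of any Orbi or bridge-utils tooling.

```python
import re

# Bridge-level lines copied from the `brctl showstp br0` output above.
SAMPLE = """\
br0
 bridge id              8000.9c3dcfe58e21
 designated root        8000.9c3dcfe58e21
 root port                 0                    path cost                  0
"""


def parse_bridge_header(text):
    """Extract bridge id, designated root, root port and path cost.

    Assumes the layout printed by `brctl showstp`; only the first
    occurrence of each field (the bridge-level block) is used.
    """
    fields = {}
    for key in ("bridge id", "designated root"):
        match = re.search(rf"^\s*{key}\s+(\S+)", text, re.MULTILINE)
        if match:
            fields[key] = match.group(1)
    match = re.search(
        r"^\s*root port\s+(\d+)\s+path cost\s+(\d+)", text, re.MULTILINE
    )
    if match:
        fields["root port"] = int(match.group(1))
        fields["path cost"] = int(match.group(2))
    return fields


def has_root_port(fields):
    """True when the bridge has picked a port toward some other root bridge."""
    return fields.get("root port", 0) != 0


def thinks_it_is_root(fields):
    """A bridge that lists itself as designated root has no upstream root."""
    return fields.get("bridge id") == fields.get("designated root")


if __name__ == "__main__":
    info = parse_bridge_header(SAMPLE)
    print(info, has_root_port(info), thinks_it_is_root(info))
```

With the output above, the satellite reports no root port and lists itself as the designated root, which is what you would expect from a bridge that is not participating in STP.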

 

Here is 'brctl showstp br0' on the satellite where I manually enabled STP:

root@RBS50:~# brctl showstp br0
br0
 bridge id              8000.9c3dcfe5a281
 designated root        8000.9c3dcfe53026
 root port                 1                    path cost                  4
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay             2.00                 bridge forward delay       2.00
 ageing time              60.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                   2.68
 flags

You'll see that the root port is 1. What is port 1? It's eth1:

eth1 (1)
 port id                8001                    state                forwarding
 designated root        8000.9c3dcfe53036       path cost                  4
 designated bridge      8000.9c3dcfe53036       message age timer         18.19
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

eth1 is the ethernet backhaul connection to my switch. It also has a cost of 4 (the wireless backhaul connection has a cost of 100). So all bridge traffic will ONLY go to port 1 and not via wireless backhaul avoiding the loop/storm. This is STP in action as it should be. 
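To make the cost comparison concrete, here is a rough Python sketch of the decision, not the actual STP state machine. It only models "lowest path cost wins" with port number as a tie-break, and it treats the losing uplink as blocked, which is a simplification of real STP (the full comparison also involves bridge and port IDs from received BPDUs). The name `wifi-backhaul` and its port number 2 are hypothetical; I only know eth1 is port 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uplink:
    name: str        # interface name (wifi-backhaul is a made-up label)
    path_cost: int   # STP path cost toward the root bridge
    port_no: int     # bridge port number, used here only as a tie-break


def pick_root_port(uplinks):
    """Simplified election: lowest path cost wins, then lowest port number."""
    return min(uplinks, key=lambda u: (u.path_cost, u.port_no))


def port_states(uplinks):
    """Forward on the elected root port, block the other paths to the root."""
    root = pick_root_port(uplinks)
    return {
        u.name: "forwarding" if u == root else "blocking" for u in uplinks
    }


# Costs taken from the post: wired backhaul 4, wireless backhaul 100.
UPLINKS = [
    Uplink(name="eth1", path_cost=4, port_no=1),
    Uplink(name="wifi-backhaul", path_cost=100, port_no=2),
]

if __name__ == "__main__":
    print(pick_root_port(UPLINKS).name)
    print(port_states(UPLINKS))
```

With only one of the two backhaul paths forwarding, a broadcast frame can no longer circle between the wired and wireless links.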

Message 15 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

Would still be interested to know if testing exhibits the same problem with a non managed switch.

 

Message 16 of 25

Re: V2.1.3.4 Bridging

Ok, just used a Netgear 5 port 10/100/1000M Switch GS605 v2 unmanaged switch and saw the same behavior. Here's what I did:

 

1. Moved all in-scope ethernet cables to the GS605 switch and saw green link lights on everything. 

2. Plugged MacBook into the Orbi satellite, started tcpdump, and turned off wireless

3. Rebooted Orbi router and satellite, and waited for everything to come back up 

4. Noticed a spike in traffic (5 Mbit/s) and observed a huge amount of broadcast traffic on my MacBook's ethernet. Another server I have with SNMP monitoring also saw 5 Mbit/s of traffic.

5. Telnet'd into the satellite and ran 'brctl stp br0 1' and saw broadcast traffic back to normal -- which took a while because the network was getting murdered by the flood. 
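For anyone who wants to put a number on "huge amount of broadcast traffic" in steps 4 and 5, here is a small Python sketch that works on a list of destination MAC addresses (for example, copied out of a capture by hand). The helper names and the sample addresses are illustrative assumptions, not something tcpdump produces in this form.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def normalize(mac):
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def is_broadcast(mac):
    """True for the all-ones Ethernet broadcast address."""
    return normalize(mac) == BROADCAST


def is_multicast(mac):
    """True when the group bit (lowest bit of the first octet) is set.

    The broadcast address also has this bit set.
    """
    first_octet = int(normalize(mac).split(":")[0], 16)
    return bool(first_octet & 0x01)


def broadcast_share(dest_macs):
    """Fraction of frames whose destination is the broadcast address."""
    if not dest_macs:
        return 0.0
    hits = sum(1 for mac in dest_macs if is_broadcast(mac))
    return hits / len(dest_macs)


# Made-up destinations: three broadcasts and one unicast frame.
SAMPLE_DESTS = [
    "ff:ff:ff:ff:ff:ff",
    "FF:FF:FF:FF:FF:FF",
    "ff-ff-ff-ff-ff-ff",
    "9c:3d:cf:e5:8e:21",
]

if __name__ == "__main__":
    print(broadcast_share(SAMPLE_DESTS))
```

Comparing the share before and after running 'brctl stp br0 1' gives a simple before/after figure rather than relying on how the capture scrolls by.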

 

Hope this helps!

 

 

Message 17 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

Ok, great info.

 

Now I wonder if you revert back to prior version of FW, does this still happen? Which FW version had you been using before for reference?

 

Message 18 of 25

Re: V2.1.3.4 Bridging

I had the latest version prior to v2.1.3.4 as it automatically upgraded. Whatever that was. 

 

I'm not going to revert back a rev to troubleshoot this more -- this is now NETGEAR's problem. Someone from NETGEAR can easily figure it out by looking at their source code repo to see if STP was enabled on the bridge interface (br0) prior to 2.1.3.4. I'm (obviously) willing to tinker and help but I think I've done enough. Surely NETGEAR has a testing lab where they can do this in minutes.

 

My only hope is that this issue somehow floats up from the community message boards to a developer who knows exactly what I'm talking about here. The fact that an ethernet backhaul loop is trashing customers' networks because STP was seemingly left off is a pretty bad screwup. This type of issue isn't something your front-line support folks are going to be able to effectively troubleshoot. If you are a NETGEAR employee reading this, pop this thread into Slack and ask a dev to take a peek at this.

 

STP might have been disabled for some other reason that I'm not aware of - I don't have all of NETGEAR's use cases but this sure seems like a bad bug. There is a Reddit thread here: https://www.reddit.com/r/orbi/comments/88zy2y/v2134_and_ethernet_backhaul/ and other people are impacted by this. 

Message 19 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

I would send a PM to @AmitR and @DarrenM and link this post. Help let them know about this. 

 

Message 20 of 25

Re: V2.1.3.4 Bridging

Done

Message 21 of 25
FURRYe38
Guru

Re: V2.1.3.4 Bridging

I've posted this as well. Hopefully will see something come about.

 

You might ask them about v2.2.x.x Beta FW. I see one post over on SNB that one user said it resolved this backhaul issue.

 

Message 22 of 25
Foritus
Aspirant

Re: V2.1.3.4 Bridging

I'm also seeing the exact same behaviour. I left for a business trip and my home network totally imploded the day after I left due to Orbi auto updating and then broadcast storming. Is there any ETA on a fix? :(

 

I use an unmanaged central switch: TP-LINK TL-SG1024D 24-Port.

Model: RBR50| Orbi AC3000 Tri-band WiFi (Router Only), RBS50| Orbi AC3000 Tri-band WiFi (Satellite Only)
Message 23 of 25
DarrenM
Sr. NETGEAR Moderator

Re: V2.1.3.4 Bridging

Hello Foritus

 

What firmware version are you on?

 

DarrenM

Message 24 of 25
Foritus
Aspirant

Re: V2.1.3.4 Bridging

Hello!

 

I was on V2.1.3.4 at the time of posting. I think my Orbi updated last night, so I'll give it a try with 2.1.4.10 tonight.

Message 25 of 25