
M4300 stacking problems

Greg_11
Aspirant

I'm trying to create an 8 switch spine and leaf stack with two XSM4324S (M4300-12X12F) and six GSM4352PB-100NES (M4300-52G-PoE+) and I'm having some major issues. The stack created fine with 7 of the switches, but when I added the 8th, it added to the stack, synchronized its firmware as expected, rebooted, displayed the appropriate stack unit number, and appeared as OK in the stack configuration page. However, aside from the stack ports, the rest of the ports on the switch were dead. In the device view, the stack ports were green, but the rest were greyed out. At first I thought it was a switch problem, so I swapped in a spare. Same thing. I tried new optics, same thing.

 

So far I've tried at least half a dozen SFP+ modules, and three M4300-52G-PoE+ switches. I've tried leaving the switch configured in the stack, and also deleting it. I've factory reset the switch (not the stack) from the boot menu (option 6) which works fine, but once added to the stack, same thing. If I reload the switch that's not working without the stack ports connected, it comes up fine and forwards packets, but now it's the master, along with the other master, and I get MAC/IP conflicts unless I factory reset it first.

 

At one point both spine switches got powered down, so every leaf became a stack master. When the spines came back, I now had six stack masters connected to two spine switches, each now with the same management IP, MAC, etc. The only way I could get it back to a working state was to physically power down all the switches, and bring up the spines first, and then the leaves. When it came back up, one of the leaves which was working just fine was now in the state the 8th member kept getting in with the stack ports up, but the rest down. I've tried all of the steps I tried with the 8th switch on this other previous stack member, and it still isn't working. I can get it added back to the stack just fine, but then it doesn't forward traffic.

 

The currently six switch stack has production traffic on it which I would prefer to not take down, but if I have to have yet another maintenance period, I will.

 

The buffered logs when switch 7 is part of the stack but isn't forwarding traffic (repeats):

 

<15> Mar 3 18:24:01 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114525 %% Destination unit 7 is not ready

<13> Mar 3 18:24:01 MD-SW-ACCESS-1 CARDMGR[cmgrInsertTask]: cardmgr_util.c(521) 114524 %% Card insert failed for unit/slot 7/0, wait for retry

<15> Mar 3 18:24:01 MD-SW-ACCESS-1 DRIVER[cmgrInsertTask]: broad_init.c(2998) 114523 %% Driver: Card insertion for unit 7, slot 0 failed. Wait for retry.

<12> Mar 3 18:24:01 MD-SW-ACCESS-1 DRIVER[cmgrInsertTask]: l7_usl_sm.c(2094) 114522 %% USL: Cold start on unit 7 failed, error code -4

<11> Mar 3 18:24:01 MD-SW-ACCESS-7 DRIVER[unitMgrTask]: l7_usl_bcm.c(1828) 114521 %% Db TRUNK(8) sync failed, error code -4 elem Trunk-id 0 App-id 1 PSC=0, DLF Index=0, MC Index=0, IPMC Index=0, LocalPref=0 Member ports in the Trunk():

<15> Mar 3 18:24:00 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114520 %% Destination unit 7 is not ready

<15> Mar 3 18:23:59 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114519 %% Destination unit 7 is not ready

<15> Mar 3 18:23:58 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114518 %% Destination unit 7 is not ready

<15> Mar 3 18:23:57 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114517 %% Destination unit 7 is not ready

<15> Mar 3 18:23:56 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114516 %% Destination unit 7 is not ready

<15> Mar 3 18:23:55 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114515 %% Destination unit 7 is not ready

<15> Mar 3 18:23:54 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114514 %% Destination unit 7 is not ready

<15> Mar 3 18:23:53 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114513 %% Destination unit 7 is not ready

<15> Mar 3 18:23:52 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114512 %% Destination unit 7 is not ready

<15> Mar 3 18:23:51 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114511 %% Destination unit 7 is not ready

<15> Mar 3 18:23:50 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114510 %% Destination unit 7 is not ready

<15> Mar 3 18:23:49 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114509 %% Destination unit 7 is not ready

<15> Mar 3 18:23:48 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114508 %% Destination unit 7 is not ready

<15> Mar 3 18:23:47 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114507 %% Destination unit 7 is not ready

<15> Mar 3 18:23:46 MD-SW-ACCESS-1 DRIVER[osapiTimer]: broad_hpc_stacking.c(608) 114506 %% Destination unit 7 is not ready

<13> Mar 3 18:23:46 MD-SW-ACCESS-1 CARDMGR[cmgrInsertTask]: cardmgr_util.c(521) 114505 %% Card insert failed for unit/slot 7/0, wait for retry

<11> Mar 3 18:23:46 MD-SW-ACCESS-7 DRIVER[unitMgrTask]: l7_usl_bcm.c(1828) 114504 %% Db TRUNK(8) sync failed, error code -4 elem Trunk-id 0 App-id 1 PSC=0, DLF Index=0, MC Index=0, IPMC Index=0, LocalPref=0 Member ports in the Trunk():

<15> Mar 3 18:23:46 MD-SW-ACCESS-1 DRIVER[cmgrInsertTask]: broad_init.c(2998) 114503 %% Driver: Card insertion for unit 7, slot 0 failed. Wait for retry.
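For anyone reading a longer capture of this buffered log, a small script can show which messages repeat and which unit reports them. The following is only a sketch: the line shape in `LOG_RE` is inferred from the entries quoted above and is not taken from any NETGEAR documentation, and the names `parse_entries` and `summarize` are made up for illustration. The sample reuses three entries from the log.

```python
import re
from collections import Counter

# Assumed shape, inferred only from the entries quoted in this post:
# <pri> Mon D HH:MM:SS HOST COMPONENT[task]: file.c(line) seq %% message
LOG_RE = re.compile(
    r"<(?P<pri>\d+)>\s+(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+"
    r"(?P<component>\w+)\[(?P<task>\w+)\]:\s+(?P<src>\S+\(\d+\))\s+"
    r"(?P<seq>\d+)\s+%%\s+(?P<msg>.*)"
)


def parse_entries(text):
    """Split run-together entries at each <pri> marker and parse them."""
    entries = []
    for chunk in re.split(r"(?=<\d+>\s)", text):
        # Collapse line wraps and repeated spaces before matching.
        match = LOG_RE.match(" ".join(chunk.split()))
        if match:
            entries.append(match.groupdict())
    return entries


def summarize(entries):
    """Count entries per (reporting host, source location)."""
    return Counter((e["host"], e["src"]) for e in entries)


sample = (
    "<15> Mar 3 18:24:01 MD-SW-ACCESS-1 DRIVER[osapiTimer]: "
    "broad_hpc_stacking.c(608) 114525 %% Destination unit 7 is not ready "
    "<13> Mar 3 18:24:01 MD-SW-ACCESS-1 CARDMGR[cmgrInsertTask]: "
    "cardmgr_util.c(521) 114524 %% Card insert failed for unit/slot 7/0, wait for retry "
    "<15> Mar 3 18:24:00 MD-SW-ACCESS-1 DRIVER[osapiTimer]: "
    "broad_hpc_stacking.c(608) 114520 %% Destination unit 7 is not ready"
)

entries = parse_entries(sample)
counts = summarize(entries)
for (host, src), n in counts.most_common():
    print(host, src, n)
```

Grouping by host and source location makes it easier to separate the messages logged by the stack master (MD-SW-ACCESS-1) from the ones logged by unit 7 itself.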

 

 Stack Port Diagnostics for the ports used to stack the switch in question:

1/0/3 RBYT:2720 RPKT:34 TBYT:2792 TPKT:35 RFCS:0 RFRG:0 RJBR:0 RUND:1 RUNT:1 TFCS:0 TERR:0

2/0/3 RBYT:2720 RPKT:34 TBYT:2792 TPKT:35 RFCS:0 RFRG:0 RJBR:0 RUND:1 RUNT:1 TFCS:0 TERR:0

7/0/51 RBYT:113018091 RPKT:232461 TBYT:90651015 TPKT:126899 RFCS:0 RFRG:0 RJBR:0 RUND:0 RUNT:0 TFCS:0 TERR:0

7/0/52 RBYT:82228630 RPKT:220577 TBYT:57840722 TPKT:197901 RFCS:0 RFRG:0 RJBR:0 RUND:0 RUNT:0 TFCS:0 TERR:0
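The counter lines above can be compared more easily once they are parsed. This is a minimal sketch that assumes every line is a port name followed by `NAME:VALUE` pairs, as in the output quoted here. Treating `RBYT`, `RPKT`, `TBYT`, and `TPKT` as plain byte and packet totals is an assumption based on their names; the helper names are hypothetical.

```python
def parse_counter_line(line):
    """Turn 'PORT NAME:VALUE NAME:VALUE ...' into (port, {name: int})."""
    port, *pairs = line.split()
    counters = {}
    for pair in pairs:
        name, _, value = pair.partition(":")
        counters[name] = int(value)
    return port, counters


def nonzero_counters(counters, ignore=("RBYT", "RPKT", "TBYT", "TPKT")):
    """Return every counter outside the assumed totals that is not zero."""
    return {k: v for k, v in counters.items() if k not in ignore and v != 0}


# The four lines quoted in this post.
lines = [
    "1/0/3 RBYT:2720 RPKT:34 TBYT:2792 TPKT:35 RFCS:0 RFRG:0 RJBR:0 RUND:1 RUNT:1 TFCS:0 TERR:0",
    "2/0/3 RBYT:2720 RPKT:34 TBYT:2792 TPKT:35 RFCS:0 RFRG:0 RJBR:0 RUND:1 RUNT:1 TFCS:0 TERR:0",
    "7/0/51 RBYT:113018091 RPKT:232461 TBYT:90651015 TPKT:126899 RFCS:0 RFRG:0 RJBR:0 RUND:0 RUNT:0 TFCS:0 TERR:0",
    "7/0/52 RBYT:82228630 RPKT:220577 TBYT:57840722 TPKT:197901 RFCS:0 RFRG:0 RJBR:0 RUND:0 RUNT:0 TFCS:0 TERR:0",
]

ports = dict(parse_counter_line(line) for line in lines)
for port, counters in ports.items():
    print(port, counters["RPKT"], counters["TPKT"], nonzero_counters(counters))
```

Printing the packet totals next to the non-zero counters puts the two spine-side ports and the two ports on unit 7 side by side for comparison.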

 

Note: I am unable to add either model number of these switches to the "Model" field of this post. Mods, please update your tables.

Message 1 of 8
DanielZhang
NETGEAR Expert

Re: M4300 stacking problems

Hi Greg,

 

Welcome to the NETGEAR community!

 

 

Could you share a simple topology of the spine and leaf stack (which stack ports connect the spines and the leaves)?

Which SFP+ modules did you use for stacking (or DAC cables)?

Did you set a high stack priority on both spine switches? (That keeps the two M4300-12X12F switches as master and standby at all times.)

 

Please also send us the diagnostic information as described at the following link:

http://kb.netgear.com/app/answers/detail/a_id/31439

 

 

Message 2 of 8
Greg_11
Aspirant

Re: M4300 stacking problems

Hi again Daniel,

 

The stack ports of the two 12x12f switches are set to Management and OprStandby with priority 15 and 14 respectively. All of the PoE switches have undefined priority and are Stack Members. The switches are connected with either DAC, CAT6 (to stack ports), SMF, or MMF. The DACs and all transceivers are NETGEAR compatible from FiberStore. Diagram below. The switches that aren't stacking properly are Access-7 and Access-8. I'll get the diagnostic files uploaded shortly. Access-8 is not currently connected to the stack as I need the devices connected to it uplinked to our network. Currently the access stack shown is uplinked to our existing core switch (not shown) during transition.

Diagram.png

Message 3 of 8
Greg_11
Aspirant

Re: M4300 stacking problems

I just noticed: on the diagram, the uplinks from the CORE stack to the firewall are only 1G, not 10G, even though the legend says everything is 10G.

Message 4 of 8
DanielZhang
NETGEAR Expert

Re: M4300 stacking problems

Hi Greg,

 

I suggest upgrading the M4300 to the latest firmware before we continue troubleshooting this case.

You can find the firmware at the following link:

M4300 Firmware Version 12.0.2.10

 

I also have a question and a suggestion for you.

1.  The link between stack_1 and stack_2 should use ports in Ethernet mode, not Stack mode.

     (I can't see the LAG configuration on the master/standby.)

2.  Could you share the tech support file from stack_2 with us?

 

Message 5 of 8
Greg_11
Aspirant

Re: M4300 stacking problems

I'll see if I can get the firmware updated tonight. The LAG ports are all ethernet, and right now only one is connected and it isn't configured as a LAG. They were, but it kept going down. One side would report "D-Disabled" and I couldn't fix it, so I broke the LAG and am just using one uplink right now. I was going to save that problem for another post. I can still upload the tech support for the other stack though when I get into the office in a couple of hours.

Message 6 of 8
DanielZhang
NETGEAR Expert

Re: M4300 stacking problems

Hi Greg,

 

It looks like an STP BPDU storm occurred on LAG 1 of Stack_1 in this topology.

 

lag 1            D-Disable                       Down   Disable N/A    Disable
lag 2            Enable                          Up     Enable  N/A    Disable
lag 3            Enable                          Up     Disable N/A    Disable
lag 4            Enable                          Up     Disable N/A    Disable

A port that receives too many STP BPDU packets is shut down and shown as D-Disable.

I suggest first checking the links for a possible loop:

1. Check the links of Stack_2.

2. Check the links of Stack_1.

3. Check the links between Stack_1 and Stack_2.

If the error still appears after the loop is removed, the best way to resolve your concern is to open an online support case here covering the stack configuration and PVSTP design.

 

 

loop.png

 

loop_lag1.png

Message 7 of 8
Greg_11
Aspirant

Re: M4300 stacking problems

I definitely noticed the BPDU storm and corrected it by removing the LAG between the CORE and ACCESS stacks until the stacking issue is fixed. There is currently only a single uplink between the stacks, which seems to be working just fine. As I said previously, the LAG isn't what I'm currently having an issue with or why I created this thread. My problem is the stacking issue where switches added to the stack will not forward traffic once they're stack members. The 1G ports are greyed out in device view. No activity lights come on on the switch except for the stack ports.

 

The stacking problem and the LAG problem are independent problems. The LAG issue was a configuration error I'm sure, but I don't think the stacking problem is. I received a PM from AlexPe regarding the stacking issue on Thursday with an internal ticket number and questions regarding the TS file he had been provided, but it was for the CORE stack, which I'm not having any stacking issues with. I provided him with the correct TS file from the ACCESS stack and additional detailed information, but have not heard back yet.

Message 8 of 8