jmozdzen
Mar 23, 2018 (Tutor)
M4300-24x: spontaneous reboots despite latest firmware
Hi *, we're running a (until today) single M4300-24x 10G switch, with the latest firmware (M4300-24X ProSAFE 20-port 10GBASE-T and 4-port 10G combo, 12.0.2.20, 1.0.0.9). Both with earlier and current ...
jmozdzen
Sep 18, 2018 (Tutor)
So we've had a number of spontaneous reboots again over the last weeks/months. As we now have a redundant stack, operations aren't affected that much anymore and we usually only find out by looking at the syslog messages.
Interestingly, it's always the same switch that is rebooting, stack switch #1, no matter if it is the "active unit" (after power on of the complete stack) or the backup unit (after the first spontaneous reboot of this switch #1, which makes stack switch #2 the "active unit").
As I had a configuration issue clobbering the log with its messages, I wanted to clean that up first before reporting back. I now have a clean log; here's what I found for the most recent reboot:
--- cut here ---
Sep 13 21:27:32 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045333 %% Stack-port link down: Index: 217 Unit: 2 Tag: 0/24
Sep 13 21:27:33 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045335 %% Stack-port link down: Index: 216 Unit: 2 Tag: 0/23
Sep 13 21:27:34 s-22455-02-05 CKPT[tCkptSvc]: ckpt_task.c(363) 1045336 %% Checkpoint message transmission to unit 1 failed for LLDP(85).
Sep 13 21:27:34 s-22455-02-05 CKPT[tCkptSvc]: ckpt_task.c(363) 1045337 %% Checkpoint message transmission to unit 1 failed for LLDP(85).
Sep 13 21:27:34 s-22455-02-05 CKPT[tCkptSvc]: ckpt_task.c(363) 1045338 %% Checkpoint message transmission to unit 1 failed for LLDP(85).
Sep 13 21:27:39 s-22455-02-05 TRAPMGR[dot1s_task]: traputil.c(763) 1045339 %% Spanning Tree Topology Change Received: MSTID: 0 lag 21
Sep 13 21:27:40 s-22455-02-05 TRAPMGR[dot1s_task]: traputil.c(763) 1045340 %% Spanning Tree Topology Change Received: MSTID: 0 lag 21
Sep 13 21:27:42 s-22455-02-05 TRAPMGR[dot1s_task]: traputil.c(763) 1045341 %% Spanning Tree Topology Change Received: MSTID: 0 lag 21
Sep 13 21:27:46 s-22455-02-05 CKPT[tCkptSvc]: ckpt_task.c(487) 1045343 %% Backup manager removed.
Sep 13 21:27:46 s-22455-02-05 VOIP[tCkptSvc]: voip_ckpt.c(174) 1045344 %% Backup unit gone
Sep 13 21:27:46 s-22455-02-05 UNITMGR[unitMgrTask]: unitmgr.c(8116) 1045345 %% No Potential unit to configure as Standby when unit 1 left
Sep 13 21:27:52 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045347 %% Entity Database: Configuration Changed
Sep 13 21:28:16 s-22455-02-05 DRIVER[hapiL3AsyncTask]: broad_hpc_rpc.c(1041) 1045348 %% hpcHardwareRpc: RPC Timeout for transaction 6018
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045349 %% Interface 1/0/1 detached from ndesan01.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045350 %% Interface 1/0/2 detached from ndesan02.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045351 %% Interface 1/0/3 detached from ndesan03.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045352 %% Interface 1/0/4 detached from ndesan04.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045353 %% Interface 1/0/5 detached from ndemds01.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045354 %% Interface 1/0/9 detached from compute1.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045355 %% Interface 1/0/10 detached from compute2.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045356 %% Interface 1/0/11 detached from compute3.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045357 %% Interface 1/0/12 detached from compute4.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045358 %% Interface 1/0/17 detached from nde32.
Sep 13 21:28:16 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045359 %% Interface 1/0/18 detached from control1.
Sep 13 21:28:17 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(1014) 1045360 %% Interface 1/0/21 detached from s-22455-02-01.
Sep 13 21:28:22 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045362 %% Entity Database: Configuration Changed
Sep 13 21:28:39 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045364 %% Stack-port link up: Index: 216 Unit: 2 Tag: 0/23
Sep 13 21:28:39 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045366 %% Stack-port link down: Index: 216 Unit: 2 Tag: 0/23
Sep 13 21:28:39 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045368 %% Stack-port link up: Index: 217 Unit: 2 Tag: 0/24
Sep 13 21:28:39 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045370 %% Stack-port link up: Index: 216 Unit: 2 Tag: 0/23
Sep 13 21:28:40 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045372 %% Stack-port link down: Index: 216 Unit: 2 Tag: 0/23
Sep 13 21:28:40 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045374 %% Stack-port link up: Index: 216 Unit: 2 Tag: 0/23
Sep 13 21:28:56 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045387 %% Stack-port link up: Index: 116 Unit: 1 Tag: 0/23
Sep 13 21:28:56 s-22455-02-05 TRAPMGR[spmTask]: traputil.c(763) 1045388 %% Stack-port link up: Index: 117 Unit: 1 Tag: 0/24
Sep 13 21:28:59 s-22455-02-05 CKPT[tCkptSvc]: ckpt_task.c(523) 1045418 %% New backup manager selected, unit 1.
Sep 13 21:28:59 s-22455-02-05 CKPT[tCkptSvc]: ckpt_task.c(423) 1045419 %% Checkpoint operation to backup unit 1 complete.
Sep 13 21:29:06 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045421 %% Link Up: 1/0/21
Sep 13 21:29:07 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045427 %% Link Up: 1/0/5
Sep 13 21:29:07 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045428 %% Link Up: 1/0/10
Sep 13 21:29:08 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045429 %% Link Up: 1/0/9
Sep 13 21:29:08 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045430 %% Link Up: 1/0/12
Sep 13 21:29:08 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045431 %% Link Up: 1/0/11
Sep 13 21:29:09 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045433 %% Link Up: 1/0/17
Sep 13 21:29:09 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045434 %% Entity Database: Configuration Changed
Sep 13 21:29:10 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045435 %% Interface 1/0/21 attached to s-22455-02-01.
Sep 13 21:29:10 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045437 %% Link Up: 1/0/2
Sep 13 21:29:10 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045438 %% Interface 1/0/11 attached to compute3.
Sep 13 21:29:10 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045439 %% Interface 1/0/10 attached to compute2.
Sep 13 21:29:10 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045440 %% Interface 1/0/12 attached to compute4.
Sep 13 21:29:10 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045442 %% Link Up: 1/0/1
Sep 13 21:29:10 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045444 %% Link Up: 1/0/3
Sep 13 21:29:10 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045445 %% Interface 1/0/9 attached to compute1.
Sep 13 21:29:10 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045446 %% Interface 1/0/5 attached to ndemds01.
Sep 13 21:29:11 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045448 %% Link Up: 1/0/18
Sep 13 21:29:11 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045449 %% Interface 1/0/17 attached to nde32.
Sep 13 21:29:12 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045450 %% Interface 1/0/2 attached to ndesan02.
Sep 13 21:29:12 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045451 %% Interface 1/0/1 attached to ndesan01.
Sep 13 21:29:12 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045452 %% Interface 1/0/3 attached to ndesan03.
Sep 13 21:29:14 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045453 %% Interface 1/0/18 attached to control1.
Sep 13 21:29:15 s-22455-02-05 TRAPMGR[trapTask]: traputil.c(721) 1045455 %% Link Up: 1/0/4
Sep 13 21:29:18 s-22455-02-05 DOT3AD[dot3ad_core_lac]: dot3ad_db.c(951) 1045457 %% Interface 1/0/4 attached to ndesan04.
Sep 13 21:29:35 s-22455-02-05 UNITMGR[umWorkerTask]: unitmgr.c(7040) 1045459 %% Copy of running configuration to backup unit complete
Sep 13 21:30:58 s-22455-02-05 STACKING[spmTask]: spm.c(1434) 1045464 %% Errors detected on stack-port 1/0/23 (oldRxErrors = 0 currentRxErrors = 1 oldTxErrors = 0 currentTxErrors = 0). Use the stacking diagnostics command to look at detailed statistics.
Sep 13 21:30:58 s-22455-02-05 STACKING[spmTask]: spm.c(1434) 1045465 %% Errors detected on stack-port 1/0/24 (oldRxErrors = 0 currentRxErrors = 1 oldTxErrors = 0 currentTxErrors = 0). Use the stacking diagnostics command to look at detailed statistics.
--- cut here ---
The next message in the log is from Sep 18.
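For readers who want to pull a rough timeline out of an excerpt like the one above, here is a minimal Python sketch. It assumes every line follows the layout seen in the excerpt (timestamp, host, component[task], source location, sequence number, "%%", message). The function names parse_line and outage_seconds are hypothetical, and because the syslog timestamps carry no year, the year=2018 default is an assumption taken from the post date. It measures the time from the first "Stack-port link down" message to the first "New backup manager selected" message.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the excerpt above:
# "Sep 13 21:27:32 host COMPONENT[task]: file.c(123) 1045333 %% message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<component>\w+)\[(?P<task>\w+)\]: \S+ (?P<seq>\d+) %% (?P<msg>.*)$"
)


def parse_line(line, year=2018):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # The log carries no year, so the caller supplies one (assumption).
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return {
        "time": ts,
        "component": m["component"],
        "seq": int(m["seq"]),
        "msg": m["msg"],
    }


def outage_seconds(lines):
    """Seconds from the first stack-port link down to the next backup manager selection."""
    start = end = None
    for event in filter(None, map(parse_line, lines)):
        if start is None and event["msg"].startswith("Stack-port link down"):
            start = event["time"]
        elif start is not None and event["msg"].startswith("New backup manager selected"):
            end = event["time"]
            break
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

Fed the excerpt above, the first link-down entry is at 21:27:32 and the new backup manager is selected at 21:28:59, so outage_seconds would report 87 seconds.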
The stack is made up of two M4300-24x units, connected via direct stack link cables from 1/0/23 and 1/0/24 to 2/0/23 and 2/0/24. (The reports about errors on these two ports only ever appear right after the module reboots, similar to the last two lines above.)
LAG 21, which reports the STP topology change, is the redundant uplink to a central switch via 1/0/21 and 2/0/21.
The other ports on module 1 that are going down and up are connected redundantly to a server each, with the servers' second link going to the same port on switch module #2, running LACP on the LAG.
I collected all the switch reports I could get hold of on our TFTP server, via the CLI's "copy nvram:..." commands. The crash log is again an empty file, although the CLI command reported a successful transfer. All other files were transferred successfully, resulting in "content" on the TFTP server.
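To spot empty transfers like this crash log on the TFTP server side, a small check along these lines might help. This is only a sketch: the function name empty_files is hypothetical, and the directory passed in is whatever your TFTP server uses as its upload location.

```python
from pathlib import Path


def empty_files(directory):
    """Return the sorted names of zero-byte regular files directly inside directory."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )
```

Any name it returns is a file that arrived without content, regardless of what the switch CLI reported about the transfer.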
As it is the same physical switch rebooting, I tend to believe it is a hardware error of some sort.
Any ideas on how to proceed? Is unit 1 a candidate for a replacement?
With regards,
Jens
DaneA
Sep 20, 2018 (NETGEAR Employee Retired)
I suggest you open a chat or online support ticket with NETGEAR Support, describe your concern, and include the logs you have posted so they can be analyzed. If the switch is declared faulty, be ready to submit a .doc or .pdf copy of the Proof of Purchase or Sales Invoice for warranty verification. If the hardware warranty is still valid, an online replacement will follow.
Regards,
DaneA
NETGEAR Community Team