Reply

Re: Routers dropping characters in their log file?

Donth
Aspirant

Routers dropping characters in their log file?

I have my logs mailed to me on a regular basis and scan them.  

Every once in a while, the log file text is becoming corrupt and drops characters.  

 

For example:

 

[DoS Attack: RST Scn] rmsuc:2524.175.102, port 80, Tuesday, April 24, 2018 13:23:46
[DoS Attack: RST Scan] from source: 205.234.175.102, port 80, Tuesday, April 24, 2018 13:23:33

This is obviously the same message with dropped characters.  The top line should read

"[Dos Attack: RST Scan] from source: 205.234.175.102, port 80 Tuesday April 24, 2018 13:23:46"

 

It happens on a regular basis.  And the characters drop on a random occurance.  So what's going on here?  It's worth investigating because it suggest memory or NAND corruption.  A simple log file readback after write will verify the log file and if it doesn't match you can trap the code to stop and examine what happened.



Model: R7800|Nighthawk X4S AC2600 WiFi Router
Message 1 of 9
schumaku
Guru

Re: Routers dropping characters in their log file?

Same issue in the log when viewed on the router itself?

Message 2 of 9
IrvSp
Master

Re: Routers dropping characters in their log file?

I've seen this on my old R7000 and R8000 I now use. Right now I can't locate it in my e-mailed logs? I'm wondering if I only saw it on the LOG tab on the browser?

 

I've always seen near other entries seconds apart, so I attributed them to the router being busy?

 

Ahh, did find it, I posted it to THIS THREAD, msg 30:

 

[USB device attached] The USB storage device WD 10EADS External (USB_ROUTER) is attached to the rou Friday, Apr 20,2018 15:42:07
[USB device detached] The USB storage device WD 10EADS External (USB_ROUTER) is removed from the ro Friday, Apr 20,2018 15:41:41

 

'ro' should be router.

 

There are other entries there with the other errors in the  message, like in message 22:

 

[USB device attached] The USB storage device WD 10EADS External (USB_ROUTER) is attached to the  Wednesday, Apr 18,2018 10:15:07
[USB device detached] The USB storage device WD 10EADS External (USB_ROUTER) is removed from the Wednesday, Apr 18,2018 10:14:53

 

'router' missing in both.

 

It seems since I rebooted that hasn't happened again (corruption I mean as I have many messages like the above in logs from later dates).

Message 3 of 9
schumaku
Guru

Re: Routers dropping characters in their log file?

@IrvS wrote:

 

...

I've always seen near other entries seconds apart, so I attributed them to the router being busy?

 

[USB device attached] The USB storage device WD 10EADS External (USB_ROUTER) is attached to the rou Friday, Apr 20,2018 15:42:07
[USB device detached] The USB storage device WD 10EADS External (USB_ROUTER) is removed from the ro Friday, Apr 20,2018 15:41:41

 

'ro' should be router.

 

There are other entries there with the other errors in the  message, like in message 22:

 

[USB device attached] The USB storage device WD 10EADS External (USB_ROUTER) is attached to the  Wednesday, Apr 18,2018 10:15:07
[USB device detached] The USB storage device WD 10EADS External (USB_ROUTER) is removed from the Wednesday, Apr 18,2018 10:14:53

 

'router' missing in both.

 

It seems since I rebooted that hasn't happened again (corruption I mean as I have many messages like the above in logs from later dates).


Different from the random character loss, this looks to me more like a systematic truncation - typical (mis-)behaviour of embedded system developers - thinking and desging things in static tables. Let's look at this (and some log entries from my R9000) in fixed pitch text:

 

Fixed width log.PNG

 

Remains the question what is randomly concatenating the name of the external device here, too.

Message 4 of 9
Donth
Aspirant

Re: Routers dropping characters in their log file?

It appears in both emails sent and the router log as viewed from the webpage.

 

Something is definitely amiss.  Maybe bad NAND?  Writing to it too fast?

 

I wonder if theres a way to underclock the processor say like 10% and see if these errors go away.

Message 5 of 9
IrvSp
Master

Re: Routers dropping characters in their log file?


@Donth wrote:

It appears in both emails sent and the router log as viewed from the webpage.

 

Something is definitely amiss.  Maybe bad NAND?  Writing to it too fast?

 

I wonder if theres a way to underclock the processor say like 10% and see if these errors go away.


I'd think other than changing the firmware code and waiting to write when not busy (could require a large buffer depending on how much has to be written in a short time) there isn't any real way to fix it other than that. A lot can be happening in a router you are not aware of. Many apps/programs 'call home' often to check for updates. The router itself has to be able to receive packets and basically determine what has to be done, even if it is to reject pings or other probes.

Message 6 of 9
IrvSp
Master

Re: Routers dropping characters in their log file?

@schumaku,

 

"Different from the random character loss, this looks to me more like a systematic truncation - typical (mis-)behaviour of embedded system developers - thinking and desging things in static tables. Let's look at this (and some log entries from my R9000) in fixed pitch text:"

 

Truncation would normally be the length of a message. Shortening a name doesn't make sense, and always would be the same. The mere fact that it is a 'full' entry sometimes and some parts cut-off other times doesn't make sense in a coding environment. Usually code does the same thing every time. Note that this time I got the FULL message:

 

[USB device attached] The USB storage device WD 10EADS External (USB_ROUTER) is attached to the router, Wednesday, Apr 25,2018 22:17:17
[USB device detached] The USB storage device WD 10EADS External (USB_ROUTER) is removed from the router, Wednesday, Apr 25,2018 22:15:28

Newer routers have faster CPU's and some have off-load devices. Still, I've seen the same thing happen in an R7000 and R8000. I don't think it is a coding choice or problem but a processing problem, and it could be a priority handling situation as the log is considered a lower priority.

Message 7 of 9
Donth
Aspirant

Re: Routers dropping characters in their log file?


@IrvSp wrote:

Newer routers have faster CPU's and some have off-load devices. Still, I've seen the same thing happen in an R7000 and R8000. I don't think it is a coding choice or problem but a processing problem, and it could be a priority handling situation as the log is considered a lower priority.


Well the 7800 doesn't have an offload chip.  So I doubt there's any I2C communications issues.  Even if an interrupt stopped processing, there's no reason the stack and message state shouldn't be preserved.  If it can't than either the chip fails qualification, or there is a firmware issue with handling the interrupts.  I doubt such a serious flaw would pass chip validation, let alone multiple chip model validations (if it appears on multiple models)  So this leaves firmware as the culprit.  And you might have guessed I do have some experience with the internals of how hardware and firmware interact.

An easy test would be to print the interrupt count on beginning and end of every message and see if there are interrupts called between the start of the format and the end of the format.  Then scan the log line for dropping of characters, which should be easy to do.

 

It would slow things down a bit, but it's only needed for a developer debug environment.
If it does appear to be interrupt related, log entries when formatted or written can be frozen with a lock which temporarily disables the interrupt.

Message 8 of 9
IrvSp
Master

Re: Routers dropping characters in their log file?

"Well the 7800 doesn't have an offload chip."

 

True, but my R8000 has 3 of them.

 

"If it can't than either the chip fails qualification, or there is a firmware issue with handling the interrupts."

 

Unless you use something like WireShark and can see what is coming into the router (and I don't think and client based PC can do that side of the router) you'll never know if an incoming TCP/IP packet was missed (to a router these could be considered interrupts)?

 

"So this leaves firmware as the culprit."

 

I'm sure no matter what the cause this is the case, be it switching from logging to another task or the f/w purposely dropping the data to go do that other task. What I can't fathom though is the continuation of the log item after some part was dropped. I'd think the only cause could be the router f/w either didn't recover fast enough to get the 'rest' of the info or poor coding. Notice sometimes it was a letter or two, in my case, then end of the log data (I suspect the date/time was captured when the event happened, hence there was no problem appending that to the end of the entry.

 

"And you might have guessed I do have some experience with the internals of how hardware and firmware interact." Your not the only one here with that experience.

 

Of course once NG was informed of the problem it would be easy to test as you described. Fixing it, another story. I think the lowest priority would be proper logging vs. proper interrupt handling for ALL router functions. Until it is know what the cause was, you don't know what a fix could be.

Message 9 of 9
Discussion stats
  • 8 replies
  • 662 views
  • 0 kudos
  • 3 in conversation
Announcements