Forum Discussion
Starlionblue3
Jan 26, 2016 Tutor
Network performance slowdown after a few hours
I have a ReadyNAS NVX connected to a Netgear GS108T smart switch through two CAT6 cables using link aggregation with LACP on the switch. The ReadyNAS ports are set up as 802.3ad LACP, with Layer 3+4 ...
StephenB
Jan 27, 2016 Guru - Experienced User
It is a puzzle.
It still might be useful to see whether rebooting the PC on the next occurrence helps.
Starlionblue3
Jan 27, 2016 Tutor
I managed to test this after the latest instance. Rebooting the PC does not help. Rebooting the ReadyNAS still consistently does.
On a side note, this slowdown occurred overnight and I hadn't even done any copying. I'm starting to get the impression that this is time-based rather than tied to the volume of files copied.
- Starlionblue3 Jan 27, 2016 Tutor
I've now tested with a single cable from the ReadyNAS to the switch.
Initially transfers are at around 50-65 MB/s as expected. However, after an hour or so I'm only getting around half that. This is still about 4-5 times faster than what I'm getting after an hour with teamed NICs on the ReadyNAS.
Rebooting the ReadyNAS brings the speed back up to 50-65.
This would seem to indicate that the problem is indeed with the ReadyNAS, or at least how it talks to the switch.
- mdgm-ntgr Jan 27, 2016 NETGEAR Employee Retired
Can you send me your logs (see the Sending Logs link in my sig)?
- Starlionblue3 Jan 27, 2016 Tutor
Done
- StephenB Jan 27, 2016 Guru - Experienced User
Starlionblue3 wrote:
This would seem to indicate that the problem is indeed with the ReadyNAS, or at least how it talks to the switch.
Did you look at the GS108T stats for the NAS port(s)? You can of course also look at the stats for the link between the switches and the GS716T<->PC.
Also, check that flow control is enabled in both switches (I think it is off by default). Though it seems unlikely that persistent queue overflow would take hours to occur, it is conceivable.
You said that you tested with the NAS and the PC connected to one switch - one last thing you could try is connecting both to the other switch.
If you aren't seeing packet loss, flow control is enabled, and the problem occurs with both smart switches, then I think you've ruled out the network. That would leave the ReadyNAS.
FWIW, my pro-6 uses LACP with my GS724T switch, and I haven't ever experienced your problem.
- Starlionblue3 Jan 28, 2016 Tutor
First of all, let me say thank you for your persistence. :smileyhappy:
I turned on Flow Control (indeed off by default) but there was no change.
I have tried with the other switch. Same problem.
I have looked at the stats but I'm not quite sure what I'm looking for. Error counters are at zero in any case. Any hint at which stats are relevant?
Maybe it's just time to buy a new ReadyNAS. :smileywink:
- StephenB Jan 28, 2016 Guru - Experienced User
Starlionblue3 wrote:
I have looked at the stats but not quite sure what I'm looking for. Error counters are at zero in any case. Any hint at what stats are relevant?
Basically the error counts:
Total Packets Received with MAC Errors
Jabbers Received
Fragments Received
Undersize Received
Alignment Errors
Rx FCS Errors
Overruns
Total Transmit Errors
Tx FCS Errors
Underrun Errors
Total Transmit Packets Discarded
Single Collision Frames
Multiple Collision Frames
Excessive Collision Frames
Port Membership Discards
Overruns is probably the most interesting, as a non-zero value there means the switch is dropping valid packets it received. If that is non-zero, then look at the 802.3x pause frame counts (which will be non-zero if flow control is actually kicking in).
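As a rough illustration of the check described above, here is a minimal Python sketch. It assumes the port statistics have already been copied by hand into a plain dict keyed by the counter names from the list; the function name `review_port_counters` and the dict layout are hypothetical and not part of any NETGEAR tool.

```python
# Counter names are taken from the list in this post. The dict layout and the
# function name are illustrative assumptions, not a NETGEAR API.
ERROR_COUNTERS = [
    "Total Packets Received with MAC Errors",
    "Jabbers Received",
    "Fragments Received",
    "Undersize Received",
    "Alignment Errors",
    "Rx FCS Errors",
    "Overruns",
    "Total Transmit Errors",
    "Tx FCS Errors",
    "Underrun Errors",
    "Total Transmit Packets Discarded",
    "Single Collision Frames",
    "Multiple Collision Frames",
    "Excessive Collision Frames",
    "Port Membership Discards",
]

PAUSE_COUNTERS = [
    "802.3x Pause Frames Received",
    "802.3x Pause Frames Transmitted",
]


def review_port_counters(stats):
    """Return a list of findings for one port; an empty list means all clear."""
    findings = []
    for name in ERROR_COUNTERS:
        value = stats.get(name, 0)
        if value:
            findings.append(f"{name} = {value}")
    # Only look at pause frames when the switch reports dropped packets.
    if stats.get("Overruns", 0):
        pause_total = sum(stats.get(name, 0) for name in PAUSE_COUNTERS)
        findings.append(f"pause frames seen: {pause_total}")
    return findings


clean_port = {name: 0 for name in ERROR_COUNTERS}
print(review_port_counters(clean_port))
```

An empty result for every port on the path between the NAS and the PC would point away from the network, which is the reasoning used in this thread.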
Starlionblue3 wrote:
Maybe it's just time to buy a new ReadyNAS. :smileywink:
Could be :smileyhappy:
I think it's safe to say you've ruled out the network and the PC. That leaves the NAS.
If it's software related then a factory reset might clear it - but that of course requires rebuilding the NAS and restoring all the data. Perhaps mdgm will see something in the logs.
- Starlionblue3 Jan 28, 2016 Tutor
Here are those stats for the port that is connected to the NAS. Seems pretty ok to me. This is for the last 4 days. I've cleared the stats so I can see going forward.
Interface:
MST ID:
ifIndex: 2
Port Type:
Port Channel ID: Disable
Port Role: Designated
STP Mode:
STP State: Forwarding
Admin Mode: Enable
LACP Mode: Enable
Physical Mode: Auto
Physical Status: 1000 Mbps Full Duplex
Link Status: Link Up
Link Trap: Enable
Packets RX and TX 64 Octets: 42060716
Packets RX and TX 65-127 Octets: 38762274
Packets RX and TX 128-255 Octets: 61169716
Packets RX and TX 256-511 Octets: 3915622
Packets RX and TX 512-1023 Octets: 7158940
Packets RX and TX 1024-1518 Octets: 699265981
Packets RX and TX > 1522 Octets: 0
Octets Received: 1009375168145
Packets Received 64 Octets: 4097217
Packets Received 65-127 Octets: 23235280
Packets Received 128-255 Octets: 49211835
Packets Received 256-511 Octets: 1030613
Packets Received 512-1023 Octets: 6980459
Packets Received 1024-1518 Octets: 654130809
Packets Received > 1522 Octets: 0
Total Packets Received Without Errors: 738686213
Unicast Packets Received: 738661091
Multicast Packets Received: 22476
Broadcast Packets Received: 2646
Total Packets Received with MAC Errors: 0
Jabbers Received: 0
Fragments Received: 0
Undersize Received: 0
Alignment Errors: 0
Rx FCS Errors: 0
Overruns: 0
802.3x Pause Frames Received: 2
Broadcast Storm Recovery: 0
Total Packets Transmitted (Octets): 75842619485
Packets Transmitted 64 Octets: 37963499
Packets Transmitted 65-127 Octets: 15526994
Packets Transmitted 128-255 Octets: 11957881
Packets Transmitted 256-511 Octets: 2885009
Packets Transmitted 512-1023 Octets: 178481
Packets Transmitted 1024-1518 Octets: 45135172
Packets Transmitted > 1522 Octets: 0
Maximum Frame Size: 9216
Total Packets Transmitted Successfully: 113647036
Unicast Packets Transmitted: 109339356
Multicast Packets Transmitted: 2266419
Broadcast Packets Transmitted: 2041261
Total Transmit Errors: 0
Tx FCS Errors: 0
Underrun Errors: 0
Total Transmit Packets Discarded: 0
Single Collision Frames: 0
Multiple Collision Frames: 0
Excessive Collision Frames: 0
Port Membership Discards: 0
STP BPDUs Received: 0
STP BPDUs Transmitted: 0
RSTP BPDUs Received: 0
RSTP BPDUs Transmitted: 46153
MSTP BPDUs Received: 0
MSTP BPDUs Transmitted: 0
802.3x Pause Frames Transmitted: 18692
EAPOL Frames Received: 0
EAPOL Frames Transmitted: 0
Time Since Counters Last Cleared: 4 day 22 hr 13 min 35 sec
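For a sense of scale, the octet counters and the "Time Since Counters Last Cleared" value can be turned into an average rate. This is only a back-of-the-envelope Python sketch using the figures posted here; the helper name `elapsed_seconds` is made up for illustration, and the result is a long-run average that includes idle time, so it says nothing about peak transfer speed.

```python
def elapsed_seconds(days, hours, minutes, seconds):
    """Convert a 'D day H hr M min S sec' counter age into seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Values copied from the port statistics in this post.
window = elapsed_seconds(4, 22, 13, 35)
octets_received = 1009375168145      # "Octets Received" on the switch port
octets_transmitted = 75842619485     # "Total Packets Transmitted (Octets)"

# Average bytes per second over the whole counter window (idle time included).
rx_rate = octets_received / window
tx_rate = octets_transmitted / window

print(f"window: {window} s")
print(f"average received by switch port: {rx_rate / 1e6:.2f} MB/s")
print(f"average transmitted by switch port: {tx_rate / 1e6:.2f} MB/s")
```

Clearing the counters, as done here, and re-reading them after a single timed copy would give a much more useful figure than a multi-day average.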
StephenB wrote:
If it's software related then a factory reset might clear it - but that of course requires rebuilding the NAS and restoring all the data. Perhaps mdgm will see something in the logs.
I could do an OS reinstall. That would clear the NIC settings. I'll try that once I have time. Probably tomorrow.
- StephenB Jan 28, 2016 Guru - Experienced User
The stats look fine to me also. Of course you should check all the ports on the path between the NAS and the PC.
An OS reinstall is a long shot (I don't really think it's the NIC settings). More likely it's some other performance bottleneck in the NAS - memory perhaps. A filling OS partition can have strange effects, but yours isn't one I've seen before.
However, the OS reinstall won't hurt anything either, and is simple to do. You could also enable ssh, and run top while the transfer is running - that might give some more info.
mdgm - did you have a chance to look at the logs?
- Starlionblue3 Jan 28, 2016 Tutor
A while ago I had a rather serious Frontview issue caused by excessive caching which you fixed for me. See this thread. Could this be related?
StephenB wrote:
You could also enable ssh, and run top while the transfer is running - that might give some more info.
Could you please tell me how to do that? It's been a while since I did ssh stuff. ;)
- StephenB Jan 28, 2016 Guru - Experienced User
I use putty on the PC to connect with ssh.
While there, you might as well check the OS partition size
df . -i
df . -h
To run top, run putty full-screen and type
top
ctrl-c exits.
There are a few options to sort the process order - you can google the man page for that. The overall cpu % and memory use should give you some idea of whether the NAS software is bogging down. So note what they are before you begin, while the test is running at full speed, and after it stalls out.
- Starlionblue3 Jan 29, 2016 Tutor
Monitoring with top, the Samba daemon (smbd) shows the following CPU utilization during large file copies:
- No file copy. 0%.
- Streaming a movie. 1-2%.
- Single NIC connection. 20-30%.
- Teamed NIC with LACP initially. 50%.
- Teamed NIC with LACP after slowdown. 10-30%.
This time something interesting happened while I was watching the teamed NIC test: the speed came back up to full by itself. I haven't seen that before. CPU utilization stayed around 15% while I was getting top speeds, whereas just before the slowdown it was quite high. So there might be a correlation with CPU utilization: when it becomes quite high, speeds seem to slow.
Df results below.
Atlas:~# df . -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/md0 65536 13592 51944 21% /
Atlas:~# df . -h
Filesystem Size Used Avail Use% Mounted on
/dev/md0 4.0G 543M 3.3G 14% /
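If you want to check df output like this repeatedly, a small Python sketch along these lines could pull out the usage percentages. It assumes the usual df layout (a header line, then one row per filesystem with the percentage in the second-to-last column); the function name `parse_df` and the 80% warning threshold are arbitrary choices for illustration, not ReadyNAS recommendations.

```python
def parse_df(output):
    """Parse df output into rows; assumes a header line followed by data rows."""
    rows = []
    lines = [line for line in output.strip().splitlines() if line.strip()]
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        rows.append({
            "filesystem": fields[0],
            # The use percentage is the second-to-last column in both -i and -h output.
            "use_percent": int(fields[-2].rstrip("%")),
            "mounted_on": fields[-1],
        })
    return rows


# Sample text copied from the df results in this post.
INODES = """Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/md0 65536 13592 51944 21% /"""
SPACE = """Filesystem Size Used Avail Use% Mounted on
/dev/md0 4.0G 543M 3.3G 14% /"""

WARN_PERCENT = 80  # hypothetical threshold, chosen only for this example

for label, text in (("inodes", INODES), ("space", SPACE)):
    for row in parse_df(text):
        status = "check" if row["use_percent"] >= WARN_PERCENT else "ok"
        print(f"{label} on {row['mounted_on']}: {row['use_percent']}% ({status})")
```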
- Starlionblue3 Jan 29, 2016 Tutor
Since I've reinstated the link aggregation between NAS and switch everything seems fine. Maybe the LAG configuration on the switch was somehow corrupt?
I'm going to go bang my head on the wall now...
- StephenB Jan 29, 2016 Guru - Experienced User
Starlionblue3 wrote:
I'm going to go bang my head on the wall now...
I understand the feeling. Looking back, I should have suggested rebooting the switches (not sure if that would have helped).
Let us know if the problem recurs.
BTW, your OS partition stats look fine.
- Starlionblue3 Jan 30, 2016 Tutor
Tested this morning. 12 hours on and speed is still good.
Thanks for checking the logs!