Network performance slowdown after a few hours
I have a ReadyNAS NVX connected to a Netgear GS108T smart switch through two CAT6 cables using link aggregation with LACP on the switch. The ReadyNAS ports are set up as 802.3ad LACP, with the Layer 3+4 xmit hash policy. (The switch ports and LAG are set up as edge ports, but turning that setting on or off doesn't seem to affect the problem.)
Network settings from logs below.
Typically when I start the NAS up I can copy large files to my desktop at around 50-65 MB/s, which is quite reasonable considering the hardware limitations. However after a few hours of heavy file copying the speed goes down to 6-7 MB/s.
Rebooting the ReadyNAS speeds things right back up, so I can't help but think that something on the ReadyNAS degrades over time. Perhaps a full cache or something similar.
I'm looking for any possible solutions to this issue.
bond0     Link encap:Ethernet  HWaddr 00:22:3F:A9:FE:56
          inet addr:10.0.0.27  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::222:3fff:fea9:fe56/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:87804911 errors:0 dropped:441 overruns:0 frame:0
          TX packets:254931345 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:752847125 (717.9 MiB)  TX bytes:374615947 (357.2 MiB)

eth0      Link encap:Ethernet  HWaddr 00:22:3F:A9:FE:56
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:87735909 errors:0 dropped:0 overruns:0 frame:0
          TX packets:43207 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:739297710 (705.0 MiB)  TX bytes:16557638 (15.7 MiB)
          Base address:0xde80 Memory:fea40000-fea60000

eth1      Link encap:Ethernet  HWaddr 00:22:3F:A9:FE:56
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:69002 errors:0 dropped:6 overruns:0 frame:0
          TX packets:254888138 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13549415 (12.9 MiB)  TX bytes:358058309 (341.4 MiB)
          Base address:0xdf00 Memory:fea60000-fea80000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:29326 errors:0 dropped:0 overruns:0 frame:0
          TX packets:29326 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2084454 (1.9 MiB)  TX bytes:2084454 (1.9 MiB)

tunl0     Link encap:IPIP Tunnel  HWaddr
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

IPV6_ASSIGN_eth0=STATELESS
JUMBO_FRAMES_eth0=0
SPEED_DUPLEX_eth0=AUTO_NEGOTIATION

Settings for bond0:
          Link detected: yes

Settings for eth1:
          Supported ports: [ TP ]
          Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
          Supports auto-negotiation: Yes
          Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
          Advertised auto-negotiation: Yes
          Speed: 1000Mb/s
          Duplex: Full
          Port: Twisted Pair
          PHYAD: 1
          Transceiver: internal
          Auto-negotiation: on
          Supports Wake-on: umbg
          Wake-on: d
          Current message level: 0x00000007 (7)
          Link detected: yes

Settings for eth0:
          Supported ports: [ TP ]
          Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
          Supports auto-negotiation: Yes
          Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
          Advertised auto-negotiation: Yes
          Speed: 1000Mb/s
          Duplex: Full
          Port: Twisted Pair
          PHYAD: 0
          Transceiver: internal
          Auto-negotiation: on
          Supports Wake-on: umbg
          Wake-on: d
          Current message level: 0x00000007 (7)
          Link detected: yes

bond0:4:eth0,eth1:1
Re: Network performance slowdown after a few hours
What firmware are you running? Also, what OS is the PC running?
FWIW, the bond will not increase your single-user speeds, so you could try disconnecting one of the cables and see if that makes any difference. My guess is that it won't though.
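A rough illustration of why a single copy stays on one link: with the layer3+4 xmit hash policy, every packet of one TCP connection hashes to the same slave. This is only a sketch, assuming the formula given in the older Linux bonding documentation, ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff)) modulo slave count; newer kernels hash differently. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: which bond slave a TCP flow would use under layer3+4,
# assuming the formula from the older Linux bonding documentation.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into $1..$4 on the dots
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Args: src_ip src_port dst_ip dst_port slave_count
# Prints the zero-based index of the slave the flow hashes to.
bond_slave() {
    src=$(ip_to_int "$1")
    dst=$(ip_to_int "$3")
    ports=$(( $2 ^ $4 ))
    ips=$(( ($src ^ $dst) & 0xffff ))
    echo $(( ($ports ^ $ips) % $5 ))
}

# Usage (the PC address and client port are hypothetical):
#   bond_slave 10.0.0.27 445 10.0.0.50 50000 2
```

The inputs are fixed for the lifetime of a connection, so the result is too. Only a second connection with a different port or address can land on the other cable.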
Re: Network performance slowdown after a few hours
Yep, I tried single cable. No difference. I've tried connecting with only one switch between PC and ReadyNAS as well with the same results. Right now I'm running a Netgear GS108T smart switch for the NAS, connecting with two cables (LACP with RSTP) to a Netgear GS716T smart switch which connects to the PC.
Maybe it has something to do with the networking between the switches, but that wouldn't explain why restarting the ReadyNAS fixes it. Or?
ReadyNAS firmware is 4.2.28, and I'm on Windows 10. However, I had similar problems in Windows 7 and 8.1, but they didn't appear as "reliably" as now.
Model: ReadyNAS NVX Business Edition [X-RAID2]
Serial: 23N5047F00839
Firmware: RAIDiator 4.2.28
Memory: 1024 MB [6-6-6-24 DDR2]
Re: Network performance slowdown after a few hours
@Starlionblue3 wrote:
Maybe it has something to do with the networking between the switches...
I think you've ruled that out. If you want to be 100% certain you could try a direct connect. Maybe next time try rebooting the PC instead of the NAS, and see if it also recovers. How are you doing the copies?
Re: Network performance slowdown after a few hours
I'll try rebooting the PC next time and we'll see.
I'm using TeraCopy for the file copies. However I have the same issue with Windows native file copy mechanism.
On a side note, TeraCopy is outstanding and is how the native Windows copy should work.
Re: Network performance slowdown after a few hours
@Starlionblue3 wrote:
I'm using TeraCopy for the file copies.
Is it verifying, or just copying? Also, perhaps check that Win10 isn't set up to make the network share available off-line.
Re: Network performance slowdown after a few hours
Just copying, I don't use the verify function, and Win 10 doesn't store the share offline.
Re: Network performance slowdown after a few hours
It is a puzzle.
It still might be useful to see if rebooting the PC on the next occurrence helps.
Re: Network performance slowdown after a few hours
I managed to test this after the latest instance. Rebooting the PC does not help. Rebooting the ReadyNAS still consistently does.
On a side note, this slowdown occurred overnight and I hadn't even done any copying. I'm starting to get the impression that this depends on elapsed time more than on the volume of files copied.
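One way to pin down when the slowdown sets in would be to log the bond's transmit rate on the NAS at a fixed interval while a copy runs. The following is a minimal sketch, assuming ssh access, a POSIX shell with awk on the NAS, and the usual Linux /proc/net/dev layout; the function names are made up. It also assumes 32-bit byte counters, which the small byte totals next to the large packet counts in the ifconfig output above suggest.

```shell
#!/bin/sh
# Sketch: log an interface's transmit rate at a fixed interval.

# Print "rx_bytes tx_bytes" for interface $1, reading
# /proc/net/dev-style input on stdin.
iface_bytes() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }            # "bond0:123" and "bond0: 123" both occur
        $1 == ifc { print $2, $10 }  # field 2 = RX bytes, field 10 = TX bytes
    '
}

# MB/s between two counter readings ($1 then $2) taken $3 seconds apart.
# Assumes 32-bit counters that wrap at 2^32, at most once per interval.
rate_mb() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN {
        d = b - a
        if (d < 0) d += 4294967296
        printf "%.1f\n", d / s / 1000000
    }'
}

# Print a timestamped TX rate for interface $1 every $2 seconds
# until interrupted with ctrl-c.
log_tx_rate() {
    prev=$(iface_bytes "$1" < /proc/net/dev | awk '{ print $2 }')
    while sleep "$2"; do
        cur=$(iface_bytes "$1" < /proc/net/dev | awk '{ print $2 }')
        echo "$(date '+%Y-%m-%d %H:%M:%S') $(rate_mb "$prev" "$cur" "$2") MB/s"
        prev=$cur
    done
}

# Usage: log_tx_rate bond0 10
```

Redirecting the output to a file would leave a timeline showing whether the drop is sudden or gradual.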
Re: Network performance slowdown after a few hours
I've now tested with a single cable from the ReadyNAS to the switch.
Initially transfers are at around 50-65 MB/s as expected. However after an hour or so I'm only getting around half that. This is still about 4-5 times faster than what I'm getting after an hour with teamed NICs on the ReadyNAS.
Rebooting the ReadyNAS brings the speed back up to 50-65.
This would seem to indicate that the problem is indeed with the ReadyNAS, or at least how it talks to the switch.
Re: Network performance slowdown after a few hours
Can you send me your logs (see the Sending Logs link in my sig)?
Re: Network performance slowdown after a few hours
@Starlionblue3 wrote:
This would seem to indicate that the problem is indeed with the ReadyNAS, or at least how it talks to the switch.
Did you look at the GS108T stats for the NAS port(s)? You can of course also look at the stats for the link between the switches and for the GS716T<->PC link.
Also, check that flow control is enabled in both switches (I think it is off by default). Though it seems unlikely that persistent queue overflow would take hours to occur, it is conceivable.
You said that you tested with the NAS and the PC connected to one switch - one last thing you could try is connecting both to the other switch.
If you aren't seeing packet loss, and flow control is enabled, and the problem occurs with both smartswitches then I think you've ruled out the network. That would leave the ReadyNAS.
FWIW, my pro-6 uses LACP with my GS724T switch, and I haven't ever experienced your problem.
Re: Network performance slowdown after a few hours
First of all, let me say thank you for your persistence.
I turned on Flow Control (indeed off by default) but there was no change.
I have tried with the other switch. Same problem.
I have looked at the stats but not quite sure what I'm looking for. Error counters are at zero in any case. Any hint at what stats are relevant?
Maybe it's just time to buy a new ReadyNAS.
Re: Network performance slowdown after a few hours
@Starlionblue3 wrote:
I have looked at the stats but not quite sure what I'm looking for. Error counters are at zero in any case. Any hint at what stats are relevant?
Basically the error counts:
Total Packets Received with MAC Errors
Jabbers Received
Fragments Received
Undersize Received
Alignment Errors
Rx FCS Errors
Overruns
Total Transmit Errors
Tx FCS Errors
Underrun Errors
Total Transmit Packets Discarded
Single Collision Frames
Multiple Collision Frames
Excessive Collision Frames
Port Membership Discards
Overruns is probably the most interesting, as non-zero there means the switch is dropping valid packets it received. If that is non-zero, then look at the 802.3x pause frame counts (which will be non-zero if flow control is actually kicking in).
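If you copy the port statistics into a text file, the check above can be automated. This is a minimal sketch, assuming one counter per line with the value as the last field (the switch's own export format may differ); the function name is made up.

```shell
#!/bin/sh
# Sketch: print only the error-type counters that are non-zero.
# Reads "Counter Name value" lines on stdin.
nonzero_errors() {
    awk '
        # Match the error counters, but skip the "Without Errors" total,
        # which is a count of good packets.
        /Errors|Jabbers|Fragments|Undersize|Overruns|Discard|Collision/ &&
        !/Without Errors/ {
            if ($NF + 0 != 0) print
        }
    '
}

# Usage (the file name is hypothetical): nonzero_errors < port_stats.txt
```

No output means every error counter in the file is zero.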
@Starlionblue3 wrote:
Maybe it's just time to buy a new ReadyNAS.
Could be.
I think it's safe to say you've ruled out the network and the PC. That leaves the NAS.
If it's software-related then a factory reset might clear it, but that of course requires rebuilding the NAS and restoring all the data. Perhaps mdgm will see something in the logs.
Re: Network performance slowdown after a few hours
Here are those stats for the port that is connected to the NAS. Seems pretty ok to me. This is for the last 4 days. I've cleared the stats so I can see going forward.
Interface MST ID ifIndex 2 Port Type Port Channel ID Disable Port Role Designated STP Mode STP State Forwarding Admin Mode Enable LACP Mode Enable Physical Mode Auto Physical Status 1000 Mbps Full Duplex Link Status Link Up Link Trap Enable

Packets RX and TX 64 Octets: 42060716
Packets RX and TX 65-127 Octets: 38762274
Packets RX and TX 128-255 Octets: 61169716
Packets RX and TX 256-511 Octets: 3915622
Packets RX and TX 512-1023 Octets: 7158940
Packets RX and TX 1024-1518 Octets: 699265981
Packets RX and TX > 1522 Octets: 0

Octets Received: 1009375168145
Packets Received 64 Octets: 4097217
Packets Received 65-127 Octets: 23235280
Packets Received 128-255 Octets: 49211835
Packets Received 256-511 Octets: 1030613
Packets Received 512-1023 Octets: 6980459
Packets Received 1024-1518 Octets: 654130809
Packets Received > 1522 Octets: 0
Total Packets Received Without Errors: 738686213
Unicast Packets Received: 738661091
Multicast Packets Received: 22476
Broadcast Packets Received: 2646
Total Packets Received with MAC Errors: 0
Jabbers Received: 0
Fragments Received: 0
Undersize Received: 0
Alignment Errors: 0
Rx FCS Errors: 0
Overruns: 0
802.3x Pause Frames Received: 2
Broadcast Storm Recovery: 0

Total Packets Transmitted (Octets): 75842619485
Packets Transmitted 64 Octets: 37963499
Packets Transmitted 65-127 Octets: 15526994
Packets Transmitted 128-255 Octets: 11957881
Packets Transmitted 256-511 Octets: 2885009
Packets Transmitted 512-1023 Octets: 178481
Packets Transmitted 1024-1518 Octets: 45135172
Packets Transmitted > 1522 Octets: 0
Maximum Frame Size: 9216
Total Packets Transmitted Successfully: 113647036
Unicast Packets Transmitted: 109339356
Multicast Packets Transmitted: 2266419
Broadcast Packets Transmitted: 2041261
Total Transmit Errors: 0
Tx FCS Errors: 0
Underrun Errors: 0
Total Transmit Packets Discarded: 0
Single Collision Frames: 0
Multiple Collision Frames: 0
Excessive Collision Frames: 0
Port Membership Discards: 0

STP BPDUs Received: 0
STP BPDUs Transmitted: 0
RSTP BPDUs Received: 0
RSTP BPDUs Transmitted: 46153
MSTP BPDUs Received: 0
MSTP BPDUs Transmitted: 0
802.3x Pause Frames Transmitted: 18692
EAPOL Frames Received: 0
EAPOL Frames Transmitted: 0
Time Since Counters Last Cleared: 4 day 22 hr 13 min 35 sec
If it's software-related then a factory reset might clear it, but that of course requires rebuilding the NAS and restoring all the data. Perhaps mdgm will see something in the logs.
I could do an OS reinstall. That would clear the NIC settings. I'll try that once I have time. Probably tomorrow.
Re: Network performance slowdown after a few hours
The stats look fine to me also. Of course you should check all the ports on the path between the NAS and the PC.
An OS reinstall is a long shot (I don't really think it's the NIC settings). More likely it's some other performance bottleneck in the NAS, memory perhaps. A filling OS partition can have strange effects, but yours isn't one I've seen before.
However, the OS reinstall won't hurt anything either, and is simple to do. You could also enable ssh, and run top while the transfer is running - that might give some more info.
mdgm - did you have a chance to look at the logs?
Re: Network performance slowdown after a few hours
A while ago I had a rather serious Frontview issue caused by excessive caching which you fixed for me. See this thread. Could this be related?
You could also enable ssh, and run top while the transfer is running - that might give some more info.
Could you please tell me how to do that? It's been a while since I did ssh stuff. 😉
Re: Network performance slowdown after a few hours
I use putty on the PC to connect with ssh.
While there, you might as well check the OS partition size
df . -i
df . -h
To run top, run putty full-screen and type
top
ctrl-c exits.
There are a few options to sort the process order - you can google the man page for that. The overall cpu % and memory use should give you some idea of whether the NAS software is bogging down. So note what that is before you begin, when the test is running full-speed, and after it stalls out.
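If watching the screen for hours is impractical, top can also be run in batch mode and filtered. This is a rough sketch, assuming the procps version of top (which supports -b and -n) and its default column layout, with %CPU in column 9 and the command name in column 12; check the header line on your NAS first. The function name is made up.

```shell
#!/bin/sh
# Sketch: sum the %CPU column for one command name in `top -b` output.
# Reads top's batch output on stdin; $1 is the command name, e.g. smbd.
proc_cpu() {
    awk -v name="$1" '
        $12 == name { sum += $9 }        # add up every matching process
        END { printf "%.1f\n", sum }     # prints 0.0 when nothing matched
    '
}

# Usage:
#   top -b -n 1 | proc_cpu smbd
# or, to sample once a minute with a timestamp:
#   while sleep 60; do echo "$(date) $(top -b -n 1 | proc_cpu smbd)"; done
```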
Re: Network performance slowdown after a few hours
Monitoring with top, the Samba daemon (smbd) shows the following CPU utilization, the last three during large file copies:
- No file copy. 0%.
- Streaming a movie. 1-2%.
- Single NIC connection. 20-30%.
- Teamed NIC with LACP initially. 50%.
- Teamed NIC with LACP after slowdown. 10-30%.
This time something interesting happened while I was watching the teamed NIC test: the speed came back up to full by itself, which I haven't seen before. So there might be a correlation with CPU utilization. When utilization gets quite high, speeds seem to drop: just before the slowdown it was quite high, and once I was getting top speeds again it stayed around 15%.
Df results below.
Atlas:~# df . -i
Filesystem    Inodes  IUsed  IFree  IUse%  Mounted on
/dev/md0       65536  13592  51944    21%  /

Atlas:~# df . -h
Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/md0      4.0G  543M   3.3G   14%  /
Re: Network performance slowdown after a few hours
Since I've reinstated the link aggregation between NAS and switch everything seems fine. Maybe the LAG configuration on the switch was somehow corrupt?
I'm going to go bang my head on the wall now...
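For anyone who hits the same thing: the NAS side of a suspect LAG can be checked over ssh by looking at whether both slaves joined the same LACP aggregator. This is a sketch only, assuming the standard Linux bonding status file /proc/net/bonding/bond0 and its usual 802.3ad layout (a "Slave Interface:" line per slave, followed later by an "Aggregator ID:" line); I have not verified the layout on this firmware, and the function names are made up.

```shell
#!/bin/sh
# Sketch: list each slave's aggregator ID from bonding status on stdin.
slave_aggregators() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        # Ignore the "Active Aggregator Info" block that precedes the slaves.
        slave != "" && /Aggregator ID:/ { print slave, $NF; slave = "" }
    '
}

# Count the distinct aggregator IDs across all slaves.
# 1 means every slave is in the same aggregator.
aggregator_count() {
    slave_aggregators | awk '!seen[$2]++ { n++ } END { print n + 0 }'
}

# Usage: aggregator_count < /proc/net/bonding/bond0
```

A result above 1 would mean the slaves did not negotiate into one group, which would point at the LACP setup rather than at the NAS software.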
Re: Network performance slowdown after a few hours
@Starlionblue3 wrote:
I'm going to go bang my head on the wall now...
I understand the feeling. Looking back, I should have suggested rebooting the switches (not sure if that would have helped).
Let us know if the problem recurs.
BTW, your OS partition stats look fine.
Re: Network performance slowdown after a few hours
Tested this morning. 12 hours on and speed is still good.
Thanks for checking the logs!