
Forum Discussion

jlehtinen
Aspirant
Mar 29, 2013

ESXi reports "All Paths Down" for ReadyNAS hosted NFS share

Hiya - looking for some feedback from the community on an issue I'm seeing. Thanks in advance for any insights.

Some background:
We're using two ReadyNAS 3200s to host virtual machines via NFS.
ESXi hosts are running ESXi 5.1.
The ReadyNAS units are running 4.2.19, and have "adaptive load balancing" set on the NICs.

Issue:
Some of the ESXi hosts report that NFS shares enter an "All Paths Down" state for 6-7 seconds before exiting it and reconnecting. This happens with BOTH ReadyNAS units and across 9 ESXi hosts, with no solid pattern as to which host is impacted OR which ReadyNAS shows as "All Paths Down". It DOES appear to be related to the current load on the ReadyNAS. For example, if I start a backup job, I can expect to see this error on at least 3-4 ESXi hosts. I believe this has been happening for a while without anyone noticing, but it caused a HUGE issue 2 weeks ago, when one of the ReadyNAS units entered and exited the "All Paths Down" state nonstop while backups were running. (I opened a support case with Netgear and submitted the logs, but they could not explain why this happened.)
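To check whether the outages really cluster around backup jobs, the enter/exit events described above could be paired up and measured. The following is only a rough sketch: it assumes the events have already been pulled out of the ESXi logs into `(timestamp_seconds, host, share, kind)` tuples, where `kind` is `"enter"` or `"exit"`. That tuple layout, the `ApdWindow` type, and the `apd_windows` function are hypothetical names for illustration, not anything ESXi or the ReadyNAS provides.

```python
from collections import namedtuple

# Hypothetical record for one "All Paths Down" outage window.
ApdWindow = namedtuple("ApdWindow", "host share start duration")


def apd_windows(events):
    """Pair 'enter'/'exit' events per (host, share) into outage windows.

    `events` is an iterable of (timestamp_seconds, host, share, kind)
    tuples. This input format is an assumption; extracting it from the
    actual logs is left to whoever runs this.
    """
    open_since = {}  # (host, share) -> timestamp of the unmatched "enter"
    windows = []
    for ts, host, share, kind in sorted(events, key=lambda e: e[0]):
        key = (host, share)
        if kind == "enter":
            # Keep the earliest "enter" if several arrive before an "exit".
            open_since.setdefault(key, ts)
        elif kind == "exit" and key in open_since:
            start = open_since.pop(key)
            windows.append(ApdWindow(host, share, start, ts - start))
        # An "exit" with no matching "enter" is ignored.
    return windows


if __name__ == "__main__":
    sample = [
        (100.0, "esx1", "nas1", "enter"),
        (106.5, "esx1", "nas1", "exit"),
        (101.0, "esx2", "nas2", "enter"),
        (108.0, "esx2", "nas2", "exit"),
    ]
    for w in apd_windows(sample):
        print(w.host, w.share, w.start, w.duration)
```

Comparing the resulting start times against the backup schedule would show whether the 6-7 second windows line up with load, and whether several hosts lose the same share at the same moment.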

Current theory:
From what I can tell, adaptive load balancing causes the ReadyNAS to change which MAC address (and NIC) receives a given portion of the overall traffic. My guess is that when I run backups (or do anything else load intensive), the ReadyNAS attempts to rebalance some of the traffic going to the ESXi hosts. The resulting change in the MAC address reported to the ESXi host causes ESXi to report "All Paths Down" briefly, until the new MAC address/NIC resolves correctly.
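One way to test this theory would be to record, from a host on the storage network, which MAC address the ReadyNAS IP resolves to over time, and then see whether a change lines up with an "All Paths Down" event. The sketch below only covers the comparison step. It assumes the observations have already been collected as `(timestamp, ip, mac)` tuples; how they are gathered (for example by periodically dumping the host's ARP/neighbor table) is left open, and `mac_changes` is a hypothetical helper name.

```python
def mac_changes(observations):
    """Return (timestamp, ip, old_mac, new_mac) for every MAC change seen.

    `observations` is an iterable of (timestamp, ip, mac) tuples in
    chronological order. The format is an assumption for illustration.
    """
    last_mac = {}  # ip -> most recently observed MAC
    changes = []
    for ts, ip, mac in observations:
        mac = mac.lower()  # normalize case so AA:BB and aa:bb compare equal
        previous = last_mac.get(ip)
        if previous is not None and previous != mac:
            changes.append((ts, ip, previous, mac))
        last_mac[ip] = mac
    return changes


if __name__ == "__main__":
    # Made-up addresses, purely for demonstration.
    sample = [
        (0, "192.0.2.10", "00:11:22:33:44:01"),
        (5, "192.0.2.10", "00:11:22:33:44:01"),
        (10, "192.0.2.10", "00:11:22:33:44:02"),
    ]
    for change in mac_changes(sample):
        print(change)
```

If the MAC for a ReadyNAS IP never changes around the time of an outage, the load balancing theory becomes less likely; if it changes a few seconds before each one, that would support it.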

The major outage we experienced 2 weeks ago must have been due to a glitch or bug in the load balancing that kept the ReadyNAS from "stabilizing" correctly. I was only able to stabilize the unit by power cycling it.

Questions:
1.) Does this sound like a plausible theory? My current thinking is that I should disable load balancing and switch to an active-backup configuration to see if this resolves the issue.

2.) Will a firmware update resolve this issue? I reviewed the firmware patch notes and none of them mention NFS stability with NIC teaming.

