

scott_mac
Aspirant
Aug 31, 2011

VMware iSCSI Dropouts

Hi Folks,

I've had a search and found an issue similar to ours, but can't seem to find a definitive solution. Basically, I'm having similar issues to this thread: http://www.readynas.com/forum/viewtopic.php?f=126&t=54095

However I am seeing the following errors:

Lost connectivity to storage device naa.60014052e106910005cd001000000000. Path vmhba33:C1:T0:L0 is down. Affected datastores: "SBS Datastore". error 31/08/2011 16:35:10

Connectivity to storage device naa.60014052e106910005cd001000000000 (Datastores: "SBS Datastore") restored. Path vmhba33:C1:T0:L0 is active again. info 31/08/2011 16:35:13


As you can see, we're losing the connection for about 3 seconds. Unfortunately, in the last 2 weeks we've had 2 major events that necessitated a full hard reset of both ESX hosts and the ReadyNAS (3200) that we're running on. In those events, the datastore was showing within vSphere as "Inactive".
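
For anyone wanting to measure these gaps rather than eyeball them, here is a rough sketch that pairs each "Lost connectivity" event with the following "restored" event for the same device and reports the gap. It assumes the events have been copied out as plain text lines in exactly the format shown above (day/month/year timestamps at the end of each line); the function name `dropout_durations` and the `SAMPLE_EVENTS` list are only illustrative.

```python
import re
from datetime import datetime

# Matches the two event lines quoted above. Assumes each line ends with a
# dd/mm/yyyy hh:mm:ss timestamp, as in the copied vSphere events.
EVENT_RE = re.compile(
    r"(?P<kind>Lost connectivity|Connectivity) to storage device "
    r"(?P<device>naa\.[0-9a-f]+)"
    r".*?(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s*$"
)

SAMPLE_EVENTS = [
    'Lost connectivity to storage device naa.60014052e106910005cd001000000000. '
    'Path vmhba33:C1:T0:L0 is down. Affected datastores: "SBS Datastore". '
    'error 31/08/2011 16:35:10',
    'Connectivity to storage device naa.60014052e106910005cd001000000000 '
    '(Datastores: "SBS Datastore") restored. Path vmhba33:C1:T0:L0 is active '
    'again. info 31/08/2011 16:35:13',
]


def dropout_durations(lines):
    """Return (device, start_time, seconds_down) for each lost/restored pair."""
    down_since = {}  # device id -> time connectivity was first lost
    outages = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not one of the two event types we care about
        when = datetime.strptime(match["stamp"], "%d/%m/%Y %H:%M:%S")
        device = match["device"]
        if match["kind"] == "Lost connectivity":
            # Keep the earliest loss if several are logged before a restore.
            down_since.setdefault(device, when)
        elif device in down_since:
            start = down_since.pop(device)
            outages.append((device, start, (when - start).total_seconds()))
    return outages


for device, start, seconds in dropout_durations(SAMPLE_EVENTS):
    print(f"{device} down at {start:%H:%M:%S} for {seconds:.0f}s")
```

Run over a longer export, the same list makes it easier to see whether the dropouts really cluster around a fixed interval.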

There are no logs showing on the ReadyNAS (unless they're stored elsewhere!), the Health shows everything is fine, and once the ReadyNAS is rebooted and the ESX hosts are back up, everything works fine... except that I am now seeing the random dropouts shown above. At their most frequent they happen every 5 minutes, although there have been none for the past hour. I can live with random dropouts in the short term, as nothing actually seems to be affected, but I cannot lose our entire system for the time it takes to reboot everything.

It has been suggested that we could have a corrupt VMFS on the "SBS Datastore", as we have one iSCSI target with 3 datastores on it and the other two are working absolutely fine.

I know we're not running the latest firmware, as I am terrified of upgrading it at the moment. However, nothing has changed in terms of software versions or hardware, yet suddenly this issue has popped up...

Can anyone help?

Thanks in advance

Scott

8 Replies

  • Hey Scott,

    You might want to try adjusting the MTU on your hosts and switches. Some switches may require additional encapsulation overhead, so you may need to set the switches to a slightly higher MTU than your end nodes (ESX and your array).

    As it's happening so often, it would be helpful to have a packet trace from the time of the disconnection. You can then analyse it with Wireshark or provide it to support in the event the issue isn't immediately clear.

    Cheers!
  • Upgrade your version of RAIDiator - early versions did have issues with dropouts.

    You should also be doing round-robin iSCSI load balancing via multipath.
  • What version of RAIDiator are you running? I am experiencing this issue from time to time as well.
  • Hi,

    We're on 4.2.19, which at the time of writing is the latest. The issue went away immediately after upgrading, but recently we had a failure in the unit: one of the disks failed to respond in time and the entire RAID array gave up. Obviously this was far from ideal and recovery has been long, but thankfully we're back up and running. Bizarrely, however, the same error is now happening again.

    Bizarre and very irritating!

    Scott
  • I opened a case this morning for the same issue. I do not see anything on the ESXi hosts, but I did see a bunch of release messages every 5 seconds in the ReadyNAS log. I have been having this issue since I installed the unit one month ago. Oddly, I only have a serious issue (total disconnect) on Fridays between 12:00 and 1:00 EST. Today the tech support agent had me use the same time servers as the ESXi hosts (the factory setting is to use NETGEAR's time servers), and we turned off the automatic update check feature. I will post back Friday with results. NETGEAR, I hope you are listening: I have purchased 4 of these for my customers, and one will be installed this Friday in a production environment.
  • Having a similar issue. Updated to 4.2.19, the latest version, and now get dropouts once per week, at the same time each week. The device goes offline (iSCSI) for most of the day. Hoping for an update rollback.
  • You should disable CHAP authentication and just use iSCSI initiator security. Then it will all be solved!
  • looneyM wrote:
    I opened a case this morning for the same issue. I do not see anything on the ESXi hosts, but I did see a bunch of release messages every 5 seconds in the ReadyNAS log. I have been having this issue since I installed the unit one month ago. Oddly, I only have a serious issue (total disconnect) on Fridays between 12:00 and 1:00 EST. Today the tech support agent had me use the same time servers as the ESXi hosts (the factory setting is to use NETGEAR's time servers), and we turned off the automatic update check feature. I will post back Friday with results. NETGEAR, I hope you are listening: I have purchased 4 of these for my customers, and one will be installed this Friday in a production environment.


    Hi Looney, got any updates on this so far? Not sure if upgrading helped here.
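
On the MTU suggestion in the first reply: one common way to check that a given MTU really passes end to end is a don't-fragment ping whose payload fills the frame exactly (on ESXi, for example, `vmkping -d -s <size> <target>`). The sketch below only does the arithmetic for that size. It assumes plain IPv4 with no IP options (20-byte header) and an 8-byte ICMP header, and `max_ping_payload` is just an illustrative name.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} leaves no room for a payload")
    return mtu - overhead


# Standard Ethernet and a typical jumbo-frame setting.
for mtu in (1500, 9000):
    print(f"MTU {mtu}: payload {max_ping_payload(mtu)} bytes")
```

If a ping at that size fails with fragmentation disabled while a smaller one succeeds, something on the path (host, switch, or array) is using a lower MTU than expected.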
