Forum Discussion

jeremyotten
Aspirant
Mar 22, 2011

Readynas 4200 losing iSCSI connection for 5 secs (15072737)

Hello,

We have 2x VMware ESXi 4.1 U1 servers with four Gigabit paths each to the iSCSI SAN (ReadyNAS 4200).

The problem is that we sometimes lose the connection and the entire VMware environment hangs for 5 seconds. We let VMware analyse the logs and they came back with this:

Hello Jeremy,

vm-support logs show that ESXi host lost access to NetGear ReadyNAS4200 iSCSI array :

messages.2:Mar 21 14:39:26 vmkernel: 4:19:08:18.685 cpu11:4844)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba37:CH:1 T:0 CN:0: iSCSI connection is being marked "OFFLINE" (Event:6)
messages.2:Mar 21 14:39:26 vmkernel: 4:19:08:18.685 cpu11:4844)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: 00023d000003 TARGET: iqn.2010-12.BIGFOOT:vmware.lun0 TPGT: 1 TSIH: 0]
messages.2:Mar 21 14:39:26 vmkernel: 4:19:08:18.685 cpu11:4844)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 192.168.0.52:51125 R: 192.168.0.3:3260]

messages.1:Mar 21 14:39:28 vmkernel: 4:19:08:20.654 cpu7:4103)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x41027f1ca540) to NMP device "naa.60014052e10897000c6d002000000000" failed on physical path "vmhba37:C1:T0:L1" H:0x2 D:0x0 P:0x0 Possible sense data: 0
messages.1:Mar 21 14:39:28 vmkernel: 4:19:08:20.654 cpu7:4103)ScsiDeviceIO: 1672: Command 0x28 to device "naa.60014052e10897000c6d002000000000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
messages.1:Mar 21 14:39:28 vmkernel: 4:19:08:20.676 cpu1:4097)NMP: nmp_CompleteCommandForPath: Command 0x12 (0x41027f9cc840) to NMP device "naa.60014052e10897000c6d003000000000" failed on physical path "vmhba37:C1:T0:L2" H:0x2 D:0x0 P:0x0 Possible sense data: 0
messages.1:Mar 21 14:39:28 vmkernel: 4:19:08:20.676 cpu1:4097)ScsiDeviceIO: 1672: Command 0x12 to device "naa.60014052e10897000c6d003000000000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0.

messages.1:Mar 21 14:39:31 vmkernel: 4:19:08:22.974 cpu2:4844)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba37:CH:1 T:0 CN:0: iSCSI connection is being marked "ONLINE"

On the other hand, the logs also show that the iSCSI LUNs are using MRU as path policy and that the storage array is detected as VMW_SATP_ALUA.
This needs to be validated with your SAN vendor, as our HCL states that the storage array type should be VMW_SATP_DEFAULT_AA and the path policy should be VMW_PSP_FIXED.
Please engage your SAN vendor to have this validated by them.



Please assist in fixing this!

We are on 4.2.15-SP1
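
For anyone who wants to measure these drops across a whole vm-support bundle, here is a minimal Python sketch that pairs the "OFFLINE" and "ONLINE" events from the iscsi_vmk lines quoted above and reports how long each connection was down. The function name `outage_windows`, the regular expression, and the `year` parameter are my own assumptions (the syslog timestamps carry no year); only the two sample lines come from the log excerpt, with their "messages.N:" file prefixes stripped.

```python
import re
from datetime import datetime

# Assumed pattern, built from the iscsi_vmk lines quoted in this thread.
EVENT_RE = re.compile(
    r'(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) vmkernel: .*?'
    r'iscsivmk_(?:Stop|Start)Connection: '
    r'(?P<conn>vmhba\d+:CH:\d+ T:\d+ CN:\d+): '
    r'iSCSI connection is being marked "(?P<state>OFFLINE|ONLINE)"'
)

def outage_windows(lines, year=2011):
    """Yield (connection, offline_time, online_time, seconds) per outage.

    The year is an assumption: syslog timestamps do not include one.
    """
    offline_since = {}
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue  # not an OFFLINE/ONLINE marker line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        conn = m["conn"]
        if m["state"] == "OFFLINE":
            # Keep the first OFFLINE time if several are logged in a row.
            offline_since.setdefault(conn, ts)
        elif conn in offline_since:
            start = offline_since.pop(conn)
            yield conn, start, ts, (ts - start).total_seconds()

# Two lines taken from the excerpt above.
SAMPLE = [
    'Mar 21 14:39:26 vmkernel: 4:19:08:18.685 cpu11:4844)WARNING: iscsi_vmk: '
    'iscsivmk_StopConnection: vmhba37:CH:1 T:0 CN:0: iSCSI connection is '
    'being marked "OFFLINE" (Event:6)',
    'Mar 21 14:39:31 vmkernel: 4:19:08:22.974 cpu2:4844)WARNING: iscsi_vmk: '
    'iscsivmk_StartConnection: vmhba37:CH:1 T:0 CN:0: iSCSI connection is '
    'being marked "ONLINE"',
]

if __name__ == "__main__":
    for conn, start, end, secs in outage_windows(SAMPLE):
        print(f"{conn}: offline {start:%H:%M:%S}, online {end:%H:%M:%S}, {secs:.0f} s")
```

On the two sample lines this reports a single 5-second window, which matches the hang we see. Feeding it the full messages files (for example via `open(path)`) should show whether the drops cluster at particular times.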

19 Replies

  • Interestingly, the connection to an NFS share on the other LAN card drops at the same time.
  • Switch is a Cisco 2960g (48 ports all Gb), RN and ESX connected to the switch.

    ESX:
    NIC1 - Trunked for management and Virtual Machines
    NIC2 - not connected
    NIC3 - Access port VLAN 70 iSCSI only
    NIC4 - Access port VLAN 70 iSCSI only

    RN4200:
    NIC 1 - Access port VLAN 70 iSCSI only
    NIC 2 - access port VLAN 144

    When I say Access port, it's like this:

    interface g0/33
    switchport mode access
    switchport access vlan 70


    And the ESX is set to no VLAN.

    There's no router linking VLAN 70 to 40.

    RAIDiator is 4.2.19

    Thanks for your tips so far, you're just helping me confirm I haven't done anything stupid. :)
  • Just using the software initiator. I think the cards (they're Broadcom) can be configured as initiators, but I started using the software initiator before I discovered that.
  • Software is fine.

    Just read the PDF link and do the things they say. Also, on your switches, make an LACP trunk to bundle the NICs.

    We have 2x NETGEAR GS724TS stackable switches.

    Each ESX host has 4 NICs dedicated to iSCSI. Traffic goes over these links because the iSCSI initiators and the iSCSI SAN (ReadyNAS) are in a different subnet than the VMs.

    2 NICs of a host go to switch 1, the other 2 NICs go to switch 2. Because the switches are stackable, I can make 1 LACP link of 4 ports across the 2 switches, maximizing the throughput!

    Also, we are on RAIDiator 4.2.16!!
  • Now I think I've tried everything...

    Multiple cards,
    LACP link on the switch

    Multiple vmk interfaces


    It's still doing it.
    Is nobody else seeing this with the 4200?
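
Since the NFS share drops at the same moment as iSCSI, it may help to check from a third machine whether the ReadyNAS becomes unreachable at the network level during those seconds. Below is a rough Python sketch of such a probe: it tries a TCP connect to the iSCSI port once per interval and collapses the results into drop windows. The function names (`port_reachable`, `find_drops`, `monitor`) and the default interval and duration are hypothetical; 3260 is the target port seen in the log above. Note that this only tests whether new connections can be opened, not the state of an existing iSCSI session, so treat it as a hint rather than proof.

```python
import socket
import time

def port_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def find_drops(samples):
    """Collapse (timestamp, reachable) samples into (start, end) drop windows.

    A drop that is still ongoing at the last sample is not reported.
    """
    drops, start = [], None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts            # first failed probe of a new drop
        elif ok and start is not None:
            drops.append((start, ts))
            start = None          # target is back, close the window
    return drops

def monitor(host, port=3260, interval=1.0, duration=60.0):
    """Probe host:port every `interval` seconds for `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.time(), port_reachable(host, port)))
        time.sleep(interval)
    return find_drops(samples)
```

Running something like `monitor("192.168.0.3", duration=3600)` from a box on the iSCSI VLAN, and a second copy against the NFS address, would show whether both interfaces go dark together. If they do, that points at the ReadyNAS itself or the switch rather than at the ESX path policy.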
