VMware Cloud Community
fish6288
Enthusiast

ESXi 4 iSCSI Issue

I am having a problem with ESXi 4 that I can't figure out. I have enabled the iSCSI feature and pointed it at the correct place, but when I rescan for new storage nothing shows up. If I click the "Paths" button, it shows the LUN number correctly, but under Status it shows DEAD with a red diamond. What am I missing here? This is the ESXi 4 60-day trial version, which I'm guessing is licensed for iSCSI use. Anyone have any ideas?

Thanks

32 Replies
paithal
VMware Employee

What is the FW version on the FAS270? This looks related to ALUA to me. Can you upload the entire log?

sbu77
Contributor

Yeah. I am suspecting the same and have been trying to research ALUA. Our FAS270's version is 7.0.6 (I know it's older, but upgrading is difficult for us at this point).

Apparently, the ALUA option for igroups on filers is only available in 7.2.x and later...

Output of esxcfg-mpath:

# esxcfg-mpath -L

vmhba0:C0:T0:L0 state:active mpx.vmhba0:C0:T0:L0 vmhba0 0 0 0 NMP active local block.cciss/c0d0:0 block.0:0

vmhba33:C0:T0:L1 state:dead (no device) vmhba33 0 0 1 (unclaimed) dead san iqn.1998-01.com.vmware:myhost-11ff050d 00023d000001,iqn.1992-08.com.netapp:sn.84296088,t,1

vmhba1:C0:T0:L0 state:active mpx.vmhba1:C0:T0:L0 vmhba1 0 0 0 NMP active local ide.vmhba1 ide.0:0

Output of /var/log/vmkernel after doing esxcfg-rescan vmhba33:

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.258 cpu0:4104)ScsiScan: 839: Path 'vmhba33:C0:T0:L0': Vendor: 'NETAPP ' Model: 'LUN ' Rev: '0.2 '

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.258 cpu0:4104)ScsiScan: 842: Path 'vmhba33:C0:T0:L0': Type: 0x1f, ANSI rev: 4, TPGS: 0 (none)

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.258 cpu0:4104)ScsiScan: 105: Path 'vmhba33:C0:T0:L0': Peripheral qualifier 0x1 not supported

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.259 cpu0:4104)ScsiScan: 839: Path 'vmhba33:C0:T1:L0': Vendor: 'NETAPP ' Model: 'LUN ' Rev: '0.2 '

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.259 cpu0:4104)ScsiScan: 842: Path 'vmhba33:C0:T1:L0': Type: 0x1f, ANSI rev: 4, TPGS: 0 (none)

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.259 cpu0:4104)ScsiScan: 105: Path 'vmhba33:C0:T1:L0': Peripheral qualifier 0x1 not supported

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.307 cpu0:4103)ScsiScan: 839: Path 'vmhba33:C0:T0:L1': Vendor: 'NETAPP ' Model: 'LUN ' Rev: '0.2 '

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.307 cpu0:4103)ScsiScan: 842: Path 'vmhba33:C0:T0:L1': Type: 0x0, ANSI rev: 4, TPGS: 1 (implicit only)

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)WARNING: VMW_SATP_ALUA: satp_alua_getTargetPortInfo: Could not find relative target port ID for path "vmhba33:C0:T0:L1" - Not found (195887107)

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)WARNING: NMP: nmp_SatpClaimPath: SATP "VMW_SATP_ALUA" could not add path "vmhba33:C0:T0:L1" for device "Unregistered Device". Error Not found

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)WARNING: NMP: nmp_DeviceAlloc: nmp_AddPathToDevice failed Not found (195887107).

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)WARNING: NMP: nmp_DeviceAlloc: Could not allocate NMP device.

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)WARNING: ScsiPath: 3707: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba33:C0:T0:L1'.Skipping the path.

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)ScsiClaimrule: 735: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba33:C0:T0:L1. Busy

Jun 3 14:49:40 myhost vmkernel: 0:01:57:26.308 cpu0:4105)ScsiClaimrule: 807: Error claiming path vmhba33:C0:T0:L1. Busy.

Jun 3 14:49:40 rocny-dlvfrnt02 vmkernel: 0:01:57:26.312 cpu1:4104)FSS: 3647: No FS driver claimed device '4a253d84-c8f9329b-72d2-001b78065040': Not supported

Note that my other host exhibits the same behavior (common entries in the log, starting with the bolded lines).
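To pull just the affected paths out of a log like the one above, something along these lines may help. It is a minimal sketch that assumes the host shell provides grep, sed and sort, and that the warning text matches the lines pasted above. The function name alua_failed_paths is made up for illustration.

```shell
#!/bin/sh
# List the unique paths that VMW_SATP_ALUA could not claim.
# Reads vmkernel log text on stdin; the match string comes from the
# "satp_alua_getTargetPortInfo" warnings shown in the log above.
alua_failed_paths() {
    grep 'satp_alua_getTargetPortInfo' |
        sed -n 's/.*for path "\([^"]*\)".*/\1/p' |
        sort -u
}

# Usage on a host (log location as used above):
#   alua_failed_paths < /var/log/vmkernel
```

Run against both hosts, the output should make it easy to confirm they are failing on the same LUN paths.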

fish6288
Enthusiast

I was wondering if it may be a FW / Data ONTAP version problem with our NetApp box. We are going to upgrade our NetApp to Data ONTAP 7.2 this weekend and see if it makes any difference. The supported hardware list from VMware shows that our 3020 needs to be at version 7.3, but one of our shelves is not supported by that Data ONTAP release, so I am going to try the latest one our device will support and will get back with the results.

paithal
VMware Employee

It definitely looks like an ALUA issue. "Peripheral qualifier 0x1 not supported" is OK; I guess you don't have LUN 0 configured/mapped. I have a FAS270 with ONTAP 7.2 and ALUA enabled, and it works great with ESX4. It looks like 7.0.6 reports itself as ALUA-capable but doesn't provide TPG info in the page 0x83 inquiry. You should either upgrade the FW to a newer version or disable ALUA on the igroup.
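To see what each path advertises, the TPGS field can be read straight from the ScsiScan lines. This is a rough sketch, assuming the scan line format matches the log excerpt posted earlier and that sed is available on the host. The function name scan_tpgs is hypothetical.

```shell
#!/bin/sh
# Print "<path> type=<peripheral type> tpgs=<reported TPGS>" for every
# ScsiScan line on stdin that carries a TPGS field. Lines without a
# TPGS field (vendor lines, warnings) are skipped.
scan_tpgs() {
    sed -n "s/.*Path '\([^']*\)': Type: \([^,]*\),.*TPGS: \(.*\)\$/\1 type=\2 tpgs=\3/p"
}

# Usage on a host (log location as used earlier in the thread):
#   scan_tpgs < /var/log/vmkernel
```

A path that shows a non-zero TPGS here and then fails with "Could not find relative target port ID" fits the pattern described above.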

sbu77
Contributor

Thanks paithal! Any idea how to disable ALUA on the igroup in 7.0.6 without upgrading? I don't believe that there is an "alua" option for "igroup set".

paithal
VMware Employee

I haven't worked with 7.0.6. On 7.2, the command to turn ALUA off is 'igroup set <igroup name> alua no'. Can you see what 'igroup show -v <igroup name>' gives?

The other quick workaround would be to delete the ALUA claim rule within ESX, i.e., esxcli nmp satp deleterule --satp VMW_SATP_ALUA --vendor NETAPP --option tpgs_on. I think this may not be persistent across reboots.
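Put together as a sequence, the host-side workaround might look like the sketch below. It only uses commands already quoted in this thread (the deleterule command, esxcfg-rescan and esxcfg-mpath -L). The run helper, the DRY_RUN switch and the function name are illustrative additions, and nothing is executed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 = iSCSI adapter to rescan afterwards (vmhba33 in the logs above)
remove_netapp_alua_rule() {
    adapter="$1"
    # Delete the claim rule that hands NETAPP LUNs to VMW_SATP_ALUA
    run esxcli nmp satp deleterule --satp VMW_SATP_ALUA --vendor NETAPP --option tpgs_on
    # Rescan so the path is claimed again under the remaining rules
    run esxcfg-rescan "$adapter"
    # Show path states to check whether the path is still dead
    run esxcfg-mpath -L
}

# Usage: remove_netapp_alua_rule vmhba33
```

Printing first gives a chance to review the exact commands before touching a production host.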

sbu77
Contributor

OK. Looks like ALUA is indeed the problem here:

Running "esxcli nmp satp deleterule --satp VMW_SATP_ALUA --vendor NETAPP --option tpgs_on" took care of the issue. The path is "alive" now.

Unfortunately, it doesn't look like 7.0.6 will allow me to manipulate alua settings:

igroup show -v GROUP

vSphere (iSCSI):

OS Type: vmware

Member: iqn.1998-01.com.vmware:myhost-11ff050d (logged in on: iswta)

igroup set GROUP alua no

igroup set: Invalid initiator group attribute type

Also, it looks like deleting the SATP rule survived the reboot.

Any possible issues with deleting this rule from the ESX host in production?

My guess, the best option would be to try to upgrade the filer FW...

Thanks again for your help, paithal!!

fish6288
Enthusiast

The ESX command also worked for me; I can now access the NetApp device's LUN.

Thanks for the help on this.

kertofer
Enthusiast

I just ran into this myself with a FAS270 on ONTAP 7.0.4, and this command worked for me. Afterwards we rebooted the ESX server to see if the change persisted, and when the server came back up we were still able to access all the LUNs, which leads me to believe that it is in fact held across reboots. My thought is that the best answer is to use this only as a stopgap and get upgraded to 7.2.x or better as soon as possible.

brimar5485
Contributor

For the newer vSphere ESX 4.1, try this command instead, because "--option" has changed to "--claim-option":

esxcli nmp satp deleterule --satp VMW_SATP_ALUA --claim-option tpgs_on

To list the rules associated with "VMW_SATP_ALUA", try this command:

esxcli nmp satp listrules -s VMW_SATP_ALUA
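Since the flag name depends on the release, a small lookup can keep the two variants straight. This is only a sketch: the function name is made up, and the two commands are copied as they were posted in this thread for 4.0 and 4.1 (the 4.1 variant was posted without --vendor). Other versions are deliberately rejected rather than guessed.

```shell
#!/bin/sh
# Print the deleterule command posted in this thread for a given
# ESX/ESXi version string. Returns non-zero for versions that the
# thread (up to this point) gives no command for.
alua_deleterule_cmd() {
    case "$1" in
        4.0*) echo "esxcli nmp satp deleterule --satp VMW_SATP_ALUA --vendor NETAPP --option tpgs_on" ;;
        4.1*) echo "esxcli nmp satp deleterule --satp VMW_SATP_ALUA --claim-option tpgs_on" ;;
        *)    echo "no command recorded for version $1" >&2
              return 1 ;;
    esac
}

# Usage: alua_deleterule_cmd 4.1
```

Checking the listrules output before and after is a reasonable way to confirm that the rule was actually removed.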

Partydude
Contributor

Thank you all, one and a half years on, this helped me resolve the "dead paths" issue on ESX 4.1 but on FCP. I disabled the ALUA using "igroup set igroupname alua no" on all iGroups on both our NetApp controllers and after rescanning the problematic ESX, all the dead paths disappeared and I could remount the LUNs. The environment has NetApp Ontap 7.3.2 and ESX 4.1 with FC connectivity.

After I restarted an ESX host, it could not map any LUNs, including the NetApp datastore that hosted the VMs, which caused about 3 hours of downtime until ALUA was disabled.

PeteLong
Contributor

- Cheers All - This helped!!

Enabling Tape Drives on vSphere Hosts

Pete

PeteNetLive

breakaway9000
Enthusiast

Sorry to dig up this really old thread, but I've got a NetApp FAS250 and an ESXi 5.0 hypervisor and have the 'dead path' issue, even though Windows hosts can connect to it fine. The firmware version is 7.0.6. There is no way to upgrade because the maintenance contract lapsed many moons ago...

I think the issue is similar to what is happening here. The solution

esxcli nmp satp deleterule --satp VMW_SATP_ALUA --claim-option tpgs_on

For listing the rules associated with "VMW_SATP_ALUA" try this command ...

esxcli nmp satp listrules -s VMW_SATP_ALUA

Does not help me -- I get this when I try it

# esxcli nmp satp listrules -s VMW_SATP_ALUA
Error: Unknown command or namespace nmp satp listrules

I'm guessing this is because I'm using ESXi and not ESX like the rest of you? I know the FAS250 is outdated but I sure as hell ain't throwing out 7 x 10,000RPM FC disks!

How can I disable ALUA on my ESXi 5.0 installation?
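One possible explanation for the "Unknown command or namespace" error is that the esxcli namespaces were reorganized in 5.x rather than anything specific to ESXi versus ESX. The sketch below assumes the 5.x equivalents live under "esxcli storage nmp satp rule", with list and remove subcommands. That mapping is an assumption not confirmed anywhere in this thread, so check it with "esxcli storage nmp satp rule --help" on the host first. The function only prints the candidate commands and its name is hypothetical.

```shell
#!/bin/sh
# Print candidate ESXi 5.x commands for inspecting and removing the
# NETAPP ALUA claim rule. ASSUMPTION: the "storage nmp satp rule"
# namespace and these long option names apply to the installed build.
# $1 = array vendor string (defaults to NETAPP, as seen in the logs)
esxi5_alua_rule_cmds() {
    vendor="${1:-NETAPP}"
    echo "esxcli storage nmp satp rule list --satp VMW_SATP_ALUA"
    echo "esxcli storage nmp satp rule remove --satp VMW_SATP_ALUA --vendor $vendor --claim-option tpgs_on"
}

# Usage: esxi5_alua_rule_cmds    # review the output, then run by hand
```

As in the earlier replies, a rescan of the iSCSI adapter would be needed afterwards before the path state changes.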

VMkernel log after running rescan operation:

2012-04-11T03:21:16.982Z cpu3:2667)usb-storage: detected SCSI revision number 0 on vmhba32
2012-04-11T03:21:16.982Z cpu3:2667)usb-storage: patching inquiry data to change SCSI revision number from 0 to 2 on vmhba32
2012-04-11T03:21:16.982Z cpu6:2173)ScsiUid: 273: Path 'vmhba32:C0:T0:L0' does not support VPD Device Id page.
2012-04-11T03:21:16.982Z cpu6:2173)VMWARE SCSI Id: Could not get disk id for vmhba32:C0:T0:L0
2012-04-11T03:21:16.983Z cpu3:2820)WARNING: VMW_SATP_ALUA: satp_alua_getTargetPortInfo:79:Could not find relative target port ID for path "vmhba34:C0:T1:L10" - Not found (195887107)
2012-04-11T03:21:16.983Z cpu3:2820)WARNING: NMP: nmp_SatpClaimPath:2093:SATP "VMW_SATP_ALUA" could not add  path "vmhba34:C0:T1:L10" for device "Unregistered Device". Error Not found
2012-04-11T03:21:16.983Z cpu3:2820)WARNING: NMP: nmp_DeviceAlloc:1228:nmp_AddPathToDevice failed Not found (195887107).
2012-04-11T03:21:16.983Z cpu3:2820)WARNING: NMP: nmp_DeviceAlloc:1237:Could not allocate NMP device.
2012-04-11T03:21:16.983Z cpu3:2820)WARNING: ScsiPath: 4550: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba34:C0:T1:L10'.Skipping the path.
2012-04-11T03:21:16.983Z cpu3:2820)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba34:C0:T1:L10. Busy
2012-04-11T03:21:16.983Z cpu3:2820)ScsiClaimrule: 1554: Error claiming path vmhba34:C0:T1:L10. Busy.
2012-04-11T03:21:16.989Z cpu4:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:16.999Z cpu4:161128)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x1a (0x4124007c5400) to dev "mpx.vmhba0:C0:T0:L0" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.Act:NONE
2012-04-11T03:21:16.999Z cpu4:161128)ScsiDeviceIO: 2316: Cmd(0x4124007c5400) 0x1a, CmdSN 0xa91a to dev "mpx.vmhba0:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2012-04-11T03:21:16.999Z cpu4:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.024Z cpu5:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.027Z cpu5:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.030Z cpu5:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.033Z cpu5:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.039Z cpu4:2820)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported
2012-04-11T03:21:17.039Z cpu4:2820)Vol3: 647: Couldn't read volume header from control: Invalid handle
2012-04-11T03:21:17.039Z cpu4:2820)FSS: 4333: No FS driver claimed device 'control': Not supported
2012-04-11T03:21:17.124Z cpu0:2820)VC: 1449: Device rescan time 40 msec (total number of devices 6)
2012-04-11T03:21:17.124Z cpu0:2820)VC: 1452: Filesystem probe time 77 msec (devices probed 5 of 6)
2012-04-11T03:21:17.278Z cpu0:2820)WARNING: VMW_SATP_ALUA: satp_alua_getTargetPortInfo:79:Could not find relative target port ID for path "vmhba34:C0:T1:L10" - Not found (195887107)
2012-04-11T03:21:17.278Z cpu0:2820)WARNING: NMP: nmp_SatpClaimPath:2093:SATP "VMW_SATP_ALUA" could not add  path "vmhba34:C0:T1:L10" for device "Unregistered Device". Error Not found
2012-04-11T03:21:17.278Z cpu0:2820)WARNING: NMP: nmp_DeviceAlloc:1228:nmp_AddPathToDevice failed Not found (195887107).
2012-04-11T03:21:17.278Z cpu0:2820)WARNING: NMP: nmp_DeviceAlloc:1237:Could not allocate NMP device.
2012-04-11T03:21:17.278Z cpu0:2820)WARNING: ScsiPath: 4550: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba34:C0:T1:L10'.Skipping the path.
2012-04-11T03:21:17.278Z cpu0:2820)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba34:C0:T1:L10. Busy
2012-04-11T03:21:17.278Z cpu0:2820)ScsiClaimrule: 1554: Error claiming path vmhba34:C0:T1:L10. Busy.
2012-04-11T03:21:17.284Z cpu0:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.304Z cpu0:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.351Z cpu7:2820)Vol3: 647: Couldn't read volume header from control: Invalid handle
2012-04-11T03:21:17.351Z cpu7:2820)FSS: 4333: No FS driver claimed device 'control': Not supported
2012-04-11T03:21:17.369Z cpu7:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.372Z cpu7:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.375Z cpu7:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.378Z cpu7:2820)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:17.384Z cpu7:2820)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported
2012-04-11T03:21:17.387Z cpu7:2820)VC: 1449: Device rescan time 27 msec (total number of devices 6)
2012-04-11T03:21:17.387Z cpu7:2820)VC: 1452: Filesystem probe time 77 msec (devices probed 5 of 6)
2012-04-11T03:21:18.917Z cpu1:2821)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:18.921Z cpu1:2821)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:18.965Z cpu1:2821)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:18.968Z cpu1:2821)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:18.971Z cpu5:2821)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:18.974Z cpu5:2821)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2012-04-11T03:21:18.980Z cpu1:2821)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported
2012-04-11T03:21:19.025Z cpu4:2821)Vol3: 647: Couldn't read volume header from control: Invalid handle
2012-04-11T03:21:19.025Z cpu4:2821)FSS: 4333: No FS driver claimed device 'control': Not supported
2012-04-11T03:21:19.046Z cpu1:2821)VC: 1449: Device rescan time 55 msec (total number of devices 6)
2012-04-11T03:21:19.046Z cpu1:2821)VC: 1452: Filesystem probe time 78 msec (devices probed 5 of 6)

Message was edited by: breakaway9000 -- added Logs
