I'm having iSCSI issues, specifically timeouts when attempting to connect to the SAN over 10GbE.
I see this in the vmkernel log:
Apr 18 10:43:19 devesx91 vmkernel: 0:00:27:21.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Logical device "naa.600144f02860c80000004dac395c0001": awaiting fast path state update...
Apr 18 10:43:55 devesx91 vmkernel: 0:00:27:57.192 cpu12:4278)ScsiDeviceIO: 1672: Command 0x28 to device "naa.600144f02860c80000004dac395c0001" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Apr 18 10:43:55 devesx91 vmkernel: 0:00:27:57.192 cpu12:4278)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "naa.600144f02860c80000004dac395c0001" is blocked. Not starting I/O from device.
Apr 18 10:43:55 devesx91 vmkernel: 0:00:27:57.195 cpu0:4096)VMNIX: VMKFS: 2132: timed out
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world restore device "naa.600144f02860c80000004dac395c0001" - no more commands to retry
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "naa.600144f02860c80000004dac395c0001" due to Not found
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.600144f02860c80000004dac395c0001": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "naa.600144f02860c80000004dac395c0001" is blocked. Not starting I/O from device.
Apr 18 10:43:57 devesx91 vmkernel: 0:00:27:59.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.600144f02860c80000004dac395c0001" - issuing command 0x41027f9d2840
Apr 18 10:43:57 devesx91 vmkernel: 0:00:27:59.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.600144f02860c80000004dac395c0001" - failed to issue command due to Not found (APD), try again...
Apr 18 10:43:57 devesx91 vmkernel: 0:00:27:59.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Logical device "naa.600144f02860c80000004dac395c0001": awaiting fast path state update...
Apr 18 10:43:58 devesx91 vmkernel: 0:00:27:59.659 cpu12:4278)ScsiDeviceIO: 1672: Command 0x28 to device "naa.600144f02860c80000004dac395c0001" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Apr 18 10:43:58 devesx91 vmkernel: 0:00:27:59.660 cpu0:4096)VMNIX: VMKFS: 2132: timed out
And from the vmkiscsidb:
2011-04-18-10:25:35: iscsid: Login Success: iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)
2011-04-18-10:25:35: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) is operational
2011-04-18-10:25:35: iscsid: DISCOVERY: Pending=0 Failed=0
2011-04-18-10:25:46: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:25:49: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) has recovered (2 attempts)
2011-04-18-10:27:07: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:27:10: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) has recovered (2 attempts)
2011-04-18-10:27:37: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:27:40: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) has recovered (2 attempts)
2011-04-18-10:28:06: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:28:09: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) has recovered (2 attempts)
2011-04-18-10:28:35: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1) (T0 C0)) Nop-out timeout after 10 sec in state (3).
dmesg on iscsi vmnic as well:
[ 695.933187] end_request: I/O error, dev sdc, sector 0
[ 695.937820] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.937821] end_request: I/O error, dev sdc, sector 0
[ 695.942416] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.942417] end_request: I/O error, dev sdc, sector 0
[ 695.947074] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.947075] end_request: I/O error, dev sdc, sector 0
[ 695.951703] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.951705] end_request: I/O error, dev sdc, sector 0
[ 695.956309] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.956310] end_request: I/O error, dev sdc, sector 0
[ 695.960931] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.960932] end_request: I/O error, dev sdc, sector 0
[ 695.965590] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.965591] end_request: I/O error, dev sdc, sector 0
[ 695.970195] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.970197] end_request: I/O error, dev sdc, sector 0
[ 695.974825] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.974826] end_request: I/O error, dev sdc, sector 0
[ 695.979996] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.979997] end_request: I/O error, dev sdc, sector 0
[ 695.984593] sd 8:0:1:0: SCSI error: return code = 0x00020000
[ 695.984595] end_request: I/O error, dev sdc, sector 0
Anyone have any ideas?
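For anyone reading the dmesg lines above, here is a rough sketch of how the repeated return code can be split into its fields. It assumes the service console follows the usual Linux SCSI result layout (driver byte, host byte, message byte, status byte, from high to low); the function names are made up for illustration.

```shell
#!/bin/sh
# Host byte: bits 16-23 of the SCSI result word (assumed Linux layout).
host_byte() {
    echo $(( ($1 >> 16) & 0xff ))
}

# Status byte: bits 0-7 of the SCSI result word.
status_byte() {
    echo $(( $1 & 0xff ))
}

# Map a few common Linux host-byte values to their DID_* names.
host_name() {
    case "$1" in
        0) echo DID_OK ;;
        1) echo DID_NO_CONNECT ;;
        2) echo DID_BUS_BUSY ;;
        3) echo DID_TIME_OUT ;;
        4) echo DID_BAD_TARGET ;;
        5) echo DID_ABORT ;;
        7) echo DID_ERROR ;;
        *) echo UNKNOWN ;;
    esac
}

RC=0x00020000   # return code from the dmesg output above
H=$(host_byte "$RC")
printf 'host=0x%02x (%s) status=0x%02x\n' "$H" "$(host_name "$H")" "$(status_byte "$RC")"
```

A host byte of DID_BUS_BUSY with a zero status byte would suggest the error is raised on the transport side rather than reported by the LUN itself, though that reading is only as good as the layout assumption.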
Also when you disable and re-enable iSCSI it retains the old settings from when you had it enabled. I find that you have to rename or remove the vmkiscsi files to get a 'clean state.' PITA if you ask me.
Have you followed the recommended practices for your kind of storage?
Especially for multipathing?
Andre
Yes, and at this point no multipathing is configured.
Is networking stable during these issues?
Try to keep a vmkping running against the target IP.
Andre
Looks like one of your LUNs is giving you problems...
The question is, how many LUNs do you have?
If it's more than one, then you might want to check your storage.
Also, I found I needed to clear out the static entries for the iSCSI controller and then rescan before it cleared things out.
Have you also done the basics...
you know, the esxcli swiscsi add command?
Are all your vmnics added to your swiscsi initiator?
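A minimal sketch of the binding step mentioned above, assuming ESX/ESXi 4.x, where the port-binding commands live under `esxcli swiscsi nic`. The adapter name `vmhba33` is hypothetical (check the real one on your host), and `vmk0` is the vmkernel interface name that appears in the `esxcfg-vmknic -l` output later in this thread. The script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
VMK=vmk0        # vmkernel port on the iSCSI port group
HBA=vmhba33     # hypothetical software iSCSI adapter name; verify on your host

# Build the command that binds a vmkernel port to the software iSCSI adapter.
bind_cmd() {
    echo "esxcli swiscsi nic add -n $1 -d $2"
}

# Build the command that lists the ports currently bound to the adapter.
list_cmd() {
    echo "esxcli swiscsi nic list -d $1"
}

# Dry run by default: print each command. Set RUN=1 on the host to execute.
for cmd in "$(bind_cmd "$VMK" "$HBA")" "$(list_cmd "$HBA")"; do
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
done
```

A rescan of the adapter is normally needed afterwards before new paths show up.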
It's fixed. Jumbo frames weren't enabled on the vSwitch that housed the iSCSI vmkernel port; once I enabled them, all was well again.
Thanks for the suggestions.
Switch Name    Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch4       32          3           32                9000    vmnic3

  PortGroup Name      VLAN ID  Used Ports  Uplinks
  iSCSI               0        1           vmnic3
[root@devesx93 ~]# esxcfg-vmknic -l
Interface  Port Group/DVPort  IP Family  IP Address    Netmask        Broadcast      MAC Address        MTU   TSO MSS  Enabled  Type
vmk0       iSCSI              IPv4       192.168.4.50  255.255.255.0  192.168.4.255  00:50:56:7d:0c:bd  9000  65535    true     STATIC
Easily overlooked if you ask me.
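For anyone who lands here with the same symptoms, a rough sketch of how the jumbo frame path could be checked end to end. It assumes the target is 192.168.4.2 (the address in the iscsid log) and that your vmkping build supports `-d` (don't fragment) and `-s` (payload size). The payload is the MTU minus 28 bytes of IPv4 and ICMP headers. The helper name `icmp_payload` is just for illustration.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented frame:
# MTU minus 20-byte IPv4 header minus 8-byte ICMP header.
icmp_payload() {
    echo $(( $1 - 28 ))
}

MTU=9000              # MTU shown for vSwitch4 and vmk0 above
TARGET=192.168.4.2    # iSCSI target address from the iscsid log
SIZE=$(icmp_payload "$MTU")

# Likely fix if the vSwitch still shows MTU 1500 (left commented out on purpose):
# esxcfg-vswitch -m 9000 vSwitch4

if command -v vmkping >/dev/null 2>&1; then
    esxcfg-vswitch -l                 # vSwitch MTU column should read 9000
    esxcfg-vmknic -l                  # vmkernel NIC MTU column should read 9000
    vmkping -d -s "$SIZE" "$TARGET"   # fails if any hop cannot pass the full frame
else
    # Not on an ESX host: just show the ping that would be run.
    echo "vmkping -d -s $SIZE $TARGET"
fi
```

If the full-size ping fails while a plain vmkping works, some hop between the vmkernel port and the SAN (vSwitch, physical switch, or target NIC) is probably still at a smaller MTU.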