VMware Cloud Community
max08
Contributor

iSCSI issues

I'm having iSCSI issues: connections time out when the host attempts to reach the SAN over 10 GbE.

See this in the vmkernel log:

Apr 18 10:43:19 devesx91 vmkernel: 0:00:27:21.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Logical device "naa.600144f02860c80000004dac395c0001": awaiting fast path state update...
Apr 18 10:43:55 devesx91 vmkernel: 0:00:27:57.192 cpu12:4278)ScsiDeviceIO: 1672: Command 0x28 to device "naa.600144f02860c80000004dac395c0001" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Apr 18 10:43:55 devesx91 vmkernel: 0:00:27:57.192 cpu12:4278)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "naa.600144f02860c80000004dac395c0001" is blocked. Not starting I/O from device.
Apr 18 10:43:55 devesx91 vmkernel: 0:00:27:57.195 cpu0:4096)VMNIX: VMKFS: 2132: timed out
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world restore device "naa.600144f02860c80000004dac395c0001" - no more commands to retry
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "naa.600144f02860c80000004dac395c0001" due to Not found
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.600144f02860c80000004dac395c0001": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Apr 18 10:43:56 devesx91 vmkernel: 0:00:27:58.182 cpu14:4305)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "naa.600144f02860c80000004dac395c0001" is blocked. Not starting I/O from device.
Apr 18 10:43:57 devesx91 vmkernel: 0:00:27:59.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.600144f02860c80000004dac395c0001" - issuing command 0x41027f9d2840
Apr 18 10:43:57 devesx91 vmkernel: 0:00:27:59.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.600144f02860c80000004dac395c0001" - failed to issue command due to Not found (APD), try again...
Apr 18 10:43:57 devesx91 vmkernel: 0:00:27:59.182 cpu14:4305)WARNING: NMP: nmpDeviceAttemptFailover: Logical device "naa.600144f02860c80000004dac395c0001": awaiting fast path state update...
Apr 18 10:43:58 devesx91 vmkernel: 0:00:27:59.659 cpu12:4278)ScsiDeviceIO: 1672: Command 0x28 to device "naa.600144f02860c80000004dac395c0001" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Apr 18 10:43:58 devesx91 vmkernel: 0:00:27:59.660 cpu0:4096)VMNIX: VMKFS: 2132: timed out

And from the vmkiscsidb:

2011-04-18-10:25:35: iscsid: Login Success: iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)
2011-04-18-10:25:35: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) is operational
2011-04-18-10:25:35: iscsid: DISCOVERY: Pending=0 Failed=0
2011-04-18-10:25:46: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:25:49: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) has recovered (2 attempts)
2011-04-18-10:27:07: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:27:10: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) has recovered (2 attempts)
2011-04-18-10:27:37: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:27:40: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) has recovered (2 attempts)
2011-04-18-10:28:06: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) Nop-out timeout after 10 sec in state (3).
2011-04-18-10:28:09: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) has recovered (2 attempts)
2011-04-18-10:28:35: iscsid: connection 1:0 (iqn.1986-03.com.sun:02:ace16ff3-2aae-c4cc-c09e-88a7bea3088e if=default addr=192.168.4.2:3260 (TPGT:2 ISID:0x1)  (T0 C0)) Nop-out timeout after 10 sec in state (3).

dmesg on iscsi vmnic as well:

[  695.933187] end_request: I/O error, dev sdc, sector 0
[  695.937820] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.937821] end_request: I/O error, dev sdc, sector 0
[  695.942416] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.942417] end_request: I/O error, dev sdc, sector 0
[  695.947074] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.947075] end_request: I/O error, dev sdc, sector 0
[  695.951703] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.951705] end_request: I/O error, dev sdc, sector 0
[  695.956309] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.956310] end_request: I/O error, dev sdc, sector 0
[  695.960931] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.960932] end_request: I/O error, dev sdc, sector 0
[  695.965590] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.965591] end_request: I/O error, dev sdc, sector 0
[  695.970195] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.970197] end_request: I/O error, dev sdc, sector 0
[  695.974825] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.974826] end_request: I/O error, dev sdc, sector 0
[  695.979996] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.979997] end_request: I/O error, dev sdc, sector 0
[  695.984593] sd 8:0:1:0: SCSI error: return code = 0x00020000
[  695.984595] end_request: I/O error, dev sdc, sector 0

Anyone have any ideas?

Also, when you disable and re-enable iSCSI, it retains the old settings from when it was last enabled. I find that you have to rename or remove the vmkiscsi files to get a clean state. PITA if you ask me.

5 Replies
AndreTheGiant
Immortal

Have you followed the recommended practices for your kind of storage?

Especially for multipath?

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
max08
Contributor

Yes, and at this point no multipathing is configured.

AndreTheGiant
Immortal

Is networking stable during these issues?

Try keeping a vmkping running against the target IP.

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
opbz
Hot Shot

Looks like one of your LUNs is giving you problems.

The question is, how many LUNs do you have?

If it's more than one, you might want to check your storage.

Also, I found I needed to clear out the static entries for the iSCSI controller and then rescan before the old entries cleared out.

Have you also done the basics?

You know, the esxcli swiscsi add command?

Are all your vmnics added to your swiscsi initiator?

max08
Contributor

It's fixed. Jumbo frames weren't enabled on the vSwitch that houses the iSCSI vmkernel port; once I enabled them, all is well again.

Thanks for the suggestions.
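
For anyone hitting the same thing, here is a rough sketch of how the MTU change might be applied from the service console. It is an assumption about the procedure, not a record of the exact commands run here: it assumes ESX 4.x, where the vmkernel NIC's MTU is set when the interface is created, so the interface is removed and re-added. The `run` helper and the `DRY_RUN` switch are made up for illustration so the script only prints the commands by default. Removing the vmkernel NIC drops any iSCSI sessions that use it, and the physical switch and the SAN also have to accept jumbo frames end to end.

```shell
#!/bin/sh
# Sketch only: prints the commands by default.
# Set DRY_RUN=0 on the ESX host to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# Values taken from the listings in this post; adjust for your host.
VSWITCH=vSwitch4
PORTGROUP=iSCSI
VMK_IP=192.168.4.50
VMK_MASK=255.255.255.0
MTU=9000

# Hypothetical helper: echo the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Raise the MTU on the vSwitch itself.
run esxcfg-vswitch -m "$MTU" "$VSWITCH"

# Remove the vmkernel NIC and re-add it with the larger MTU.
run esxcfg-vmknic -d "$PORTGROUP"
run esxcfg-vmknic -a -i "$VMK_IP" -n "$VMK_MASK" -m "$MTU" "$PORTGROUP"
```

Afterwards, esxcfg-vswitch -l and esxcfg-vmknic -l should both report MTU 9000: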

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks  
vSwitch4         32          3           32                9000    vmnic3   

  PortGroup Name        VLAN ID  Used Ports  Uplinks  
  iSCSI                 0        1           vmnic3   

[root@devesx93 ~]# esxcfg-vmknic -l
Interface  Port Group/DVPort   IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type               
vmk0       iSCSI               IPv4      192.168.4.50                            255.255.255.0   192.168.4.255   00:50:56:7d:0c:bd 9000    65535     true    STATIC

Easily overlooked if you ask me.
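
A quick way to confirm jumbo frames actually pass end to end is a vmkping with a full-size payload and fragmentation disallowed. The sketch below only works out the payload size and prints the command to run on the host. It assumes a 20-byte IPv4 header with no options, an 8-byte ICMP header, the target portal address from the iscsid log above, and a vmkping build that supports -d (don't fragment).

```shell
#!/bin/sh
# Largest ICMP payload that fits in a single frame at the configured MTU.
MTU=9000
IP_HEADER=20      # IPv4 header without options (assumption)
ICMP_HEADER=8
PAYLOAD=$((MTU - IP_HEADER - ICMP_HEADER))

# iSCSI target portal address as seen in the iscsid log.
TARGET=192.168.4.2

# Print the command to run on the ESX host:
# -d sets the don't-fragment bit, -s sets the payload size.
echo "vmkping -d -s $PAYLOAD $TARGET"
```

If that ping fails while a default-size vmkping succeeds, some hop in the path (vSwitch, vmkernel port, physical switch, or the SAN interface) is probably still at a smaller MTU.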
