VMware Cloud Community
kreator
Contributor

Having dead path to iSCSI target

What I had:

host 1: ESX3.5i

host 2: Solaris 10 (SunOS sun 5.10 Generic_138889-02 i86pc i386 i86pc)

2 ZFS shares on it - 1 iSCSI share and 1 NFS share.

What I did:

Upgraded to ESX4i

What I have now:

Everything works except iSCSI.

ESX reports a 'dead path' to the target, although vmkping to the Sun host succeeds.

tail of /var/log/messages here:

Jun 20 16:24:07 vmkernel: 0:01:22:15.659 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_TransportConnSendPdu: vmhba37:CH:0 T:0 CN:0: Failed to queue passthru request: No connection

Jun 20 16:24:07 iscsid: send_pdu failed rc-22

Jun 20 16:24:07 vmkernel: 0:01:22:15.659 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba37:CH:0 T:0 CN:0: iSCSIconnection is being marked "OFFLINE"

Jun 20 16:24:07 iscsid: Kernel reported iSCSI connection 1:0 error (1006) state (3)

Jun 20 16:24:07 vmkernel: 0:01:22:15.659 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess

Jun 20 16:24:07 vmkernel: 0:01:22:15.659 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba37:CH:0 T:0 CN:0: iSCSI connection is being marked "ONLINE"

Jun 20 16:24:10 iscsid: connection1:0 is operational after recovery (2 attempts)

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: vmhba37:CH:0 T:0 CN:0: SCSI response underflow residual invalid: residual 0, expectedXferLen 0

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: Sess

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: Conn

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: vmhba37:CH:0 T:0 CN:0: Connection rx notifying failure: Residual Invalid. State=Online

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Sess

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Conn

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba37:CH:0 T:0 CN:0: Processing CLEANUP event

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess

Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn

Jun 20 16:24:10 vmkernel: 0:01:22:18.789 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_TransportConnSendPdu: vmhba37:CH:0 T:0 CN:0:Failed to queue passthru request: No connection

Jun 20 16:24:10 iscsid: send_pdu failed rc-22

Jun 20 16:24:10 vmkernel: 0:01:22:18.789 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba37:CH:0 T:0 CN:0: iSCSIconnection is being marked "OFFLINE"

Jun 20 16:24:10 iscsid: Kernel reported iSCSI connection 1:0 error (1006) state (3)

Jun 20 16:24:10 vmkernel: 0:01:22:18.789 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess

Jun 20 16:24:10 vmkernel: 0:01:22:18.789 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn

and so on.

What did I do wrong, and how can I correct this?
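
One way to triage a log tail like the one above is to pull out the ONLINE/OFFLINE transitions and the first iscsi_vmk message logged after the connection comes back ONLINE, since that message usually hints at why the session drops again. The sketch below is only an illustration in Python: the function names and regular expressions are made up for this example and are matched against the line shapes pasted above, not against any documented vmkernel log format.

```python
import re

# Matches the vmkernel lines that mark a connection ONLINE or OFFLINE.
# The optional space tolerates the 'iSCSIconnection' spelling in the pasted log.
STATE_RE = re.compile(
    r'(?P<adapter>vmhba\d+):CH:\d+ T:\d+ CN:\d+: '
    r'iSCSI ?connection is being marked "(?P<state>ONLINE|OFFLINE)"'
)
# Captures the reporting function and the message text of an iscsi_vmk line.
FUNC_RE = re.compile(r'iscsi_vmk: (?P<func>\w+): (?P<msg>.*)')


def connection_states(lines):
    """Return (adapter, state) tuples in the order they appear in the log."""
    states = []
    for line in lines:
        m = STATE_RE.search(line)
        if m:
            states.append((m.group("adapter"), m.group("state")))
    return states


def message_after_online(lines):
    """Return the first iscsi_vmk message after an ONLINE event that was not
    logged by iscsivmk_StartConnection, or None if there is none."""
    online_seen = False
    for line in lines:
        m = STATE_RE.search(line)
        if m and m.group("state") == "ONLINE":
            online_seen = True
            continue
        if online_seen:
            f = FUNC_RE.search(line)
            if f and f.group("func") != "iscsivmk_StartConnection":
                return f.group("msg")
    return None


# A few lines copied from the log tail above.
SAMPLE = [
    'Jun 20 16:24:07 vmkernel: 0:01:22:15.659 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba37:CH:0 T:0 CN:0: iSCSIconnection is being marked "OFFLINE"',
    'Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba37:CH:0 T:0 CN:0: iSCSI connection is being marked "ONLINE"',
    'Jun 20 16:24:10 iscsid: connection1:0 is operational after recovery (2 attempts)',
    'Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess',
    'Jun 20 16:24:10 vmkernel: 0:01:22:18.538 cpu1:11899)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: vmhba37:CH:0 T:0 CN:0: SCSI response underflow residual invalid: residual 0, expectedXferLen 0',
]

print(connection_states(SAMPLE))
print(message_after_online(SAMPLE))
```

On the sample lines this singles out the "SCSI response underflow residual invalid" warning as the first thing that goes wrong once the connection is back up.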


kreator
Contributor

It looks like no one else has run into this.

At least I can say that ZFS NFS shares work fine with ESX4i.

OK, I'll wait for somebody with a configuration like mine.

DSTAVERT
Immortal

I would go through the knowledge base and see if there is anything related.

-- David -- VMware Communities Moderator

paithal
VMware Employee

Hmm, it is interesting that the target is reporting underflow even though no data transfer is required. The actual problem is that underflow is reported with residual = 0. I would like to see a trace to find out which SCSI CDB this is and what allocation length the CDB specifies. I assume this is a Solaris software target. Could you please capture a network trace and upload it?

kreator
Contributor

I assume this is a Solaris software target.

Yes. As I described above, this is Solaris 10u5.

Could you please capture a network trace and upload it?

Sure, no problem! But (sorry) how can I capture a network trace on ESX?

Should this be done on the ESX host, from the vCenter Server logs, or on the Solaris file server?

paithal
VMware Employee

With ESX4i, it can't be done from within ESX. With ESX 4.0, you would need to install additional software. It is probably easier to do on the Solaris server. Is tcpdump, Ethereal/Wireshark, or similar software available for Solaris?
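
For the Solaris side, one option is the bundled snoop utility, which can write a capture file that Wireshark opens. The Python sketch below only builds the command line; the interface name, output path, and initiator address are placeholders invented for this example (192.0.2.10 is a documentation address), and it assumes the target listens on the default iSCSI port 3260.

```python
import shlex
from typing import List

ISCSI_PORT = 3260  # default iSCSI target port; adjust if the target uses another


def snoop_command(interface: str, outfile: str, initiator_ip: str) -> List[str]:
    """Build a snoop invocation that saves iSCSI traffic to or from one initiator.

    -d selects the network device, -o writes packets to a capture file,
    and the trailing words form snoop's filter expression.
    """
    return [
        "snoop", "-d", interface, "-o", outfile,
        "host", initiator_ip, "and", "port", str(ISCSI_PORT),
    ]


# Placeholder values for illustration only.
cmd = snoop_command("e1000g0", "/tmp/iscsi.cap", "192.0.2.10")
print(shlex.join(cmd))
```

Run the printed command as root on the Solaris host while ESX retries the login, stop it with Ctrl-C after a few OFFLINE/ONLINE cycles, and upload the resulting file.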

paithal
VMware Employee

BTW, which Solaris 10 update (U??) are you using?

paithal
VMware Employee

Out of curiosity, I went ahead and tried the Solaris target. It appears there is an issue in the Solaris target before Solaris 10 update 7. I tried u6 and I see the issue, and it is clearly a target problem.

...

iSCSI (SCSI Command)

Opcode: SCSI Command (0x01)

.0.. .... = I: Queued delivery

Flags: 0x81

1... .... = F: Final PDU in sequence

.0.. .... = R: No data will be read from target

..0. .... = W: No data will be written to target

.... .001 = Attr: Simple (0x01)

TotalAHSLength: 0x00

DataSegmentLength: 0x00000000

LUN: 0000000000000000

InitiatorTaskTag: 0xad010000

ExpectedDataTransferLength: 0x00000000

CmdSN: 0x000001ab

ExpStatSN: 0x000001ad

SCSI CDB Test Unit Ready

Opcode: Test Unit Ready (0x00)

Vendor Unique = 0, NACA = 0, Link = 0

...

iSCSI (SCSI Response)

Opcode: SCSI Response (0x21)

Flags: 0x82

...0 .... = o: No overflow of read part of bi-directional command

.... 0... = u: No underflow of read part of bi-directional command

.... .0.. = O: No residual overflow occurred

.... ..1. = U: Residual underflow occurred <<<<<=================

Response: Command completed at target (0x00)

Status: Good (0x00)

TotalAHSLength: 0x00

DataSegmentLength: 0x00000000

InitiatorTaskTag: 0xad010000

StatSN: 0x000001ad

ExpCmdSN: 0x000001ac

MaxCmdSN: 0x000001ea

ExpDataSN: 0x00000000

BidiReadResidualCount: 0x00000000

ResidualCount: 0x00000000

Request in: 10

Time from request: 0.001020000 seconds

SCSI Response (Test Unit Ready)

...

14:02:12:42.569 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: vmhba35:CH:0 T:2 CN:0: SCSI response underflow residual invalid: residual 0, expectedXferLen 0

14:02:12:42.584 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: Sess

14:02:12:42.602 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_ConnSetupScsiResp: Conn

14:02:12:42.614 cpu2:7026)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: vmhba35:CH:0 T:2 CN:0: Connection rx notifying failure: Residual Invalid. State=Online

14:02:12:42.628 cpu2:7026)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Sess

14:02:12:42.644 cpu2:7026)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Conn

14:02:12:42.655 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba35:CH:0 T:2 CN:0: Processing CLEANUP event

14:02:12:42.666 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess

14:02:12:42.683 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn

14:02:12:42.731 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_TransportConnSendPdu: vmhba35:CH:0 T:2 CN:0: Failed to queue passthru request: No connection

14:02:12:42.732 cpu3:4119)vmw_psp_fixed: psp_fixedSelectPathToActivateInt: Device "Unregistered Device" has no paths to use (APD).

14:02:12:42.745 cpu2:7026)iscsi_vmk: iscsivmk_SessionHandleLoggedInState: vmhba35:CH:0 T:2 CN:-1: Session state transitioned from 'Logged In' to 'In Recovery'

14:02:12:42.756 cpu3:4119)vmw_psp_fixed: psp_fixedSelectPathToActivateInt: Device "Unregistered Device" has no paths to use (APD).

14:02:12:42.770 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba35:CH:0 T:2 CN:0: iSCSI connection is being marked "OFFLINE"

14:02:12:42.782 cpu3:4119)NMP: nmp_DeviceUpdatePathStates: The PSP did not select a path to activate for NMP device 'Unregistered Device'.

14:02:12:42.794 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess

14:02:12:42.823 cpu2:7026)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn

In the response, the 'U' (residual underflow) bit should not be set, since ExpectedDataTransferLength and ResidualCount are both 0. Solaris 10 update 7 seems to have resolved this issue.
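
To make the inconsistency concrete, here is a minimal Python sketch of the kind of sanity check the vmkernel warning suggests. It is not ESX source: the class, function, and the exact rule (underflow is only plausible when 0 < ResidualCount <= ExpectedDataTransferLength) are assumptions for illustration. The bit positions follow the SCSI Response flags byte as decoded in the trace above.

```python
from dataclasses import dataclass
from typing import Optional

UNDERFLOW_BIT = 0x02  # 'U' in the SCSI Response flags byte
OVERFLOW_BIT = 0x04   # 'O' in the SCSI Response flags byte


@dataclass
class ScsiResponse:
    flags: int           # flags byte, 0x82 in the trace
    residual_count: int  # ResidualCount field


def residual_problem(resp: ScsiResponse, expected_xfer_len: int) -> Optional[str]:
    """Return a description of the inconsistency, or None if the fields agree.

    Hypothetical check modelled on the log message, not the real initiator code.
    """
    underflow = bool(resp.flags & UNDERFLOW_BIT)
    overflow = bool(resp.flags & OVERFLOW_BIT)
    if underflow and overflow:
        return "U and O are both set"
    # Underflow means some expected bytes were not transferred, so the
    # residual must be positive and cannot exceed the expected length.
    if underflow and not 0 < resp.residual_count <= expected_xfer_len:
        return ("underflow residual invalid: residual %d, expectedXferLen %d"
                % (resp.residual_count, expected_xfer_len))
    return None


# Values from the Test Unit Ready exchange in the trace.
bad = ScsiResponse(flags=0x82, residual_count=0)
print(residual_problem(bad, expected_xfer_len=0))
```

With flags 0x80 (U cleared) the same response passes the check, which matches the behaviour expected from a fixed target.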

kreator
Contributor

Out of curiosity, I went ahead and tried the Solaris target. It appears there is an issue in the Solaris target before Solaris 10 update 7. I tried u6 and I see the issue, and it is clearly a target problem.

...

In the response, the 'U' bit should not be set. Solaris 10 update 7 seems to have resolved this issue.

Thanks a lot for taking part in resolving my problem. Indeed, my Solaris release is 10u5.

I just thought that since I had changed only ESX and not Solaris, it was an ESX issue, and you showed me where the problem really is!

As soon as I have a chance to upgrade SunOS to 10u7 on my file server, I'll do it. Maybe next weekend, because people are working on it :D

Thanks again! :)
