After extensive analysis of the ESX 3.0 iSCSI initiator and various iSCSI target products, I have developed a patch for the vmkernel iSCSI module.
If you are using a mainstream iSCSI target product and are encountering the following error message, you may wish to try this patch:
In a nutshell, here's the issue: ESX uses the code base from an older version of Cisco's iSCSI initiator for Linux (version 3.4.2). When a user goes through the "Add Storage" wizard in the VI Client, ESX goes ahead and partitions the iSCSI LUN, formats it for VMFS, and then attempts to apply a SCSI reservation on that LUN.
(The SCSI RESERVE and RELEASE commands are inherently part of any multi-initiator or cluster environment to arbitrate access to data on a shared LUN).
However, the older Cisco 3.4.2 iSCSI code arbitrarily groups the RESERVE command in with other SCSI commands like WRITE, FORMAT, COPY, etc. Because of this grouping, the RESERVE command actually gets sent to the iSCSI target with the iSCSI W-flag (Write flag) set. This implies that the initiator will be sending data (called a "data-out" phase) to the target. However, embedded in the RESERVE command packet, the initiator actually specifies that zero data bytes are to be written (called a zero data-out phase). This combination of the W-flag and expected data transfer of zero bytes confuses the iSCSI target, so the target responds with an error message (usually CHECK CONDITION).
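To make the mismatch concrete, here is a minimal sketch in C of the kind of sanity check a strict target might apply to an incoming SCSI Command PDU. The flag bit values (F = 0x80, R = 0x40, W = 0x20) are the ones the iSCSI specification defines for the command PDU. The struct and the function name `pdu_is_consistent` are my own invention for illustration and are not taken from any real target's source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits in byte 1 of an iSCSI SCSI Command PDU. */
#define PDU_FLAG_FINAL 0x80 /* F: last PDU of the command        */
#define PDU_FLAG_READ  0x40 /* R: initiator expects data-in      */
#define PDU_FLAG_WRITE 0x20 /* W: initiator will send data-out   */

/* Hypothetical, simplified view of the fields that matter here. */
struct scsi_cmd_pdu {
    uint8_t  flags;        /* F/R/W bits                          */
    uint8_t  opcode;       /* first byte of the SCSI CDB          */
    uint32_t expected_len; /* Expected Data Transfer Length       */
};

/*
 * Assumed target behaviour: a PDU that announces a data phase (R or W)
 * but declares zero bytes to transfer is treated as inconsistent.
 */
static bool pdu_is_consistent(const struct scsi_cmd_pdu *pdu)
{
    bool announces_data = (pdu->flags & (PDU_FLAG_READ | PDU_FLAG_WRITE)) != 0;

    if (announces_data && pdu->expected_len == 0)
        return false; /* e.g. RESERVE sent with W set and length 0 */
    return true;
}
```

With this check, a RESERVE (opcode 0x16) carrying the W-flag and a length of zero is rejected, while the same command with only the F-flag set passes. Not every target is this strict, which would explain why only some products show the problem.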
This error message then gets passed up the stack from iSCSI to VMFS3. VMFS3 recognizes that a SCSI reservation could not be applied to the iSCSI LUN, so the error dialog (above) is displayed.
Using the vmkernel module source code supplied in the ESX 3.0 open-source package, I was successful in identifying and correcting the behaviour of the RESERVE command by ensuring that the W-flag is not set upon transmission of the iSCSI packet.
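For anyone curious what the change amounts to, here is a rough sketch in C of the idea. This is not the actual code from the module: the function names and the exact list of opcodes are assumptions made for illustration. Only the SCSI opcode values and the W-flag bit (0x20) are standard.

```c
#include <stdint.h>

/* Standard SCSI opcodes. */
#define SCSI_OP_FORMAT_UNIT 0x04
#define SCSI_OP_WRITE_6     0x0A
#define SCSI_OP_RESERVE_6   0x16
#define SCSI_OP_RELEASE_6   0x17
#define SCSI_OP_COPY        0x18
#define SCSI_OP_WRITE_10    0x2A

/* W bit in the flags byte of an iSCSI SCSI Command PDU. */
#define PDU_FLAG_WRITE 0x20

/* Original behaviour (as described above): RESERVE is grouped with writes. */
static uint8_t cmd_flags_original(uint8_t opcode)
{
    switch (opcode) {
    case SCSI_OP_WRITE_6:
    case SCSI_OP_WRITE_10:
    case SCSI_OP_FORMAT_UNIT:
    case SCSI_OP_COPY:
    case SCSI_OP_RESERVE_6:      /* the problematic grouping */
        return PDU_FLAG_WRITE;
    default:
        return 0;
    }
}

/* Patched behaviour: RESERVE no longer causes the W-flag to be set. */
static uint8_t cmd_flags_patched(uint8_t opcode)
{
    switch (opcode) {
    case SCSI_OP_WRITE_6:
    case SCSI_OP_WRITE_10:
    case SCSI_OP_FORMAT_UNIT:
    case SCSI_OP_COPY:
        return PDU_FLAG_WRITE;
    default:                     /* RESERVE and RELEASE fall through here */
        return 0;
    }
}
```

The only difference between the two functions is the missing `SCSI_OP_RESERVE_6` case, so genuine write commands still go out with the W-flag as before.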
I have left debugging enabled in my patched iscsi_mod.o module (the debug output is written to /var/log/vmkernel). After further testing, I will remove the debugging and re-release the patch.
Since this is not an official VMware patch, I would suggest you only apply this patch for the purpose of verifying your iSCSI SAN's interoperability with ESX 3.0, and for internal testing purposes.
To apply the patch, rename the existing iscsi_mod.o file in the /usr/lib/vmware/vmkmod/ directory (e.g. to iscsi_mod.o.BAK), and copy my patched iscsi_mod.o in its place. Ensure that the file permissions are set correctly by issuing: chmod 444 iscsi_mod.o. Then reboot your ESX server and test connectivity to your IP SAN.
Please note that this patch may not work for all iSCSI target implementations for which the above dialog error message occurs. Even after applying this patch, I have discovered that some iSCSI target implementations simply do not respect the RESERVE and RELEASE commands. For example, Novell NetWare 6.5's iSCSI target does not support RESERVE / RELEASE at all.
You can download the patch from here
I will be following this thread... let me know how it works for you!
Regards, Paul.







