VMware Cloud Community
jonheese
Contributor

iSCSI treating new target LUNs as additional paths for existing target LUNs

Hi,

I have an ESXi 5.5.0 (build 1331820) host to which I'm trying to roll out some new iSCSI LUNs, but I've run into a problem that I'm hoping someone can help me resolve.

The iSCSI back-end server is running CentOS 6.5 with tgtd providing the iSCSI targets.  There are four iSCSI targets in production, with an additional two being created and used here.  When I re-scan the iSCSI environment after adding the two new targets, I still only see 4 disks in the Details pane of the iSCSI adapter under Configuration -> Storage Adapters.

Additionally, two of the LUNs stop working, and further inspection indicates that the new iSCSI targets are being considered as separate paths for the two broken LUNs.  That is:

Existing LUNs:

iqn.2014-06.storage0.jonheese.local:datastore0

iqn.2014-06.storage1.jonheese.local:datastore1

iqn.2014-06.storage2.jonheese.local:datastore2

iqn.2014-06.storage3.jonheese.local:datastore3

New LUNs:

iqn.2014-06.storage4.jonheese.local:datastore4

iqn.2014-06.storage5.jonheese.local:datastore5

From the "Manage Paths" dialog for each of the disks found, I see that "datastore4" shows up as an additional path for "datastore0" and "datastore5" shows up as an additional path for "datastore1", even though the target names are clearly different.

So can anyone tell me why the vSphere iSCSI client is treating distinct iSCSI target LUNs as being multiple paths for the same target?  I've compared the tgtd configuration between the original 4 LUNs and the 2 new LUNs, and everything looks correct.  They are all tied back to different back-end disks (not that vSphere should know/care about that) and I've written all zeros to the new LUN back-end disks to make sure that there's nothing weird on the disk confusing the iSCSI client.

I can post whatever logs/config/information needed.  Thanks in advance.

Regards,

Jon Heese

7 Replies
jonheese
Contributor

Incidentally, I am able to connect to all 6 LUNs and see them as 6 independent disks from both Windows and Linux iSCSI initiator software.  So I'm pretty sure this is a vSphere thing, not a problem with my iSCSI target config.  Thanks.

Regards,

Jon Heese

kumarlakshman_k
Enthusiast

Hello,

The names given above are target names; the LUN names should start with naa.

What is your path selection policy? You can find it by right-clicking the LUN and choosing Manage Paths.

Try doing the discovery of targets again.

Thanks & Regards,

Lakshman,VCP550

FritzBrause
Enthusiast

VI client: Host -> Configuration -> Storage Adapters -> vmhba33 (for iSCSI)

In the Details column, check "Devices" for NAA IDs.

Also, under Properties there, check that Dynamic and Static Discovery are configured correctly.

In the Web Client: Host -> Manage -> Storage -> Storage Adapters and Storage Devices.

jonheese
Contributor

Thanks for the suggestions, guys.

However, I've looked in both the VI client and the Web client, and I don't see any naa IDs anywhere.  I've confirmed that all columns are being displayed, and I see naa IDs showing up for the local storage devices under "Identifier", but the iSCSI identifiers start with t10.

iscsi-devices.png

Also, I only see the four original (working) disks, each with its own controller device apparently, in that view.  When I click "Manage Paths" on datastore0 and datastore1, that's the only place I see the two new LUNs (datastore4 and datastore5):

datastore0.png

datastore1.png

You can see from those screenshots that I have "Round Robin" selected for my path selection algorithm, but I've tried changing that to each of the other options and it doesn't change the behavior at all.  Path selection seems to be one step further down the chain from the actual issue: the new target shouldn't even be considered an additional path for this LUN, so how the paths are selected is a moot point.

Also, I have removed and re-discovered these targets upwards of 10-20 times by now (after doing things like restarting the iscsi target daemon and changing target names, etc.), but it ends up with the same result each time.

Thanks again for any suggestions.

Regards,

Jon Heese

jonheese
Contributor

Aha, I think I figured it out...

I noticed that the two new LUNs were the first two LUNs created on the second node of our storage cluster, and that they were appearing as multiple paths for the first two LUNs created on the first node of the storage cluster, in the same order.  That is, the first LUN of node 1 and the first LUN of node 2 were showing up as the same LUN, and the same goes for the second LUN of both nodes.

I discovered that the Identifier field in ESXi corresponds to the scsi_id of the LUN, and by default, the Linux tgtd daemon assigns the SCSI ids starting with 00010001 and counting up, 00020001, 00030001, etc.  So the first two LUNs on both nodes have the same SCSI ids, 00010001 and 00020001.  I believe that I can manually specify the SCSI ids to override this default behavior, and that will address this issue.
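
To make the collision concrete, here is a rough Python sketch of it. It assumes each node numbers its own targets from 1 and builds the default ID as a 4-digit target number followed by a 4-digit LUN number, which is my reading of the IDs I saw rather than anything checked against the tgtd source. The node1/node2 labels and both helper functions are made up for illustration.

```python
from collections import defaultdict


def default_scsi_id(tid: int, lun: int = 1) -> str:
    """Assumed default numbering: 4-digit target number + 4-digit LUN number."""
    return f"{tid:04d}{lun:04d}"


def find_collisions(nodes: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return every scsi_id that is claimed by more than one target."""
    seen = defaultdict(list)
    for node, targets in nodes.items():
        # Each node counts its own targets from 1, unaware of the other node.
        for tid, target in enumerate(targets, start=1):
            seen[default_scsi_id(tid)].append(f"{node}/{target}")
    return {sid: owners for sid, owners in seen.items() if len(owners) > 1}


# Hypothetical layout matching this thread: four targets on the first node,
# two new ones on the second node.
nodes = {
    "node1": ["datastore0", "datastore1", "datastore2", "datastore3"],
    "node2": ["datastore4", "datastore5"],
}

collisions = find_collisions(nodes)
for sid, owners in sorted(collisions.items()):
    print(sid, "->", ", ".join(owners))
```

Under those assumptions, datastore4 pairs up with datastore0 and datastore5 with datastore1, which is the same pairing the "Manage Paths" dialog showed.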

I'll try that change soon and post back with my results.  Thanks.

Regards,

Jon Heese

jonheese
Contributor

For anyone who happens upon this same situation, my guess above turned out to be correct.  But in addition to renumbering the scsi_id of each of the LUNs, I also had to renumber the controller_tid field so the controllers on the second node would show up with unique ids.  Here's an example of the config stanza I used for each LUN:

<target iqn.2014-06.storage5.jonheese.local:datastore5>
    # backing device for the storage LUN
    backing-store /dev/drbd5
    # SCSI identifier for the storage LUN
    scsi_id IET     00050001
    # SCSI identifier for the controller LUN
    controller_tid 5
    # iSCSI initiator IP address(es) allowed to connect
    initiator-address 192.168.24.0/24
</target>
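
If you have a lot of LUNs to renumber, something like the following Python sketch could generate the stanzas and refuse to emit duplicates. It is only an illustration: render_stanzas and its unique_id argument are names I made up, the datastore4 entry (including /dev/drbd4 and the number 4) is a hypothetical example, and reusing one number for both scsi_id and controller_tid simply mirrors the datastore5 stanza above.

```python
STANZA = """<target {iqn}>
    backing-store {backing_store}
    scsi_id IET     {unique_id:04d}0001
    controller_tid {unique_id}
    initiator-address {initiators}
</target>"""


def render_stanzas(targets, initiators="192.168.24.0/24"):
    """Render one stanza per (iqn, backing_store, unique_id) tuple.

    unique_id must be unique across ALL storage nodes, not just one node.
    """
    ids = [unique_id for _, _, unique_id in targets]
    if len(ids) != len(set(ids)):
        raise ValueError("unique_id values must not repeat across nodes")
    return "\n\n".join(
        STANZA.format(
            iqn=iqn,
            backing_store=backing_store,
            unique_id=unique_id,
            initiators=initiators,
        )
        for iqn, backing_store, unique_id in targets
    )


targets = [
    # Hypothetical entry, not taken from my real config.
    ("iqn.2014-06.storage4.jonheese.local:datastore4", "/dev/drbd4", 4),
    # Matches the stanza shown above.
    ("iqn.2014-06.storage5.jonheese.local:datastore5", "/dev/drbd5", 5),
]
print(render_stanzas(targets))
```

The uniqueness check is the whole point: it fails loudly if two targets would end up with the same identifier, instead of letting ESXi quietly merge them into one device.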

So it was a tgtd configuration issue after all!  Thanks everyone for asking the questions that led me to the correct answer!

Regards,

Jon Heese

techno10
Contributor

Just want to thank you, Jon. I stumbled upon this post, and this was exactly the right fix for me. Talk about frustration: I added a new LUN only to have it not detected, and on top of that, when I cleared the old ones to get the new one added, only the new one would show up (so it depended on which one I added first...).
