VMware Cloud Community
slebbon
Contributor

iSCSI "path IDs" differing between hosts OK?

Question 1: Are the Path ID values supposed to match on all ESX Hosts in a Datacenter?

Question 2: Are there any special steps to follow when removing an iSCSI target from VMware?

Details:

We have ESX 3.5 U2 running on 2 hosts, connected to an iSCSI SAN.

The hosts already had 4 VMFS Storage Volumes set up on the SAN. They were all added at the same time, in the same order, and had sequentially numbered "path IDs" of vmhba32:0:0:0, vmhba32:1:0:0, vmhba32:2:0:0 and vmhba32:3:0:0.

Recently we added a few more volumes to the SAN for testing purposes. I only did a 'rescan' of the iSCSI Targets on one of the two host machines. When done with testing, I removed the iSCSI Targets from the SAN. When I did a rescan on that host, the additional storage targets disappeared (as expected).

However, when I created a new Storage Volume on the SAN today and did the 'rescan' on the first host, it was assigned an ID of "vmhba32:9:0:0", so it 'skipped over' the IDs that had previously been created. That doesn't worry me too much (although I wonder if anything needs to be 'cleaned up' on the host?).

The issue is that after I configured and formatted the VMFS volume on that host, I went to the second host and did a 'rescan' of its iSCSI connections. This automatically found and added the iSCSI target AND the newly created VMFS storage volume, but it assigned it a "Path ID" of "vmhba32:6:0:0", which doesn't match the ID of "9" on the first host! Is this a problem?

I moved a test VM onto the storage and it seems to run OK on both hosts, but in Virtual Center the "Path" behaves oddly. When switching between the hosts' "Storage configuration" sections, the "Device" column for that new volume always shows the ID from the first host I accessed after opening Virtual Center. I.e., if I open Virtual Center and access the second host first, it always shows a device of "vmhba32:6:0:1" for both hosts, regardless of which host I look at. Conversely, if I access the first host first, it always shows "vmhba32:9:0:1" as the device ID no matter which host I look at afterwards.

Also, whichever host is showing the wrong ID in the Device column doesn't let me see the valid properties of the Storage Volume. (The Extents and Extent Device sections of the properties window are 'empty' on the second host machine to be accessed from Virtual Center.)

Thank you very much,

-Shawn Lebbon

Rotork IT

2 Replies
BUGCHK
Commander

Yes, they can be different, because the target ID cannot be defined by the storage array. The iSCSI software initiator does a persistent binding from the iSCSI target name to a SCSI ID. If you do things in different order on the servers, the numbers can be different. You can see (and if you like, modify) the mapping in the file:

/var/lib/iscsi/vmkbindings
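
To picture why the order of operations matters, here is a toy model in Python. It is not the initiator's actual code and says nothing about the real format of the bindings file; it simply assumes that IDs are handed out sequentially, that a binding is kept even after the target goes away, and it uses made-up target names.

```python
class BindingTable:
    """Toy model of a persistent 'target name -> target ID' table.

    Assumption (not taken from VMware documentation): IDs are handed out
    in discovery order and a binding is never dropped or reused.
    """

    def __init__(self):
        self.bindings = {}  # target name -> assigned target ID
        self.next_id = 0

    def discover(self, target_name):
        # A target that was seen before keeps its old ID.
        if target_name not in self.bindings:
            self.bindings[target_name] = self.next_id
            self.next_id += 1
        return self.bindings[target_name]


host1 = BindingTable()
host2 = BindingTable()

# Both hosts discover the production volumes in the same order.
for name in ["iqn.example:vol1", "iqn.example:vol2"]:
    host1.discover(name)
    host2.discover(name)

# Only host1 is rescanned while the temporary test targets exist.
for name in ["iqn.example:test1", "iqn.example:test2"]:
    host1.discover(name)

# A new volume is then discovered by both hosts.
new_id_host1 = host1.discover("iqn.example:vol3")
new_id_host2 = host2.discover("iqn.example:vol3")
print(new_id_host1, new_id_host2)
```

In this model the same target ends up with a different ID on each host purely because one host saw extra targets in between, which is the kind of mismatch described in the question.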

slebbon
Contributor

I opened a Support Ticket with VMware on this. Here are their replies:

In a vmhba canonical path, the first number is the HBA number, second is the enumerated target, the third is the LUN ID provided and the fourth is the partition number.

Generally, in a Fibre Channel network the target is the controller. If the SAN has two controllers or storage processors, the second number will be 0 and 1. These numbers are assigned by the ESX host itself. In the case of iSCSI, it depends on the SAN. Some SANs might present each LUN under a different target, which seems to be true in your case. ESX provides the enumeration of these targets and these numbers do not match anything on the SAN. They do not have to match the numbers on other hosts either.

The first number, which is 32 in your case, is also assigned by ESX. It might be 33 or 40 on other hosts, depending on whether devices such as a CD-ROM drive or USB key were detected. The HBA and target numbers are not important and do not have to be any specific number.

The third number is the LUN ID. This number is provided by the SAN and ESX will show it as is. In your case, this is all zero.

The fourth one is the partition number, which by default is 1 for the first partition.
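
As a small illustration of the four fields described above, here is a Python sketch that splits a path string of the form used in this thread into its parts. The function and class names are my own and purely illustrative; nothing here is a VMware API.

```python
import re
from typing import NamedTuple


class VmhbaPath(NamedTuple):
    hba: int        # assigned by the ESX host
    target: int     # enumerated by the ESX host
    lun: int        # LUN ID as presented by the SAN
    partition: int  # partition number on the LUN


# Matches strings such as "vmhba32:9:0:1".
_PATH_RE = re.compile(r"^vmhba(\d+):(\d+):(\d+):(\d+)$")


def parse_vmhba_path(path: str) -> VmhbaPath:
    """Split 'vmhbaH:T:L:P' into its four numeric fields."""
    match = _PATH_RE.match(path)
    if match is None:
        raise ValueError(f"not a four-part vmhba path: {path!r}")
    return VmhbaPath(*(int(field) for field in match.groups()))


first_host = parse_vmhba_path("vmhba32:9:0:1")
second_host = parse_vmhba_path("vmhba32:6:0:1")
print(first_host)
print(second_host)
```

For the two paths from this thread, only the target field differs (9 versus 6); the LUN ID supplied by the SAN is 0 on both hosts.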

It appears that you've created the LUNs serially, 1 through 4. ESX gave them the same numbers. The fifth LUN would be 5 and the sixth one would be 6. In fact, these numbers might change on reboot. I see no issue here and the setup looks fine. The first two numbers, the HBA and target, do not have to match anything else.

ESX takes the uniqueness of a LUN from its WWN, which must remain the same and must be the same for all hosts accessing it, even if they're outside the datacenter. If this WWN changes, access to the LUN is blocked and the storage does not show up under 'storage'. In most SANs, to make sure the LUNs show up with the same WWNs to all hosts, you would use a container (host group, storage group or sometimes called access group) to contain the HBA WWPNs of all hosts and present the LUN to this container. Since you can see the datastores fine on all hosts, I think they show up with the same WWNs.
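
The practical consequence is that when comparing two hosts, LUNs should be matched on their unique identifier rather than on the vmhba path. A minimal Python sketch of that idea follows; the inventories are hand-written dictionaries and the identifier strings are placeholders, not real WWNs or output from any VMware tool.

```python
def match_luns_by_id(host_a, host_b):
    """Pair up paths from two hosts that point at the same LUN.

    Each argument maps a vmhba path to the LUN's unique identifier.
    Returns {identifier: (path_on_a, path_on_b)} for shared LUNs.
    """
    path_on_b = {uid: path for path, uid in host_b.items()}
    return {
        uid: (path, path_on_b[uid])
        for path, uid in host_a.items()
        if uid in path_on_b
    }


# Hypothetical inventories; the identifiers are invented placeholders.
host1 = {"vmhba32:0:0:1": "id-aaa", "vmhba32:9:0:1": "id-eee"}
host2 = {"vmhba32:0:0:1": "id-aaa", "vmhba32:6:0:1": "id-eee"}

matches = match_luns_by_id(host1, host2)
for uid, (path1, path2) in sorted(matches.items()):
    note = "same path" if path1 == path2 else "paths differ"
    print(uid, path1, path2, note)
```

Matched this way, the volume that shows up as target 9 on one host and target 6 on the other is still recognised as the same LUN.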

As for removing LUNs, make sure there are no VMs or disks running on them, click on 'remove' under 'storage', remove the datastore and rescan on all other hosts. You can also simply unpresent the LUN from the SAN if nothing is running on it and rescan all hosts. There's nothing else to do to remove a LUN.

I think your setup is just fine here.

Best regards,

Ghazan Haider

VMware Support Engineer

Global Support Services

VMware Inc.

-


Shawn wrote:

---

I just noticed as well that in the storage configuration section, the 'free space' of the volume with 'different' device IDs is still being reported as 'empty' even though I've transferred a number of VMs and data to it. Refreshing the view doesn't seem to change this status. I am attaching a screenshot of this.

---

VMware Reply:

Virtual Center has a bit of a delay regarding storage updates.

Try right-clicking VMStorage3 and selecting Refresh. Waiting for a bit works too, although I'm not sure what the refresh frequency is.

-


Finally, I asked a few more follow-up questions:

1) I already removed the 'test' volumes from the SAN. The test disks were used in VMware as RDM disks. To remove them, I did the following:

a) Removed the RDM disk from the test VM, selecting the 'delete from disk' option.

b) Went to the SAN and removed access to the volume(s) from the IP addresses of the VMware machines.

c) Deleted the test volumes from the SAN.

I now notice some other behavior that I hadn't noticed when I created this ticket. The SAN error log records an error each time I rescan the iSCSI initiator of the VMware Host. The log says that the requested volume (with the ID and name of the test volumes I deleted) couldn't be found, and the requested connection could not be restored. This has persisted for each refresh, so it does seem like VMware hasn't yet 'forgotten' about those volumes, although I can't see any reference to them in the GUI (other than the skipping over of their former device IDs when adding additional LUNs).

A) I'm assuming you're running a rescan through the GUI. You have removed the volume properly, and you might see the host checking for that volume on the first rescan after you've removed it, but not subsequent ones.

If the host keeps trying to access a LUN that does not exist on every rescan, I'll have to look at the host's logs.

2) I was able to get the free space to update by right-clicking on the volume itself and choosing refresh. The 'refresh' at the top of the window didn't do it, and it had been at least 5 hours now since the space changed... But at least that's sorted out now. Thank you.

A) You're welcome.

3) There still seems to be a bug in Virtual Center, though, in displaying the volumes when the IDs differ. I still can't get 'properties' on a volume from the Host Storage Configuration section on the 'second' host I try. The first host I click on in a 'session' of the VC client is always the only one I can view properties of. This is very repeatable (for me, with VMStorage3). The other 4 VM volumes work just fine on both hosts.

A) This makes me wonder if both hosts were rescanned when the LUN was removed. The properties of a storage volume would not show up when the storage subsystem is busy with a task.

Please upload logs from the host where you removed the LUN (and it still queries for the LUN on the SAN) and from this host.

-


In response to #3, we ended up opening another support ticket with the Virtual Center people.

It turns out the problem with Virtual Center not displaying the properties when the hosts' IDs differ IS a bug in the Virtual Center Client...
