VMware Cloud Community
TiJa
Enthusiast

Matching VMFS volumes between an ESX 3.5 host and a VCB proxy

Hi all,

I am trying to find a way to match the SAN LUNs that have been presented to a VCB proxy server in our VI 3.5 environment with the LUN IDs/VMFS IDs I can see in the service console of our ESX 3.5U2 hosts. However, I see some discrepancies that I cannot explain with my (limited) knowledge of VMFS.

On the ESX 3.5 hosts, I log on to the service console and issue the following commands to obtain the VMFS IDs:

ls -al /vmfs/volumes
esxcfg-vmhbadevs -m

The results are (obviously) consistent, and the IDs I obtain are the following:

(A) vmhba1:4:1:1 /dev/sdc1 483cf914-29b60dc5-dbfd-001cc497e630
(B) vmhba1:0:2:1 /dev/sdb1 48858dc4-f4e218d1-d3a8-001cc497e630

I added the letters for later reference. The next step is to go to our VCB proxy server, which runs Windows 2003 Enterprise Edition. This is a physical machine that has the same LUNs, plus some additional LUNs from other ESX servers, presented to it and has the latest VCB 1.5 installed. Here, I use the vcbSanDbg.exe command to discover the VMFS volumes that are presented to that host (the exact command is vcbSanDbg.exe | findstr "ID" ). The output is the following:

(1) ID: LVID:48761b97-dacedf9f-ebb9-0017085d0f91/48761b97-a4f562bd-6875-0017085d0f91/1 Name: 48761b97-a4f562bd-6875-0017085d
(2) ID: LVID:48761bc6-7b4afa63-97d9-0017085d0f91/48761bc5-3f508baa-2f5d-0017085d0f91/1 Name: 48761bc5-3f508baa-2f5d-0017085d
(3) ID: LVID:483cf913-458f9fa5-a749-001cc497e630/483cf913-05b4f526-45b5-001cc497e630/1 Name: 483cf913-05b4f526-45b5-001cc497
(4) ID: LVID:479da7b6-877867e9-dd06-001cc497e630/479da7ac-55fe7dfe-378c-001cc497e630/1 Name: 479da7ac-55fe7dfe-378c-001cc497
(5) ID: LVID:477c2b4a-969e01e0-8d49-001cc495fb46/477c2b4a-7db36616-30ea-001cc495fb46/1 Name: 477c2b4a-7db36616-30ea-001cc495
(6) ID: LVID:48843bec-28cc17a4-ca9e-001cc495fb46/48843bec-154cf784-871a-001cc495fb46/1 Name: 48843bec-154cf784-871a-001cc495

The problem is linking the two outputs; as you can see, the VMFS IDs are not entirely the same, but there is some resemblance:

(A) vmhba1:4:1:1 483cf914-29b60dc5-dbfd-001cc497e630 is more or less (3) LVID:xyz / 483cf913-05b4f526-45b5-001cc497e630 / 1

(B) vmhba1:4:2:1 479da7c1-4494cd90-d327-001cc497e630 is more or less (4) LVID: xyz / 479da7ac-55fe7dfe-378c-001cc497e630 / 1
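
The "more or less" comparison above can be done mechanically. This is only a rough Python sketch of that eyeball heuristic, not a description of how VCB itself identifies volumes: it assumes the second UUID in each LVID line is the one to compare, keeps entries whose trailing field matches the ESX-side ID, and picks the one whose leading hex field is numerically closest. The names `parse_lvid` and `closest_lvid` are made up for illustration.

```python
import re

# Matches the "LVID:<uuid>/<uuid>/<n>" field printed by vcbSanDbg.exe.
LVID_RE = re.compile(r"LVID:([0-9a-f-]+)/([0-9a-f-]+)/(\d+)")

def parse_lvid(line):
    """Return the two UUIDs of an LVID line, or None if the line has no LVID."""
    m = LVID_RE.search(line)
    return None if m is None else (m.group(1), m.group(2))

def closest_lvid(esx_uuid, lines):
    """Heuristic: same trailing field, then nearest leading hex field."""
    head, tail = esx_uuid.split("-")[0], esx_uuid.split("-")[-1]
    best = None
    for line in lines:
        parsed = parse_lvid(line)
        if parsed is None or not parsed[1].endswith(tail):
            continue
        distance = abs(int(parsed[1].split("-")[0], 16) - int(head, 16))
        if best is None or distance < best[0]:
            best = (distance, parsed[1])
    return None if best is None else best[1]

# Lines (3), (4) and (5) from the vcbSanDbg.exe output above.
vcb_lines = [
    "ID: LVID:483cf913-458f9fa5-a749-001cc497e630/483cf913-05b4f526-45b5-001cc497e630/1",
    "ID: LVID:479da7b6-877867e9-dd06-001cc497e630/479da7ac-55fe7dfe-378c-001cc497e630/1",
    "ID: LVID:477c2b4a-969e01e0-8d49-001cc495fb46/477c2b4a-7db36616-30ea-001cc495fb46/1",
]
print(closest_lvid("483cf914-29b60dc5-dbfd-001cc497e630", vcb_lines))
```

With the sample lines this prints 483cf913-05b4f526-45b5-001cc497e630, i.e. entry (3). A near match like this is only a hint; it does not prove the two IDs refer to the same volume.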

Is this just pure coincidence? Am I missing something, and in that case, is there some other way of retrieving the VMFS IDs of LUNs presented to a Windows host?

Thanks for any help!

Tim

Accepted Solutions
snapper
Enthusiast

I could be mistaken, but this is the process I use, and it has worked reliably between our VCB host and ESX server.

Step 1. Find the correct LUN mappings on ESX hosts by running the following command:

esxcfg-mpath -lv | grep ^Disk | awk '{print $3,$5,$2}' | while read line; do echo $; echo $; done

This will display a list of UIDs and corresponding devices (from which you can read the LUN; i.e. vmhba1:0:5 is LUN 5).
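
Reading the LUN out of a device path can also be scripted. A minimal Python sketch, assuming the ESX 3.x naming seen in this thread (vmhba<adapter>:<target>:<LUN>, optionally followed by :<partition>); the function name `parse_vmhba` is hypothetical:

```python
def parse_vmhba(path):
    """Split an ESX 3.x path such as 'vmhba1:0:5' or 'vmhba1:4:1:1'."""
    parts = path.strip().split(":")
    if len(parts) not in (3, 4) or not parts[0].startswith("vmhba"):
        raise ValueError("not a vmhba path: %r" % path)
    return {
        "adapter": parts[0],
        "target": int(parts[1]),
        "lun": int(parts[2]),
        # The fourth field, when present, is the partition number.
        "partition": int(parts[3]) if len(parts) == 4 else None,
    }

print(parse_vmhba("vmhba1:0:5")["lun"])
```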

The important part is the 32-digit number. Use this to locate the corresponding device in the output of vcbsandbg.exe

i.e. vcbsandbg.exe > c:\LUNs.log

Open LUNs.log and search for the above 32-digit number. If the LUN is present, it should appear in the form of:

Found SCSI Device: NAA:<number>

The following line should display the number of paths it is present on. The line after that displays the CTL, from which you can read the LUN.

You now have a way of matching LUNs and disk presentations between ESX and VCB. This will include RDMs as well.
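
The search through LUNs.log can be automated once you have the 32-digit UIDs from the ESX side. A minimal Python sketch, assuming only the "Found SCSI Device: NAA:<number>" line format described above; the function name `find_luns` and the sample log text (including the trailing digits after the UID) are made up for illustration:

```python
def find_luns(esx_uids, vcb_log_text):
    """Map each ESX UID to the NAA lines of the vcbSanDbg log that contain it."""
    naa_lines = [ln.strip() for ln in vcb_log_text.splitlines() if "NAA:" in ln]
    result = {}
    for uid in esx_uids:
        uid = uid.strip().lower()
        # Substring match, so extra digits after the UID do not break the lookup.
        result[uid] = [ln for ln in naa_lines if uid in ln.lower()]
    return result

# Hypothetical log excerpt; real output contains more lines per device.
sample_log = """\
Found SCSI Device: NAA:60560768023e01e8f80000000000002d000000000000
Found SCSI Device: NAA:60560768023e01e8f80000000000002e000000000000
"""

hits = find_luns(["60560768023e01e8f80000000000002d"], sample_log)
for uid, lines in hits.items():
    print(uid, "->", lines if lines else "not presented to this host")
```

An empty list for a UID means that LUN does not appear in the log, which would suggest it is not presented to the VCB proxy.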

Hope this helps resolve the issue.

Cheers,

SP

Don't forget to award points where appropriate 🙂

12 Replies
kjb007
Immortal

Are you sure you've presented to VCB the same LUN IDs as the ESX host sees?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
TiJa
Enthusiast

Yes, I explicitly asked our SAN team to verify it, as I do not have access to do that myself. I can take VCB backups of machines located on the datastores I mentioned, so I am pretty sure the LUN presentation is OK.

I must emphasize that there is no problem and that everything is working correctly. I just want to find out how VCB on Windows detects the correct VMFS volume to back up from (FYI, it is not based on any SAN/WWPN-based detection, because apparently our SAN team presents the same LUN to the ESX host and our VCB proxy using different WWPNs). From vcbSanDbg.exe I cannot find anything more than a few digits (no full UUIDs) leading back to the VMFS IDs I see on the ESX hosts.

kjb007
Immortal

I have seen these configs fail when the LUN IDs are not the same, so I would venture that this is the first check. If this is correct, then the process continues.

Not much more help to you, but the LUN IDs matching is the very start of the entire process.

-KjB

TiJa
Enthusiast

Any hints on how I can check whether the LUN IDs are the same? Because that is precisely the problem I am trying to solve: identifying from the OS perspective which VMFS partitions are presented :). Our VCB host has over 10 LUNs presented, of which I need to find the 2 that belong to a specific ESX host.

seniord
Enthusiast

Try running the VCB Diagnostic Tool. You can also see the LUN IDs in Windows Disk Management.

Message was edited by: seniord

Sorry, I read your message too quickly and only now noticed you have already used this tool.

TiJa
Enthusiast

I noticed the VCB Diagnostic Tool that you mentioned is actually an older version of the vcbSanDbg.exe tool that I have already been using. The problem that started this thread is in fact that the VMFS IDs shown in its output are not the same as the VMFS IDs I see on the ESX host. Yet vcbMounter.exe can successfully perform the backup. I interpret this as having one of the two following meanings:

  • either the vcbSanDbg.exe tool does not output the "real" VMFS ID but rather a partition or LUN ID that VMFS also assigns.

  • or, vcbMounter.exe uses another mechanism to discover the partition it should access to perform a virtual machine backup; in that case, I am interested in knowing how the partition/volume/lun identification is performed on the Windows side.

This will allow me to identify the other 7 LUNs that I am seeing on my VCB proxy and to discover which ESX host they belong to. That seems to me like a common problem in a big environment with many hosts and many different teams working on the virtual infrastructure, no?

Thanks for any feedback!

Tim

Steven_Rodenbur
Enthusiast

I'd say the cause of the discrepancies is that the ESX servers and the VCB Windows server have a completely different take on things as soon as you get higher than the fundamental building blocks of an FC or iSCSI SAN.

They "happen" to be able to see the same LUNs because, for example, the SAN administrator has set the FC zoning to enable that (which, of course, is the correct way).

However, the Windows VCB box and the ESX boxes see, fundamentally speaking, only the same WWNs (or in iSCSI land, the IQNs).

That is all they have in common. How the ESX and Windows boxes present those LUNs to their user interfaces is a whole different matter.

So all you have to go on is WWNs or IQNs as long as the VCB box does not ask anyone in the ESX/VI world what VMFS volume name an ESX box uses (the long names under /vmfs/volumes ). The VCB box would need to know how the ESX boxes translate LUNs from the lowest levels to the name you see in the Service Console under /vmfs/volumes.

For that, the VCB box should ask that question of VC, which in turn asks that question of ESX, and so on.

dconvery
Champion

I believe that part of the "LVID" in Windows is based on the switch port WWN / switch node WWN, depending on the zoning method.

Dave

Dave Convery, VCDX-DCV #20 ** http://www.tech-tap.com ** http://twitter.com/dconvery ** "Careful. We don't want to learn from this." -Bill Watterson, "Calvin and Hobbes"
TiJa
Enthusiast

Steven Rodenburg wrote:

So all you have to go on is WWNs or IQNs as long as the VCB box does not ask anyone in the ESX/VI world what VMFS volume name an ESX box uses (the long names under /vmfs/volumes ). The VCB box would need to know how the ESX boxes translate LUNs from the lowest levels to the name you see in the Service Console under /vmfs/volumes.

I am not sure this is correct: the VMFS ID is really a unique identifier that is created and stored on the partition that contains the filesystem. The long alphanumeric code represents this ID and each ESX host (or VMFS-capable interpreter) recognizes it as the same.

Initially, I thought it was sufficient to match the WWPNs of the LUNs. Apparently, our SAN team has presented the same LUN using different WWPNs to the ESX and VCB box. That seems to rule out that the WWPN is the (only) identifier used to recognize a partition, which leads us to the VMFS ID again. If you run vcbSanDbg.exe, you can also see that the first 100k of each LUN is read and, when a VMFS volume is found, the corresponding identifiers are displayed on screen -- again strengthening my suspicion that the VMFS ID is in fact what we are looking for.

I totally agree with your statement that a Virtual Center or ESX host itself must be contacted in order to retrieve the VMFS ID (long name) of the storage that hosts the VM.

dconvery wrote:

I believe that part of the "LVID" in Windows is based on the switch port WWN / switch node WWN, depending on the zoning method.

That would explain the large similarity and yet slight differences in the LVIDs that I am seeing: the first bytes of the WWPN are the SAN manufacturer/type ID and only the last digits indicate the WWN port... They are only different because in my case the LUNs are presented under different WWPNs (and of course because storage from three different SAN manufacturers is presented to our ESX hosts, which explains some of the "larger" differences between the LVIDs).

So unless someone can say with certainty that the VMFS IDs are indeed not used for unique identification... the question remains open why they are different on the VCB box and ESX host (or perhaps vcbSanDbg.exe doesn't show the VMFS ID??). Any further suggestions and comments are highly appreciated!

TiJa
Enthusiast

snapper,

Thanks, exactly what I was looking for! For completeness' sake, I'll add that in the vcbSanDbg.exe output, the UIDs that you get from the ESX host are followed by 12 other digits that (in my case) I ignored.

Edit: I have shortened the command syntax to:

esxcfg-mpath -lv | grep ^Disk | grep -v vmhba0 | awk '{print $3,$5,$2}' | cut -b15-

This also leaves out local storage and is a bit easier on the output :). Furthermore, the 12 "extra digits" that I was talking about that you see on the vcbSanDbg.exe output are also no longer stripped in this command syntax.

snapper
Enthusiast

Cool - glad it helped. I actually use a slightly lengthier process, one that links all the important details together,

i.e. LUN, controller, NAA, size, free space, and the VMFS friendly name or RDM.

I use a modified version of the ESX health check to generate this on a nightly cron and FTP the HTML output to an internal website.

This way external people (i.e. the SAN guys) can see important LUN information, verify their presentations (using the NAA / vdisk UID) etc. and see our VMFS friendly name to link it all together. Oh, and for checking the VCB LUNs as well.

Script is now attached. Simply make it executable by chmod +x <filename> and execute ./getluns.sh

Should return something similar to:

LUN,VMFS_name,Device,HBA,Target,DiskID,Size,Used,Free,Used%

5,DDC1_PAT_HDS_05,/dev/sdf,vmhba1,0,60560768023e01e8f80000000000002d,449G,362G,87G,80%
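
Since the output is plain CSV, it is easy to load and index by disk ID, for instance to look up the VMFS friendly name for an NAA found on the VCB proxy. A minimal Python sketch, assuming the header and row format shown above; the function name `load_luns` is hypothetical:

```python
import csv
import io

# Header and row copied from the sample output above.
sample = """\
LUN,VMFS_name,Device,HBA,Target,DiskID,Size,Used,Free,Used%
5,DDC1_PAT_HDS_05,/dev/sdf,vmhba1,0,60560768023e01e8f80000000000002d,449G,362G,87G,80%
"""

def load_luns(text):
    """Index getluns.sh-style CSV rows by their DiskID column."""
    return {row["DiskID"]: row for row in csv.DictReader(io.StringIO(text))}

luns = load_luns(sample)
row = luns["60560768023e01e8f80000000000002d"]
print(row["VMFS_name"], "is LUN", row["LUN"], "on", row["HBA"])
```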

PS: The shortened version you posted above can be modified to include the full disk ID:

esxcfg-mpath -lv | grep ^Disk | awk '{print $3,$5,$2}' | cut -c15-46,59-

Cheers,

SP
