brucecmc
Contributor

Datastores not showing and/or displaying as a "SNAP"

Hi folks,

I'm running ESX 3.0.2 on two hosts and ESX 3.5 on one host (upgraded). ESX1, ESX2, and ESX3 are in a three-host HA cluster managed by VirtualCenter 2.5.

I'm not sure what happened, but the ESX3 host sees the datastores normally, while ESX1 and ESX2 only see them as "snap" datastores. We didn't create any snapshots, and no changes were made to the back-end storage.

When I use the Datastore view in VC, it shows an association between ESX3 and the datastores (no snaps). However, ESX1 and ESX2 have no relationship with the original datastores and can ONLY see the snaps.

When I browse the "snap-xxxxxx-datastore" volumes, there are NO vmdk files in them.

I'm trying to figure out how to re-establish ESX1 and ESX2's visibility of the original datastores and remove their association with the snap datastores; that should correct the problem. I just don't know how to do it, and I'm trying to figure out the best way to recover.

thanks in advance.

Bruce

9 Replies
mike_laspina
Champion

Hi,

This happens when a VMFS volume appears on a different vmhba LUN ID with the same UUID.

Have a look at presentation issue #2 on the blog below, as it explains this in detail.
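
As a quick check, here is a minimal service-console sketch (assuming ESX 3.x; run it on each host and compare the output):

esxcfg-mpath -l        # lists each path with its vmhba:target:LUN ID
esxcfg-vmhbadevs -m    # maps vmhba device names to VMFS volume UUIDs

If the same VMFS UUID shows up behind different LUN IDs on different hosts, that is what triggers the snapshot detection.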

http://blog.laspina.ca/ vExpert 2009
brucecmc
Contributor

Thanks for the post, Mike. But the storage I am using is a Symmetrix 8830 array. The mapping of the FA to the storage allows a single LUN number per FA (Fibre Adapter), meaning I can only use LUN 001 on FA 3aB one time (up to 256 LUN numbers).

Secondly, I only have a single HBA connected at present. So, for example, on FA 3aB the mapping would allow Symmetrix device 1c0 to be seen down that FA as LUN 001. So I don't think it is an issue with the LUN numbering.

However, I do have three LUNs, all mapped to the same FA 3aB; of course, one Symmetrix device is LUN 001, the next LUN 002, and then LUN 003. This has been working for almost a year.

mike_laspina
Champion

It does not matter what storage you are using. When the ESX server sees a previously configured UUID-to-LUN-ID assignment on a LUN ID other than the one it originally had configured, it will treat that LUN as a snapshot.

Have a look at the /etc/vmware/esx.conf file; it holds the mappings of UUIDs to LUN IDs.
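
A minimal sketch for pulling those entries on each host for comparison (assuming the ESX 3.x service console):

grep /storage/ /etc/vmware/esx.conf    # dump the stored storage mappings

Diff the output between hosts to spot mismatches.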

http://blog.laspina.ca/ vExpert 2009
brucecmc
Contributor

Thanks for the pointers, Mike. Yes, I can see that the LUNs on all three ESX hosts are not the same. But how do you correct this? Do you merely vi the esx.conf and change the LUN to the one that works?

ESX1

/storage/diagPart/lun = "vml.02000000006001c230c36795000e75a72773ec7668504552432035"

/storage/diagPart/partition = "5"

/system/uuid = "483ef231-bfb3-31c3-20fe-001c23bb23fc"

ESX2

/storage/diagPart/lun = "vml.02000000006001c230c36928000e7585f27137450e504552432035"

/storage/diagPart/partition = "10"

/system/uuid = "470bc04c-9128-65c6-bcfb-001c23bb23ed"

ESX3

/storage/diagPart/lun = "vml.02000000006001c230c36924000e75aa53739ec4a2504552432035"

/storage/diagPart/partition = "10"

/system/uuid = "470bc063-166b-7765-c206-001c23bb23e3"

mike_laspina
Champion
Accepted Solution

The items you posted are not the correct elements to go by; I apologize, I gave you the wrong location to compare. The location you were looking in is only reliable when things are working correctly.

The correct location for checking what is currently detected on a storage scan is:

/vmfs/devices/disks

Here is an example of a working iSCSI store. This is LUN 1 of some shared storage, and it should appear as the same LUN across all hosts:

vml.010000000020202020534f4c415249

vml.010000000020202020534f4c415249:1

You need to make sure that the LUN vml's are currently seen on the same LUN IDs across all hosts. If not, you must try to correct it on the storage-server side. If it is not possible to correct it there, then you would have to allow a volume resignature and then rescan all stores on all hosts. This is not good, because you will have to re-register every VM if that's the corrective action.
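
A minimal service-console sketch of that last-resort path (assuming ESX 3.x; vmhba1 is a placeholder for your actual adapter):

ls /vmfs/devices/disks                       # compare the vml names and their LUN IDs on each host
esxcfg-advcfg -s 1 /LVM/EnableResignature    # allow the VMFS volumes to be resignatured
esxcfg-rescan vmhba1                         # rescan; the volumes reappear with new signatures and snap-style labels
esxcfg-advcfg -s 0 /LVM/EnableResignature    # turn the option back off when done

Because the resignatured volumes get new UUIDs, every VM on them has to be removed from and re-added to the inventory, which is why this is the last resort.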

Do you have VMware support? They will be able to deal with this more reliably than I can, since I am not familiar with the storage server you have.

http://blog.laspina.ca/ vExpert 2009
Jeff11173
Contributor

Guys,

I am not 100% sure on the Symm side of EMC, but I had the same issue with an EMC CLARiiON. In the CLARiiON world it is resolved by making sure the LUNs are presented in the same order to each ESX server, by:

1. Placing them in the storage group in the same order for each server, if you have multiple storage groups; or

2. Using a single storage group and putting all the ESX servers in it; or

3. Changing the HLU for each LUN in EACH storage group; that is done by clicking in the white space to the far right of the LUN ID in the storage group (see the sketch below).
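
A hypothetical Navisphere CLI sketch of option 3 (assuming naviseccli is available; the SP address, group names, and LUN numbers are all placeholders):

naviseccli -h spa_address storagegroup -list                               # show the current HLU/ALU mappings per group
naviseccli -h spa_address storagegroup -removehlu -gname ESX2 -hlu 2       # drop the mismatched host LUN
naviseccli -h spa_address storagegroup -addhlu -gname ESX2 -hlu 0 -alu 10  # re-add array LUN 10 as host LUN 0, matching ESX1

Repeat until every shared ALU carries the same HLU in every storage group, and do it with the hosts quiesced, since pulling an HLU out from under a running VM is disruptive.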

Hope that helps a little!

brucecmc
Contributor
Contributor
Jump to solution

Thanks again. Well, the support guys advised us to rename the SNAP datastores, which we ultimately did; apparently that was the only fix the support tech knew. We actually ran into a problem with that, because although there was no ESX host associated with the original datastores, VMs were still showing as associated with them. So we couldn't blow away the old datastores and then rename the SNAP datastores. It was a bit cumbersome to rectify. I thought maybe somebody out there might have had an easier fix.

Again, I was lucky: I only had 10 VMs, but in an environment with 30 or 40 or more, this could get really ugly. And luckily, we could shut things down and work on it.

Thanks for the information and the continued help..

brucecmc
Contributor

Thanks, Jeff; I appreciate the post.

Symmetrix is considerably different from CLARiiON; it doesn't have the same type of groups and presentation methods. But I have run into this on CLARiiON as well. I'm not sure what you mean by placing them in the storage group in the "same order", though. I use a single storage group, with multiple ESX hosts having visibility to all the storage (to support the HA cluster).

Bruce

Jeff11173
Contributor

A lot of people (I think it used to be a best practice from EMC and VMware) created a storage group for each ESX server and would place the shared LUNs in each storage group. The order in which the LUNs are presented is actually the sequence in which ESX assigns its own LUN IDs. If, say, you put LUNs 10, 20, and 30 in that order in storage group "ESX1" and then add them to storage group "ESX2" as 30, 20, 10, then LUNs 10 and 30 can have the snap issue. I know it is odd, but simply the order in which they are clicked can determine that.
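
To illustrate with hypothetical host LUN (HLU) assignments:

storage group ESX1: ALU 10 -> HLU 0, ALU 20 -> HLU 1, ALU 30 -> HLU 2
storage group ESX2: ALU 30 -> HLU 0, ALU 20 -> HLU 1, ALU 10 -> HLU 2

Here array LUNs 10 and 30 show up at different LUN IDs on the two hosts, which is exactly the mismatch that triggers the snapshot detection; only ALU 20 matches.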

The only reason I can see for using a storage group per ESX server is that boot from SAN requires it. But tons of people seem to like boot from SAN for some reason :)
