VMware Cloud Community
rterra
Contributor

HELP! Lost datastore after applying ESX 3.5 Update 4

Hello all,

I've just updated one of my ESX servers (HP DL580 G5) from 3.5 Update 2 to 3.5 Update 4. I applied U4, along with approximately 30 other updates, using Update Manager. I recently updated VC to 2.5 U4, so Update Manager is at its latest version. After restarting the ESX server once the updates were applied, ESX states that it "does not have persistent storage."

Under the Configuration tab, "Storage" is empty. Under "Storage Adapters", I have two HBAs (LPe1150, according to VC). One HBA does in fact see the LUNs. The other HBA does not.

I've looked through the Communities but couldn't find anything specific to my issue. I figured I'd start here before opening a ticket. I'm still "online", but my other two servers are being taxed very heavily due to the number of machines they are handling.

I'm not sure if building it "from scratch" with a CD would help, or if I'm just missing something. I'd sleep sooo much better if I could figure out how to correct this.

I thought of uninstalling U4, but, I can't find anything about how to uninstall an update/patch.

Any help or suggestions would be greatly appreciated!!

Thanks,

- Bob

17 Replies
depping
Leadership

No uninstall for ESX... Did you try rescanning your HBA?

If you do an "fdisk -l" on the command line do you see the LUNs?

Duncan

VMware Communities User Moderator

If you find this information useful, please award points for "correct" or "helpful".

rterra
Contributor

Hi Duncan,

Thanks for the reply! Yes, I did rescan, multiple times. I just ran "fdisk -l" on the server, and also on another one for comparison. It does look like the storage is there: fdisk shows the three 500 GB LUNs (plus some others). I have a few items listed as "Disk /dev/sdf: 540GB..." and so on. Those comprise the VMFS storage.

Thanks again,

- Bob

depping
Leadership

Could you post the outcome of fdisk -l ?

Duncan

rterra
Contributor

Here's the output of fdisk (pasted inline, since I'm not sure how to add an attachment). There are three 543 GB LUNs; that's my storage. There's a 108 GB disk that can be ignored. The 147 GB disk at the bottom is the local storage (ESX itself).

Disk /dev/sda: 2 MB, 2949120 bytes
1 heads, 6 sectors/track, 960 cylinders
Units = cylinders of 6 * 512 = 3072 bytes
Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 2 MB, 2949120 bytes
1 heads, 6 sectors/track, 960 cylinders
Units = cylinders of 6 * 512 = 3072 bytes
Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 543.0 GB, 543050956800 bytes
255 heads, 63 sectors/track, 66022 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 108.6 GB, 108610191360 bytes
255 heads, 63 sectors/track, 13204 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 543.0 GB, 543050956800 bytes
255 heads, 63 sectors/track, 66022 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 543.0 GB, 543050956800 bytes
255 heads, 63 sectors/track, 66022 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/cciss/c0d0: 146.7 GB, 146778685440 bytes
255 heads, 63 sectors/track, 17844 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device             Boot  Start    End     Blocks  Id  System
/dev/cciss/c0d0p1  *         1     13     104391  83  Linux
/dev/cciss/c0d0p2           14   1287   10233405  83  Linux
/dev/cciss/c0d0p3         1288   1356    554242+  82  Linux swap
/dev/cciss/c0d0p4         1357  17844  132439860   f  Win95 Ext'd (LBA)
/dev/cciss/c0d0p5         1357   1610   2040223+  83  Linux
/dev/cciss/c0d0p6         1611   1623     104391  fc  Unknown
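
For anyone comparing listings like this across hosts, here is a minimal Python sketch (not something run in this thread) that summarizes "fdisk -l" text: it records each disk's size in MiB and whether fdisk reported a missing partition table. The sample text is a trimmed copy of the output above; the function name `summarize` and the dictionary layout are made up for illustration.

```python
import re

# Trimmed copy of the fdisk output posted above (three of the seven disks).
FDISK_OUTPUT = """\
Disk /dev/sdc: 543.0 GB, 543050956800 bytes
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 108.6 GB, 108610191360 bytes
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/cciss/c0d0: 146.7 GB, 146778685440 bytes
/dev/cciss/c0d0p1 * 1 13 104391 83 Linux
"""

# "Disk <device>: <human size>, <bytes> bytes"
HEADER = re.compile(r"^Disk (\S+): .*?(\d+) bytes$")
# "Disk <device> doesn't contain a valid partition table"
NO_TABLE = re.compile(r"^Disk (\S+) doesn't contain a valid partition table$")


def summarize(text):
    """Return {device: {"mib": size in whole MiB, "has_table": bool}}."""
    disks = {}
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            size_bytes = int(header.group(2))
            disks[header.group(1)] = {"mib": size_bytes // 2**20, "has_table": True}
            continue
        missing = NO_TABLE.match(line)
        if missing and missing.group(1) in disks:
            disks[missing.group(1)]["has_table"] = False
    return disks


if __name__ == "__main__":
    for device, info in summarize(FDISK_OUTPUT).items():
        state = "partition table" if info["has_table"] else "NO partition table"
        print(f"{device}: {info['mib']} MiB, {state}")
```

Sizes are converted to whole MiB so they can be lined up against a multipath listing that reports sizes in MB. A SAN LUN flagged with no partition table is worth a second look, since the listings later in this thread show VMFS volumes as partitions with Id fb.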

depping
Leadership

That's weird; it looks good. Any errors in the vmkernel or vmkwarning logs, or even the messages file?

Duncan

whynotq
Commander

I'd also like to see an "esxcfg-mpath -l". I've had a couple of clients with odd issues like this, where ESX sees the disks but VC doesn't, and that stops you using the LUNs. The fixes I have seen are a rebuild of VC, re-initialisation of the VC DB, or removal/reboot/re-install of the HBAs.

None of them are proven, but they have worked in odd situations.

Good luck.

rterra
Contributor

Hey Duncan,

I'm not 100% sure of the best way to get the info from the logs. I did look at the ones you mentioned. My "log reading" skills leave a lot to be desired. The vmkwarning log has a bunch of entries like the following:

Apr 19 04:19:14 esx101 vmkernel: 68:20:32:22.747 cpu3:1040)WARNING: Fil3: 1791: Failed to reserve volume f530 28 1 48a976c1 2192a3dc 1e0000be 5

4bdd00b 0 0 0 0 0 0 0

There's a bunch of stuff in vmkernel. There are many entries which reference storage "keywords" (e.g. "EMC", "Symmetrix"). I'm not sure of the best way to convey this info. I don't think pasting the entire log would be worthwhile. Let me know if there's something specific I can do.

I'm currently going through the messages log to see if I can spot anything.
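
Rather than pasting a whole log, one option is to count the warnings by type. The following is a rough Python sketch under stated assumptions: the regular expression is inferred only from the single vmkwarning entry quoted above (first physical line, exactly as posted), the field names are my own labels, and other entries may not follow the same shape.

```python
import re
from collections import Counter

# First physical line of the vmkwarning entry quoted above, as posted.
SAMPLE = (
    "Apr 19 04:19:14 esx101 vmkernel: 68:20:32:22.747 cpu3:1040)"
    "WARNING: Fil3: 1791: Failed to reserve volume f530 28 1 "
    "48a976c1 2192a3dc 1e0000be 5"
)

# Field names are illustrative labels, not official VMware terminology.
WARNING = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) vmkernel: "
    r"(?P<uptime>\S+) (?P<world>cpu\d+:\d+)\)WARNING: "
    r"(?P<module>\w+): (?P<code>\d+): (?P<message>.*)$"
)


def parse_warning(line):
    """Return the named fields of one warning line, or None if it does not match."""
    match = WARNING.match(line.strip())
    return match.groupdict() if match else None


def count_warnings(lines):
    """Count matching warnings per (module, code) pair."""
    counts = Counter()
    for line in lines:
        record = parse_warning(line)
        if record:
            counts[(record["module"], record["code"])] += 1
    return counts


if __name__ == "__main__":
    print(parse_warning(SAMPLE))
    print(count_warnings([SAMPLE, SAMPLE, "unrelated line"]))
```

Feeding it the lines of the vmkwarning file gives a short table of which warnings repeat most, which is easier to post in a thread than the raw log.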

rterra
Contributor

Thanks for your help. Here are the results of "esxcfg-mpath -l":

Disk vmhba2:0:10 /dev/sdd (103578MB) has 1 paths and policy of Fixed
 FC 25:0.0 10000000c978ef45<->5006048ad52ddb57 vmhba2:0:10 On active preferred

Disk vmhba2:0:0 /dev/sdb (0MB) has 1 paths and policy of Fixed
 FC 25:0.0 10000000c978ef45<->5006048ad52ddb57 vmhba2:0:0 On active preferred

Disk vmhba2:0:16 /dev/sde (517893MB) has 1 paths and policy of Fixed
 FC 25:0.0 10000000c978ef45<->5006048ad52ddb57 vmhba2:0:16 On active preferred

Disk vmhba2:0:17 /dev/sdf (517893MB) has 1 paths and policy of Fixed
 FC 25:0.0 10000000c978ef45<->5006048ad52ddb57 vmhba2:0:17 On active preferred

Disk vmhba0:0:0 /dev/cciss/c0d0 (139979MB) has 1 paths and policy of Fixed
 Local 2:0.0 vmhba0:0:0 On active preferred

Disk vmhba2:0:8 /dev/sdc (517893MB) has 1 paths and policy of Fixed
 FC 25:0.0 10000000c978ef45<->5006048ad52ddb57 vmhba2:0:8 On active preferred

Disk vmhba1:0:0 /dev/sda (0MB) has 1 paths and policy of Fixed
 FC 19:0.0 10000000c978ee13<->5006048ad52ddb58 vmhba1:0:0 On active preferred
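
To make the per-HBA picture easier to read, here is a small Python sketch that groups the "Disk" lines of the listing above by adapter. It is only an illustration: the regular expression is based on the lines as they appear in this post, and the function name `luns_by_adapter` is hypothetical.

```python
import re
from collections import defaultdict

# "Disk" lines copied from the esxcfg-mpath -l output above.
MPATH_OUTPUT = """\
Disk vmhba2:0:10 /dev/sdd (103578MB) has 1 paths and policy of Fixed
Disk vmhba2:0:0 /dev/sdb (0MB) has 1 paths and policy of Fixed
Disk vmhba2:0:16 /dev/sde (517893MB) has 1 paths and policy of Fixed
Disk vmhba2:0:17 /dev/sdf (517893MB) has 1 paths and policy of Fixed
Disk vmhba0:0:0 /dev/cciss/c0d0 (139979MB) has 1 paths and policy of Fixed
Disk vmhba2:0:8 /dev/sdc (517893MB) has 1 paths and policy of Fixed
Disk vmhba1:0:0 /dev/sda (0MB) has 1 paths and policy of Fixed
"""

DISK_LINE = re.compile(
    r"^Disk (vmhba\d+):(\d+):(\d+) (\S+) \((\d+)MB\) "
    r"has (\d+) paths and policy of (\w+)$"
)


def luns_by_adapter(text):
    """Map each adapter name to a sorted list of (LUN number, size in MB)."""
    grouped = defaultdict(list)
    for raw in text.splitlines():
        match = DISK_LINE.match(raw.strip())
        if match:
            adapter, _target, lun, _device, size_mb, _paths, _policy = match.groups()
            grouped[adapter].append((int(lun), int(size_mb)))
    return {adapter: sorted(luns) for adapter, luns in grouped.items()}


if __name__ == "__main__":
    for adapter, luns in sorted(luns_by_adapter(MPATH_OUTPUT).items()):
        print(adapter, luns)
```

On this listing the grouping shows vmhba2 reaching the data LUNs while vmhba1 reaches only a 0 MB device at LUN 0, and every disk has a single path.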

whynotq
Commander

It looks like you have two HBAs, at vmhba1 and vmhba2.

vmhba2 sees LUNs 8, 10, 16 and 17, and vmhba1 sees the LUNZ at ID 0.

The policy is Fixed (correct for Symmetrix).

What can the Sym see? This looks like a storage allocation problem: it would appear that the drivers are loaded but connectivity is missing. I'd suggest zoning is OK, given that both HBAs see the target, although only one each?

pvries
Contributor

Hi rterra, hello Duncan (how are you btw?)

I had the same problem as described above when upgrading to U4 from a native U3 installation. I have a ProLiant ML110 G5 with an ICH-9 SATA controller. I know it's not on the HCL, but it runs ESX 3.5 U3 just fine. Here is what I have found out.

Since U4 has added support for SATA ICH-9 and ICH-10 controllers, it is now important that your SATA controller is in the list of supported SATA controllers. This will be a short list, since SATA support has only just been added to the ESX installation. My controller isn't in the list, so I had to go back to U3 (fresh install). During this install I noticed that the installation gave me sda devices instead of the hda devices which U4 was giving me. Have a look at my partition tables:

U3 partition table:

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Boot  Start    End      Blocks  Id  System
/dev/sda1            1  60801  488383968+  fb  Unknown

Disk /dev/sdb: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Boot  Start    End      Blocks  Id  System
/dev/sdb1  *         1     13      104391  83  Linux
/dev/sdb2           14    650    5116702+  83  Linux
/dev/sdb4          720  30401   238420665   5  Extended
/dev/sdb5          974  30401  236380352+  fb  Unknown
/dev/sdb6          720    788     554179+  82  Linux swap
/dev/sdb7          789    973     1485981  83  Linux

Partition table entries are not in disk order

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Boot  Start    End      Blocks  Id  System
/dev/sdc1            1  60801  488383968+  fb  Unknown

U4 partition table:

Disk /dev/hdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Boot  Start    End      Blocks  Id  System
/dev/hdc1            1  60801  488383968+  fb  Unknown

Disk /dev/hda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Boot  Start    End      Blocks  Id  System
/dev/hda1  *         1     13      104391  83  Linux
/dev/hda2           14    650    5116702+  83  Linux
/dev/hda3          651    719     554242+  82  Linux swap
/dev/hda4          720  30401   238420665   5  Extended
/dev/hda5          720    973    2040223+  83  Linux
/dev/hda6          974  30401  236380352+  fb  Unknown

Disk /dev/hdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device     Boot  Start    End      Blocks  Id  System
/dev/hdb1            1  60801  488383968+  fb  Unknown

So in U4 the Linux side (disks and partitioning) shows the same information; only the ESX side is missing native SATA drivers for my controller, so there are no storage adapters.

I assume that because U3 is still using something like SCSI emulation, the storage adapters are shown (as SCSI controllers). The added support for native SATA in U4 (which removed the sda devices from my system) has resulted in the storage adapters not being shown in my case.

Regards,

Peter.

rterra
Contributor

Thanks to all for their suggestions. I'll have a closer look when I get into the office. I'll also have the storage folks check out their end. I find it hard to believe that the issue is there, if only for the fact that everything was working fine before applying U4. I suppose opening a ticket with VMware isn't a bad idea either.

Thanks,

- Bob

rterra
Contributor

I've partially managed to correct my missing HBA. It somehow lost its masking or something. Currently, both HBAs can see the LUNs. Unfortunately, the host still does not recognize the datastore. Any suggestions? I've rescanned the HBAs a few times. I'm at a loss.

Suggestions, HELP, tips welcome!

Thanks,

- Bob

ephillipsme
Enthusiast

Yes,

I had a similar issue after applying some updates: the LUNs could be scanned but the datastore was unavailable. Under Advanced Settings on the ESX host, look at the LVM configuration. Note what the settings are, then change LVM.EnableResignature and LVM.DisallowSnapshotLun to the opposite values (0 or 1) and try a rescan. The datastore should appear again; then change the settings back to the originals.
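
The order of operations matters here (note the values, flip, rescan, restore), so here is a minimal Python sketch of just that sequence. It does not talk to a real host: `settings` is a plain dictionary standing in for the host's advanced settings, `rescan` is a caller-supplied placeholder, and the starting values are example values only. All names other than the two LVM setting names are hypothetical.

```python
from contextlib import contextmanager

LVM_SETTINGS = ["LVM.EnableResignature", "LVM.DisallowSnapshotLun"]


@contextmanager
def temporarily_flipped(settings, names):
    """Flip the named 0/1 settings, then always restore the noted originals."""
    original = {name: settings[name] for name in names}  # "note what the settings are"
    try:
        for name, value in original.items():
            settings[name] = 1 - value  # 0 becomes 1, 1 becomes 0
        yield original
    finally:
        settings.update(original)  # "change the settings back"


def rescan_with_flipped_lvm(settings, rescan):
    """Run the placeholder rescan while the LVM settings are flipped."""
    with temporarily_flipped(settings, LVM_SETTINGS):
        return rescan(dict(settings))


if __name__ == "__main__":
    # Example values only; check the actual values on your own host first.
    host_settings = {"LVM.EnableResignature": 0, "LVM.DisallowSnapshotLun": 1}
    seen = rescan_with_flipped_lvm(host_settings, lambda during: during)
    print("during rescan:", seen)
    print("after restore:", host_settings)
```

The `finally` clause is the point of the sketch: the original values are put back even if the rescan step fails, so the host is not left with the flipped settings by accident.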

Ernie

Drac346
Contributor

I have a similar problem where we are using an SGI array which is no longer supported by ESX. One of the distinguishing symptoms is that the ESX servers can see the LUNs but do not see the storage. This occurs when ESX doesn't know how to talk to the disk array and is incapable of switching to the correct HBA path to reach the active storage controller with the correct LUN. The fix (for us) was to change the preferred path on the ESX host to one of the other available paths. This may not apply to you if you are on a supported disk array, but you might check that the current 'preferred' path is talking to the correct controller.

rterra
Contributor

Here's a quick update. I believe my problem stems from the original setup, which wasn't done by me. Basically, the original storage (two 500 GB LUNs) was configured to use "extents". Well, the very reason that extents are not recommended appears to be my problem: somehow one of the extents was lost or corrupted. I have another server with the same symptoms. It wasn't a critical server, so I hadn't had time to investigate. Hindsight being what it is, it's the same issue. My two remaining "good" servers are fine...as long as the HBAs do NOT get rescanned and the servers are not restarted. If that happens, well, I'd rather not think about it. I have had VMware take a look, and this is what appears to be the case.

The plan for tomorrow is to be presented with three new LUNs, create/configure the LUNs the "proper" way, and sVMotion the machines to the new datastore.

My small issue is that I've never created a datastore on a SAN from "scratch". I think I know how, but I'm not 100% sure. Hence, I'll be reading a lot tonight. I'll post back later with an update in case anyone is interested.

Thanks again for everyone's help!

- Bob

SuryaVMware
Expert

depping,

I don't see any partition on any of the 543.0 GB LUNs. What are you referring to that looks good here? I'm just trying to figure out if I am missing anything.

-Surya

Narkis
Enthusiast

I had a similar issue and none of the troubleshooting methods worked, so I did a POWER RESET through iLO on my HP DL580 G5 server. The datastores are visible now. I know it's not the correct way, but the problem was fixed. Cheers!
