VMware Cloud Community
GodTheHamster
Enthusiast

ACK, HELP! ESXi 4.1 cannot see VMFS partitions.

Here's what happened. The internal RAID 5 array failed with two drives down. After a reboot, the server came back with an error stating that the drives had previously failed but now appear operational and that some data may be lost: press F2 to accept the loss, or F1 to continue with the logical drives offline. I hit F2, and the machine rebooted and looped back to the same prompt. I tried it again, and this time it started loading ESXi (4.0.1) but failed at the boot image, which was corrupt. DOH!

I got a 4.1 u1 image, burned it, and used it to repair. During the repair it said I would have a fully functional system, however some VMFS partitions may not be immediately available. So they are in fact not there. I've tried multiple methods people have listed as working solutions; nothing has fixed it. I haven't had time to place a call to VMware yet, but that may be what needs to happen.

Any thoughts or opinions? Are the VMFS partitions completely gone because I lost two drives? Nothing (at least nothing that I know of) has been overwritten; I only repaired ESXi, albeit with a newer version. Did I screw myself by doing this?

Thanks

17 Replies
weinstein5
Immortal

With RAID 5, losing two drives does mean the volume is gone. In my opinion you are out of luck.

DSTAVERT
Immortal

It may be possible to recover, but it depends on how much damage has occurred. VMware support may be able to help. Recovering from backup might be quicker.

-- David -- VMware Communities Moderator
GodTheHamster
Enthusiast

....That's one of the problems: no backup solution was in place. Quite possibly the worst day in a while. It started with waking up to the pump being out on the koi pond and losing 4 fish, then the server takes a dive. /facepalm

DSTAVERT
Immortal

Companies like Kroll Ontrack http://www.krollontrack.com/ can often recover things. Very expensive though.

I am inclined to agree with David. Count it gone.

-- David -- VMware Communities Moderator
GodTheHamster
Enthusiast

That's kinda what I was thinking..... Oh well, I didn't lose anything too major, just a couple of XP boxes, a Minecraft server and, oh yeah, two years' worth of emails and docs that never got backed up on the physical box, and the drive was reallocated when we VMed it. Hah. Doh. OK, well, now I know: next time, run at least weekly backups.

DSTAVERT
Immortal

For future reference.

GhettoVCB backup script http://communities.vmware.com/docs/DOC-8760

ESXi Control http://blog.peacon.co.uk/wiki/Esxi-control.pl

Hope the rest of your day gets better.

-- David -- VMware Communities Moderator
a_p_
Leadership

So they are in fact not there. I've tried multiple methods people have listed as working solutions; nothing has fixed it.

Although I agree with the others that data loss is most likely with the two lost disks, I have one question. Do you see the VMFS partition when running fdisk -lu on the console?

André

GodTheHamster
Enthusiast

~ # fdisk -ul

Disk /dev/disks/mpx.vmhba1:C0:T0:L0: 441.2 GB, 441241845760 bytes
64 heads, 32 sectors/track, 420801 cylinders, total 861800480 sectors
Units = sectors of 1 * 512 = 512 bytes

                          Device Boot      Start         End      Blocks  Id System
/dev/disks/mpx.vmhba1:C0:T0:L0p1          8192   1843199    917504    5  Extended
/dev/disks/mpx.vmhba1:C0:T0:L0p4   *        32      8191      4080    4  FAT16 <32M
/dev/disks/mpx.vmhba1:C0:T0:L0p5          8224    520191    255984    6  FAT16
/dev/disks/mpx.vmhba1:C0:T0:L0p6        520224   1032191    255984    6  FAT16
/dev/disks/mpx.vmhba1:C0:T0:L0p7       1032224   1257471    112624   fc  VMKcore
/dev/disks/mpx.vmhba1:C0:T0:L0p8       1257504   1843199    292848    6  FAT16

Partition table entries are not in disk order

I don't believe they are there, no. 😕

Edit: Also, the disks were never replaced. The host was simply rebooted, and the hardware diag said they appeared operational, then asked me if continuing with the data loss was acceptable. I figured one may have gone down and taken the one next to it with it, so only one was originally compromised. The only data that should have been lost at that point was the write cache. Oh well, I'm not sure what caused the failure, and not sure why it says they are working again. I think I need a SAN, more disks and a better (existing) backup solution, and that will fix this lil mess.

a_p_
Leadership

From what I can see, the VMFS partition is missing. However that does not necessarily mean the data is lost. If we/you are able to restore the partition table (recreate the VMFS partition) with the correct values, there might be a chance to access the data.
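For anyone who wants to script that check, here is a minimal Python sketch that scans saved fdisk -lu output for a partition whose Id is fb (the VMFS partition type). The function names and the regular expression are assumptions made for illustration, and the sample rows are copied from the listing above with the wrapped System column tidied up.

```python
import re

# Rows copied from the fdisk -lu listing in this thread (System column tidied).
SAMPLE = """\
/dev/disks/mpx.vmhba1:C0:T0:L0p1          8192   1843199    917504    5  Extended
/dev/disks/mpx.vmhba1:C0:T0:L0p4   *        32      8191      4080    4  FAT16 <32M
/dev/disks/mpx.vmhba1:C0:T0:L0p5          8224    520191    255984    6  FAT16
/dev/disks/mpx.vmhba1:C0:T0:L0p6        520224   1032191    255984    6  FAT16
/dev/disks/mpx.vmhba1:C0:T0:L0p7       1032224   1257471    112624   fc  VMKcore
/dev/disks/mpx.vmhba1:C0:T0:L0p8       1257504   1843199    292848    6  FAT16
"""

# device, optional boot flag, start, end, blocks, hex partition Id, system name
ROW = re.compile(
    r"^(\S+p\d+)\s+(\*\s+)?(\d+)\s+(\d+)\s+(\d+)\+?\s+([0-9a-fA-F]{1,2})\s+(.*)$"
)


def partition_ids(fdisk_text):
    """Map each partition device to its Id (lowercase hex string)."""
    ids = {}
    for line in fdisk_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            ids[match.group(1)] = match.group(6).lower()
    return ids


def has_vmfs(fdisk_text):
    """True if any partition row carries Id fb (VMFS)."""
    return "fb" in partition_ids(fdisk_text).values()


print(has_vmfs(SAMPLE))  # False: no fb entry in the posted listing
```

This only looks at the partition table; it says nothing about whether the data behind a missing entry is still intact.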

It's up to you to decide whether to call VMware support, where they have engineers with the detailed knowledge, or to try it on your own. In the latter case I will fire up my test system to see whether I can find out what exactly is missing. However, I can't promise you anything.

Btw. which version/build of ESXi did you run before this happened, and which build did you use to run the repair?

André

GodTheHamster
Enthusiast

I installed 3.5 initially, updated to 4.0.1 with a CD, then ran the repair with 4.1 u1..... I was not paying attention; I only realized I was at 4.0.1 when I found the update CD I had previously used.

a_p_
Leadership

You updated with a CD? Just to be sure, did you run ESXi or ESX?

If it was ESXi - which I really hope - we'll have to find out the correct partitioning (like in the examples at http://kb.vmware.com/kb/2002461) and then re-create partitions 2 and 3 (where partition 3 is the VMFS partition). The values from the KB should actually be ok (except for the typo for the end sector of partition 8). The VMFS partition is always the last partition on the disk and its end sector usually matches the output of fdisk -lu minus 1.

Please don't start doing anything before I have double-checked that the values were the same for ESXi 3.5. Only answer my question above.

André

GodTheHamster
Enthusiast

ESXi, 3.5 updated to 4.0.1 with CD, repaired (after failed boot image from drive failure) with a 4.1 u1 CD.

a_p_
Leadership

Sorry it took some time, but I needed to set up an ESXi 3.5 host to reproduce the issue. The bad news is that the partitions are different between 3.5 and 4.x; the (hopefully) good news is that the area where the VMFS partition should be located was not overwritten, because partition 2 was not created.

Again, you do the following at your own risk, even though this worked with my test system! There is still time to call VMware support!

To re-create partition 3 (VMFS) do the following:

run: fdisk /dev/disks/mpx.vmhba1\:C0\:T0\:L0

If this does not work, run esxcfg-scsidevs -c to find out whether the disk has a Linux device name (like /dev/sda) and run the fdisk command with this device name.

In fdisk enter the following commands:

  • n (new partition)
  • p (primary partition)
  • 3 (partition number)
  • 4846 (first cylinder - specific to ESXi 3.5)
  • ENTER (accept the default last cylinder)
  • t (change partition ID)
  • 3 (partition number)
  • fb (partition type for VMFS)
  • w (write changes and exit)

To verify the values for partition 3, run fdisk -lu again. The start value for partition 3 should be 9922560 and the System type VMFS.
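As a quick cross-check of that start value: the fdisk listing posted earlier reports 64 heads and 32 sectors/track, so one cylinder is 64 * 32 = 2048 sectors, and fdisk numbers cylinders from 1. A small Python sketch of the conversion follows; the function name is only illustrative, and it assumes the partition starts exactly on a cylinder boundary.

```python
# Geometry as reported by fdisk -lu earlier in this thread.
HEADS = 64
SECTORS_PER_TRACK = 32


def cylinder_start_sector(cylinder, heads=HEADS, sectors_per_track=SECTORS_PER_TRACK):
    """First sector (0-based) of a 1-based cylinder number."""
    if cylinder < 1:
        raise ValueError("fdisk cylinder numbers start at 1")
    return (cylinder - 1) * heads * sectors_per_track


print(cylinder_start_sector(4846))  # prints 9922560
```

If fdisk -lu shows a different start sector for partition 3, the geometry or the first cylinder entered does not match these assumptions.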

Once done, rescan the vmhba in the vSphere Client (in Storage Adapters). If the values were correct, ESXi should detect the VMFS datastore.

Even if you are able to access the datastore at this point, I strongly recommend that you immediately back up the VMs and consider reinstalling the host.

Good Luck

André

GodTheHamster
Enthusiast

Well, now I see it in my fdisk -ul list. However, when I rescan it does not show up. When I go to add storage there is a VMFS datastore listed, but I can't use it, and there is 3.8 GB of free space that it wants to use. Well, we tried, but I think in the end it's gone. :|

Thanks for your help.

GodTheHamster
Enthusiast

Just for fun, I created a new VMFS partition using the add storage wizard. I see two VMFS partitions in fdisk, but only one shows up in the vSphere Client. Not sure how to get the other one to add, or if I botched it and that's why I can't add it.

GodTheHamster
Enthusiast

Hmmm, odd thing here: the new VMFS partition was created at 901 to 4845 and labeled p2. The re-created one goes from 4846 to the end. Now I can't get the wizard to list any available drives to add a VMFS store. Odd issues I'm seeing here. I think the best bet is to start over with new drives and make sure we get the suspect one out of there.

a_p_
Leadership

For ESX(i) to recognize a VMFS partition, the partition type needs to be "FB" and the partition needs to be formatted with the VMFS file system. If it does not show up when rescanning the vmhba, that could mean either the start cylinder is not correct or the file system on the partition is corrupt. However, to find out what's going on you would need to know the structure of the VMFS file system.

Btw. the reason why you see the small VMFS partition is that when you created it, it was formatted with the VMFS file system.
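The size of that small partition can be estimated from the cylinder range reported above (901 to 4845) and the geometry in the fdisk listing (64 heads, 32 sectors/track, 512-byte sectors). A rough Python sketch, with an illustrative function name and the assumption that the range is inclusive:

```python
# Geometry as reported by fdisk -lu earlier in this thread.
SECTOR_BYTES = 512
HEADS = 64
SECTORS_PER_TRACK = 32


def cylinder_range_bytes(first_cyl, last_cyl):
    """Size in bytes of an inclusive range of cylinders."""
    cylinders = last_cyl - first_cyl + 1
    return cylinders * HEADS * SECTORS_PER_TRACK * SECTOR_BYTES


size = cylinder_range_bytes(901, 4845)
print(size)                    # prints 4136632320
print(round(size / 2**30, 2))  # prints 3.85 (GiB)
```

That is in line with the roughly 3.8 GB of free space the add storage wizard offered, which suggests the wizard simply used the unallocated gap in front of cylinder 4846.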

André
