VMware Cloud Community
KennyView
Contributor

Recovering a lost partition table on a VMFS volume

Dear all,

Since Friday we have had some serious problems with our VMware Infrastructure (3.5) environment.

Situation: 2 ESX servers, Enterprise edition.

We needed to reboot one of the ESX servers. After the reboot, one datastore was no longer presented to the rebooted ESX server, although the LUN was still mapped in the storage adapter configuration. We noticed some errors about the partition table in the vmkernel log of the active ESX server. Since we didn't want to lose any data, we started shutting down our environment and copying all the VMDKs. During the copy, the datastore also disappeared (crashed) on the active ESX server, so now we cannot access the datastore at all.

I think the last resort will be recovering the lost partition table (see http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100228...), but can somebody confirm that we will not lose any data on the datastore?

Has anybody already done this in a production environment?

Or can we in some way still copy the data from the datastore to an external device? (This would be the preferred solution.)

Thanks in advance!

Kenny

7 Replies
a_p_
Leadership

Unless you enter wrong values, fdisk only rewrites the partition table and does not destroy any data on the partition itself. However, the KB you mentioned does not apply to every case, and without further investigation I would not recommend any trial and error.

Do you still see the partition with fdisk -lu? In case the partition was presented to a Windows host, Windows might have modified the partition type from "FB" (VMFS) to "07" (NTFS). In this case it could be sufficient to only reset the partition type and run a rescan.
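
If you want to check the type byte without entering fdisk, it can be read straight from the first sector. This is only a sketch: it assumes a classic MBR layout, where the first partition entry starts at byte 446 and its type byte sits at offset 450. The function name mbr_part1_type is made up for illustration, and /dev/sdX stands for the affected LUN.

```shell
# Print the partition type of the first MBR entry as two hex digits.
# Read-only: nothing is written to the device.
mbr_part1_type() {
    # Byte 450 = 446 (start of entry 1) + 4 (offset of the type byte).
    dd if="$1" bs=1 skip=450 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Example (hypothetical device name):
#   mbr_part1_type /dev/sdX
# "fb" means VMFS, "07" means NTFS, "00" means the entry is empty.
```

An empty entry ("00") would point to a lost table rather than a changed type, in which case resetting the type alone will not help.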

If you are unsure, I highly recommend you open a call with VMware to make sure you don't lose any production data.

André

KennyView
Contributor

We have a backup of our machines, but I am thinking about the time and difficulty of a restore (some use RDMs, some are backed up as physical machines, ...).

When I run fdisk -l /dev/sdX I still see the partition listing, but no values appear under Device Boot, Start, End, Blocks, Id and System.

Running fdisk -l /dev/sdY on another disk shows the correct information, so I guess it will not be sufficient to only change the partition type...

At the moment we no longer have a support contract for our VMware infrastructure, so calling support is not an option, I guess. Or is there a paid alternative? The first thing they will ask is whether we have a backup, and after that I guess they will just follow the support KB.

All datastores were created from within ESX, so normally the following should do the trick.

Start fdisk with the command fdisk /dev/sdX and press Enter.

Create the partition:
  1. Press n and press Enter to create a new partition.

  2. Press p and press Enter to select that this is a primary partition.

  3. Press 1 and press Enter to make the first partition.

  4. Press Enter to retain the default value.

  5. Press Enter again to retain the default value.

Change the partition to type fb (VMFS):
  1. Press t and press Enter.

  2. Press 1 and press Enter.

  3. Enter fb and press Enter.
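
Before running the steps above against a production LUN, it may be worth saving the current first sectors so the old state can be put back if something goes wrong. A minimal sketch, assuming the service console has dd and enough free space in the target directory. The function name save_first_sectors, the 2048-sector size and the file paths are illustrative choices, not taken from the KB. Note also that fdisk only changes the disk once you press w to write the table.

```shell
# Copy the first 2048 sectors (1 MiB) of a device to a file.
# Only reads from the device; the partition table lives in sector 0.
save_first_sectors() {
    dd if="$1" of="$2" bs=512 count=2048 2>/dev/null
}

# Example (hypothetical names):
#   save_first_sectors /dev/sdX /root/sdX-first-1MiB.bin
# To put the old sector 0 back afterwards:
#   dd if=/root/sdX-first-1MiB.bin of=/dev/sdX bs=512 count=1
```

Keeping the copy on a different disk than the affected LUN avoids writing to the volume you are trying to rescue.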

The alignment will not be necessary, I guess.

After these commands, hit refresh in the GUI (I can't find an explanation of vmkfstools -V).

Fingers crossed ...

I am now copying the complete VHD file (which contains the VMFS store) to tape, but I really don't know how I could ever restore from this VHD.

Thanks for your support on a Sunday 😉

grasshopper
Virtuoso

VMware offers support on a per-incident basis, starting at as little as US $300. I strongly recommend doing this at a minimum, as it's easy and you can pay online:

http://www.vmware.com/support/services/incident.html

Per-incident support is only available Monday through Friday. Inform them that it's a "Severity 1" situation and you'll be in the queue for no more than a few hours tomorrow morning. You'll want to have a vm-support dump from the affected hosts and a full list of your hardware setup, including the SAN/storage back end, the method of connection (FC, iSCSI, etc.), and so on.

In the meantime, check and triple-check that your VMware LUNs are not being provisioned to non-VMware servers through bad zoning, masking, etc. on the SAN back end. Provide your WWNs to your SAN admin to facilitate this.

http://kb.vmware.com/kb/1003973

KennyView
Contributor

I just did the trick. The partition table is recovered and no data was lost, which saved me a loooooot of work.

Thanks for the support on the forums, guys!

grasshopper
Virtuoso

Nice job!

grasshopper
Virtuoso

BTW, you should still confirm that the zoning/masking on the SAN is correct, or the problem may come back.
