VMware Cloud Community
RKuchukbaev
Enthusiast

Recovering VMFS

Hello all!

This morning one of our system administrators tried to change the LUN ID for the mapped hosts on our IBM v3700 storage. He changed it from 0 to 10. After that he rescanned the HBAs and the LUN was discovered with the new ID (10), but we lost all of our data on this datastore. Before the change the datastore was ~6 TB. Now we have a strange, empty 2 TB datastore without any VMs.

It looks like we have lost the partition or something...

~ # partedUtil getptbl /dev/disks/naa.60050763008080185800000000000001
gpt
800302 255 63 12856852480
1 2048 4294961684 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

~ # partedUtil getptbl /dev/disks/naa.60050763008080185800000000000001:1
unknown
267348 255 63 4294959637
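A quick check of the numbers above, as a sketch: it assumes the 512-byte sectors that partedUtil reports in, and the helper names are made up for illustration. The disk still reports 12856852480 sectors, but the only partition ends at sector 4294961684, which is just below 2^32 sectors.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; assumes 512-byte sectors.
SECTOR_BYTES=512

# Number of sectors in a partition, given its start and end sector (inclusive).
part_sectors() {
    echo $(( $2 - $1 + 1 ))
}

# Whole GiB covered by a sector count (integer division, rounds down).
sectors_to_gib() {
    echo $(( $1 * SECTOR_BYTES / 1073741824 ))
}

part_sectors 2048 4294961684     # 4294959637, the same count the ":1" query reports
sectors_to_gib 12856852480       # whole disk: 6130
sectors_to_gib 4294959637        # partition 1: 2047
```

So the partition table describes about 2 TiB of a roughly 6 TiB device, which would fit the 2 TB datastore we now see.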

So, is it possible to recover it? Thanks for any reply.

2 Replies
admin
Immortal

Restoring the VMFS
-----------------------------------------------

To summarize the information in that document: to restore the VMFS, you use the fdisk command to rebuild the partition table, then move the start block to the proper alignment. Then refresh and rescan to access the partitions.

Using fdisk, however, can be dangerous and requires caution. Instructions for using fdisk:

- Add a new primary partition number 1
- Take the default first and last cylinders
- Change the partition's system id to fb (the VMFS partition id)
- Move the beginning of the data in the partition to an offset of 128, as used for VMFS
- Write the new partition table to the disk and exit
- Repeat for all lost VMFS partitions

This is what those instructions translated into for /dev/sda and /dev/sdc within my ESX host's service console:

# fdisk /dev/sda
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

The number of cylinders for this disk is set to 39162.
There is nothing wrong with that, but this is larger than 1024, and could, in certain setups, cause problems with two things: software that runs at boot time (e.g., old versions of LILO) and/or booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-39162, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-39162, default 39162):
Using default value 39162

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fb
Changed system type of partition 1 to fb (Unknown)

Command (m for help): x

Expert command (m for help): b
Partition number (1-4): 1
New beginning of data (63-629137529, default 63): 128

Expert command (m for help): w
The partition table has been altered!
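Since w(rite) overwrites the on-disk partition table, one cautious addition (not part of the instructions above) is to save the first sectors of the disk to a file before running fdisk. A minimal sketch, assuming dd is available in the console; the function name, the output path, and the 2048-sector count are arbitrary choices for illustration:

```shell
#!/bin/sh
# Sketch only: backup_first_sectors is a hypothetical helper, not a VMware tool.
# Copies the first COUNT 512-byte sectors of SRC into the file DST.
backup_first_sectors() {
    src="$1"
    dst="$2"
    count="${3:-2048}"   # default: first 1 MiB of the device
    dd if="$src" of="$dst" bs=512 count="$count" 2>/dev/null
}

# Example (paths are placeholders, adjust to your host):
# backup_first_sectors /dev/sda /tmp/sda-first-1MiB.bin
```

If the new table turns out to be wrong, the saved file could be written back with dd in the opposite direction. Keep the file somewhere other than the disk being repaired.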

astr0ty
Contributor

After a power loss I had the same situation. One of my SSDs looked like this:

~ # partedUtil getptbl /vmfs/devices/disks/naa.50026b7225115b64
unknown
29185 255 63 468862128

Another, identical one was fine:

~ # partedUtil getptbl /vmfs/devices/disks/naa.50026b72260069c8
msdos
29185 255 63 468862128
1 2048 468857024 251 0

I didn't find an exact solution in the VMware cases, but the fix was evidently:

~ # partedUtil setptbl /vmfs/devices/disks/naa.50026b7225115b64 msdos "1 2048 468857024 251 0"

Having identical SSDs with identical partition tables made it easier to recreate the table.
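The same idea as a sketch: take the table read from the healthy disk and print the matching setptbl command for the damaged one, without running it. The helper name is made up for illustration. It assumes an msdos label, where each partition line from getptbl can be passed to setptbl unchanged; GPT lines from getptbl carry an extra type-name field (like "vmfs" in the first post) and would need adjusting first.

```shell
#!/bin/sh
# Sketch only: build_setptbl_cmd is a hypothetical helper.
# $1 = device to repair, $2 = text printed by "partedUtil getptbl" on a healthy twin.
# Prints the setptbl command; it does not execute anything.
build_setptbl_cmd() {
    target="$1"
    ptbl="$2"
    # First non-empty line is the label type (e.g. msdos).
    label=$(printf '%s\n' "$ptbl" | awk 'NF { print $1; exit }')
    # Skip the label and geometry lines; quote every remaining partition line.
    parts=$(printf '%s\n' "$ptbl" | awk 'NF { n++; if (n > 2) printf " \"%s\"", $0 }')
    printf 'partedUtil setptbl %s %s%s\n' "$target" "$label" "$parts"
}

healthy_table='msdos
29185 255 63 468862128
1 2048 468857024 251 0'
build_setptbl_cmd /vmfs/devices/disks/naa.50026b7225115b64 "$healthy_table"
```

On the host itself the second argument could come straight from "$(partedUtil getptbl <healthy disk>)". Review the printed command before running it, since it only makes sense when both disks really were partitioned identically.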
