Why should I do this?
- you accidentally deleted a VM or a VMDK from a VMFS-datastore and want to ask an expert whether the VM or VMDK can be recovered.
- after a power-failure your datastore appears to be wiped blank (no VMs and directories are listed anymore)
- after a RAID-rebuild a datastore can no longer be mounted
- the physical disk or RAID array has lost its partition table
A VMFS header-dump may be requested in the forum when you ask for help while troubleshooting corruption of a datastore.
It will also help when important VMDKs are locked or complain about I/O errors.
Is this procedure ideal for best results?
No. If the VMFS-volume is used in a cluster and more than one host has access to it, this procedure is not optimal.
But the results are good enough in most cases, and if the affected datastore is in active production you usually do not want to disconnect all hosts and unmount the datastore first.
Is this procedure safe, and does it affect production?
It is safe: creating a dump only reads from the datastore and does no harm as long as you store the dump in /tmp or on an unaffected datastore.
What is contained in such a header-dump?
A dump as described here contains the hidden .sf files that are usually located in the first 2 GB of a datastore.
Without the data stored in this area a large VMDK file would be just a large pile of fragments.
1. Required: root-access to an ESXi-host via ssh
2. Identify the device that corresponds to the affected datastore:
log in with the root account
cd /dev/disks
ls -lisa | grep -v vml
In many cases you can identify the correct device by the size shown in the listing, typically several hundred GB or several TB.
If several datastores have the same size, use
esxcfg-scsidevs -m
for a more detailed description of the available devices.
To create a dump you need to know the Device and the partNum
So if you figured out that the corrupted datastore appears in /dev/disks as
naa.1234567812345678:1 (just an example)
then Device is naa.1234567812345678 and partNum is 1
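If you copy the entry from /dev/disks as one string, the two values can be separated with plain shell parameter expansion. This is only a minimal sketch, assuming the entry has the form Device:partNum as in the example above; the variable name entry is made up for illustration. Splitting at the last colon also copes with device names that contain colons themselves.

```shell
# Example entry as it appears in /dev/disks (placeholder from the text above)
entry="naa.1234567812345678:1"

# Everything before the last colon is the device
Device="${entry%:*}"
# Everything after the last colon is the partition number
partNum="${entry##*:}"

echo "Device=$Device partNum=$partNum"
```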
For all VMFS-versions the procedure is the same, but note that the required size of the dump differs between versions.
Case A: you have another unaffected datastore that can be used to store the dump.
In this case you can store the dump in this location:
/vmfs/volumes/ANOTHER-UNAFFECTED-DATASTORE/
VMFS 3 and 5
dd if=/dev/disks/Device:partNum bs=1M count=1500 of=/vmfs/volumes/ANOTHER-UNAFFECTED-DATASTORE/vmfs-header-dump.1500
VMFS 6
dd if=/dev/disks/Device:partNum bs=1M count=2000 of=/vmfs/volumes/ANOTHER-UNAFFECTED-DATASTORE/vmfs-header-dump.2000
VMFS 6 used by ESXi 7
dd if=/dev/disks/Device:partNum bs=1M count=2500 of=/vmfs/volumes/ANOTHER-UNAFFECTED-DATASTORE/vmfs-header-dump.2500
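The three Case A commands differ only in the count value. The following sketch picks the count from the VMFS version and prints the resulting dd command without running it, so you can check it before you paste it. The function names and the version labels ("3", "5", "6", "6-esxi7") are assumptions of this sketch, and the sizes are simply the ones listed above; adjust them if a larger dump is requested.

```shell
# Hypothetical helper: map a VMFS version label to the dump size in MB.
# The labels "3", "5", "6" and "6-esxi7" are assumptions of this sketch.
dump_count_mb() {
    case "$1" in
        3|5)      echo 1500 ;;
        6)        echo 2000 ;;
        6-esxi7)  echo 2500 ;;
        *)        echo "unknown VMFS version: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the dd command for Case A.
# $1 = Device:partNum, $2 = VMFS version label, $3 = target directory
print_dump_command() {
    count=$(dump_count_mb "$2") || return 1
    echo "dd if=/dev/disks/$1 bs=1M count=$count of=$3/vmfs-header-dump.$count"
}

print_dump_command "naa.1234567812345678:1" 6 "/vmfs/volumes/ANOTHER-UNAFFECTED-DATASTORE"
```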
Case B: you do NOT have another unaffected datastore and have to use /tmp
Watch the command line carefully: if you see a message "short write ..." the dump is incomplete and you should use the procedure in Case C instead.
VMFS 3 and 5
dd if=/dev/disks/Device:partNum bs=1M count=1500 | gzip -c > /tmp/vmfs-header-dump.1500.gz
VMFS 6
dd if=/dev/disks/Device:partNum bs=1M count=2000 | gzip -c > /tmp/vmfs-header-dump.2000.gz
VMFS 6 used by ESXi 7
dd if=/dev/disks/Device:partNum bs=1M count=2500 | gzip -c > /tmp/vmfs-header-dump.2500.gz
Case C: you do NOT have another unaffected datastore and only very little free space in /tmp
In this case you have to split the dump into several pieces so that each of them fits into /tmp.
After each command use WinSCP or any other SCP-client to download the piece you just created.
Once you have downloaded the piece, clean up /tmp and run the next command.
dd if=/dev/disks/Device:partNum bs=1M count=500 skip=0 | gzip -c > /tmp/split-vmfs-header-dump.0.gz
download /tmp/split-vmfs-header-dump.0.gz and clean up /tmp
dd if=/dev/disks/Device:partNum bs=1M count=500 skip=500 | gzip -c > /tmp/split-vmfs-header-dump.500.gz
download /tmp/split-vmfs-header-dump.500.gz and clean up /tmp
dd if=/dev/disks/Device:partNum bs=1M count=500 skip=1000 | gzip -c > /tmp/split-vmfs-header-dump.1000.gz
download /tmp/split-vmfs-header-dump.1000.gz and clean up /tmp
dd if=/dev/disks/Device:partNum bs=1M count=500 skip=1500 | gzip -c > /tmp/split-vmfs-header-dump.1500.gz
download /tmp/split-vmfs-header-dump.1500.gz and clean up /tmp
dd if=/dev/disks/Device:partNum bs=1M count=500 skip=2000 | gzip -c > /tmp/split-vmfs-header-dump.2000.gz
download /tmp/split-vmfs-header-dump.2000.gz and clean up /tmp
For VMFS 3 and 5 you need the first 3 parts
For VMFS 6 you need the first 4 parts
For VMFS 6 used by ESXi 7 you need all 5 parts
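The Case C commands follow a simple pattern: each part is 500 MB and skip advances by 500 per part. The sketch below only prints the commands for a given number of parts, because each part has to be downloaded and /tmp cleaned up before the next one is run. The function name is made up for illustration.

```shell
# Print the Case C commands for a given number of 500 MB parts.
# $1 = Device:partNum, $2 = number of parts (3, 4 or 5 as listed above)
print_split_commands() {
    i=0
    while [ "$i" -lt "$2" ]; do
        skip=$((i * 500))
        echo "dd if=/dev/disks/$1 bs=1M count=500 skip=$skip | gzip -c > /tmp/split-vmfs-header-dump.$skip.gz"
        i=$((i + 1))
    done
}

print_split_commands "naa.1234567812345678:1" 3
```

Run the printed commands one at a time, downloading and deleting each part before you continue.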
For all VMFS versions, also dump the first MB of the device, which contains the MBR or GPT partition table.
dd if=/dev/disks/Device bs=1M count=1 skip=0 of=/tmp/mbr-gpt.bin
If you had to use gzip while creating the dump, unpack the .gz files on your admin host and verify the size of the dump.
A one-piece dump should have a size of 1500 MB for VMFS 3 or 5, 2000 MB for VMFS 6 and 2500 MB for VMFS 6 / ESXi 7.
All parts of a split dump should have a size of 500 MB.
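With bs=1M every block is 1048576 bytes, so the expected size in bytes is the count multiplied by 1048576. A small sketch for checking an unpacked dump on a Linux or macOS admin host; the function names are invented for this example.

```shell
# Expected size in bytes for a dump of N MB (bs=1M means 1048576 bytes per block)
expected_bytes() {
    echo $(( $1 * 1048576 ))
}

# Compare a file's actual size with the expected size.
# $1 = path to the unpacked dump, $2 = expected size in MB
check_dump_size() {
    actual=$(wc -c < "$1" | tr -d ' ')
    if [ "$actual" -eq "$(expected_bytes "$2")" ]; then
        echo "OK: $1 is $2 MB"
    else
        echo "INCOMPLETE: $1 has $actual bytes, expected $(expected_bytes "$2")"
        return 1
    fi
}

expected_bytes 1500   # prints 1572864000
```

For example, check_dump_size vmfs-header-dump.1500 1500 reports whether a one-piece VMFS 5 dump is complete.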
If you want to send the dump you just collected to someone for analysis, please also create a readme.txt with some basic information:
- short summary of the history of the datastore
- short summary of the events that caused the corruption
- if you accidentally deleted a VMDK, please add its size and guest OS
- your contact information
Then create an empty directory with a good name like "yourName-datastore-name", put the files listed below into it and compress the complete directory with an effective packer such as 7zip (see the note on Rar at the end).
The directory should contain:
VMFS 5 and earlier:
mbr-gpt.bin
vmfs-header-dump.1500 (or 3 files split-vmfs-header-dump.0 to split-vmfs-header-dump.1000)
readme.txt
VMFS 6:
mbr-gpt.bin
vmfs-header-dump.2000 (or 4 files split-vmfs-header-dump.0 to split-vmfs-header-dump.1500)
readme.txt
VMFS 6 / ESXi 7:
mbr-gpt.bin
vmfs-header-dump.2500 (or 5 files split-vmfs-header-dump.0 to split-vmfs-header-dump.2000)
readme.txt
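Before you compress the directory, a quick check can confirm that nothing from the lists above is missing. This is a rough sketch for a Linux or macOS admin host; the function name is hypothetical, and it only checks that the files exist, not that their sizes are right.

```shell
# Check that a package directory contains the pieces listed above.
# $1 = directory to check
check_package() {
    missing=0
    for f in mbr-gpt.bin readme.txt; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Accept either a one-piece dump or the split parts
    if ! ls "$1"/*vmfs-header-dump.* >/dev/null 2>&1; then
        echo "missing: vmfs-header-dump file(s)"
        missing=1
    fi
    return "$missing"
}
```

Calling check_package yourName-datastore-name prints one line per missing item and returns a non-zero status if anything is absent.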
A VMFS header dump compressed with 7zip or Rar is typically between 10 MB and 800 MB in size.
You may want to check whether the dump contains any confidential data that you are not allowed to share.
To evaluate which data is contained in a VMFS header dump download the tool strings.exe from
https://technet.microsoft.com/en-us/sysinternals/strings.aspx
After the download, unzip strings.exe and copy it to the same path that already holds your dump file.
Open a cmd-box and execute
strings.exe dump-file > strings.txt
Search through strings.txt.
The dump contains vmx-files and log-files which may contain client names and other sensitive data.
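If your admin host runs Linux or macOS, a similar check can be approximated with standard tools. The sketch below is not a replacement for strings.exe: it simply turns every non-printable byte into a line break and keeps runs of at least six printable characters. The function name and the default minimum length are arbitrary choices of this sketch.

```shell
# Rough stand-in for strings: list runs of printable characters in a binary file.
# $1 = dump file, $2 = minimum run length (defaults to 6)
printable_runs() {
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | LC_ALL=C grep -E ".{${2:-6},}"
}

# Example: write the extracted text to a file and search it afterwards
# printable_runs dump-file > strings.txt
```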
There is a Knowledge Base article that discusses the same topic, see
https://kb.vmware.com/kb/1020645
Unfortunately that KB is outdated and has not been updated to cover VMFS 6 as used by ESXi 7.
Disclaimer: the procedure for VMFS 6 used by ESXi 7 should work in most cases.
But as ESXi 7 does not write all metadata as a single block to the start of the volume this procedure may not be enough.
In some cases it will be necessary to also add a copy of the hidden file .sbc.sf
Ulli Hankeln
#########################################################################
contact:
skype : sanbarrow
#########################################################################
Important:
Header-dumps for VMFS 6 should be at least 3000 MB in size.
The 2000 MB I recommended previously are just too small too often.
DO NOT USE RAR PLEASE
ALWAYS INCLUDE THE MBR/GPT FILE
Document the command you used to create the dump.