Re: Completely Locked VM after power failure

Be_at_EZ · ‎01-26-2014

My VMware host had a power failure yesterday morning, and one of my VMs seems to be suffering from it.

My setup is fairly simple: an ESXi host with two VMs, laid out like this:

ESXi 5.1 runs from USB, with two separate storage disks. One disk holds one VM (still working), and the other disk holds the other VM and a .locker folder (not working).

When I discovered the crash, I had to remove the VMs from the inventory through the VMware client, as they were shown as "unknown". Then I tried to add both VMs to the inventory again. The first was added successfully, but that is where my problems with the other VM started.

When I tried to add it to the inventory, I didn't have the option to do so, as it was grayed out!

Then I started googling and discovered that I might have a problem with a locked file in the VM directory, and I did find a .lck file. I ran some tests (vmkfstools -D) and found that the VM was locked, but I got different readings for different files, like this:

vmkfstools -D /vmfs/volumes/516ac177-2009a248-6f97-001617628430/mail.XXXXXX.com/mail.XXXXXX.com.vmdk

Lock [type 10c00001 offset 18870272 v 168, hb offset 3981824

gen 115, mode 0, owner 00000000-00000000-0000-000000000000 mtime 1453

num 0 gblnum 0 gblgen 0 gblbrk 0]

Addr <4, 20, 30>, gen 80, links 1, type reg, flags 0, uid 0, gid 0, mode 600

len 533, nb 0 tbz 0, cow 0, newSinceEpoch 0, zla 4305, bs 8192

vmkfstools -D /vmfs/volumes/516ac177-2009a248-6f97-001617628430/mail.XXXXXX.com/mail.XXXXXX.com.vmx

Lock [type 10c00001 offset 18810880 v 279, hb offset 3981312

gen 25, mode 1, owner 522ca7ed-9b3b0640-0fcd-001617628430 mtime 120587

num 0 gblnum 0 gblgen 0 gblbrk 0]

Addr <4, 20, 1>, gen 2, links 1, type reg, flags 0, uid 0, gid 0, mode 100755

len 2899, nb 1 tbz 0, cow 0, newSinceEpoch 1, zla 2, bs 8192

My host's MAC address is identical to 001617628430, the last segment of the owner field in the .vmx output.
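For readers comparing the two outputs, here is a minimal sketch of pulling the relevant fields out of `vmkfstools -D` output. It relies on two points from VMware's lock documentation: the last segment of the `owner` field is the MAC address of the host holding the lock, and mode 0 means no lock while mode 1 means an exclusive lock. The function name `lock_summary` is hypothetical, and the sample input is the .vmx reading quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESXi): reads `vmkfstools -D` output on
# stdin and prints the lock mode plus the last segment of the owner UUID.
lock_summary() {
    awk '
        /owner/ {
            # Walk the whitespace-separated fields of the line holding "owner".
            for (i = 1; i <= NF; i++) {
                if ($i == "mode")  { mode = $(i + 1); sub(/,$/, "", mode) }
                if ($i == "owner") { owner = $(i + 1) }
            }
            # The owner UUID is dash-separated; keep only its last segment.
            n = split(owner, parts, "-")
            print "mode=" mode " owner_mac=" parts[n]
        }
    '
}

# Sample input copied from the .vmx reading in this post.
lock_summary <<'EOF'
Lock [type 10c00001 offset 18810880 v 279, hb offset 3981312
gen 25, mode 1, owner 522ca7ed-9b3b0640-0fcd-001617628430 mtime 120587
num 0 gblnum 0 gblgen 0 gblbrk 0]
EOF
# prints: mode=1 owner_mac=001617628430
```

On the host itself, the output could be piped in directly (`vmkfstools -D /path/to/file | lock_summary`), assuming it goes to standard output as it did here. An all-zero owner with mode 0, as in the .vmdk reading, would indicate that no host currently holds a lock on that file.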

Then I tried to remove the .lck file in the VMware client but was not allowed to (I think the message was "GOT ERROR"). Then I tried to remove the .lck file over SSH and got a different error this time: "Input/output error".

After a long time poking around the internet searching for the golden answer, I realised that I was not able to do anything with any of the files in the VM directory. Every time I tried something over SSH I got errors like "Input/output error" and "Input/output error (327689)", and through another test I also discovered that all the files gave me the error "disk chain is not consistent"!

I have also noticed that I am not able to copy any of the files to another location or download them to my computer.

I have also noticed that there seem to be similar problems with the .locker folder. There I am not able to open (e.g. logs), copy, or download any files! Therefore I also suspect there might be a problem there related to this issue.

I don't suspect a hardware problem with the disk. In fact, I have some old (inactive) VMs on this "problem disk", and I have no problem copying or downloading those files!

As is often the case (for all of us), this will be a critical loss if I can't start this VM again, or at least somehow get my data out of these files, as they hold my entire mail system, including contacts, calendar, etc.

I feel my hands are bound and I'm heading down a dead end.

If you can give me a solution or shed some light, I will be more than pleased 🙂

All the best regards to all of you

Be_a_EZ

Ethan44 · ‎01-26-2014

Hi

Welcome to the communities.

Could you please review the log file at the path below, and the linked KB article?

/var/log/vmkernel.log

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100423...

Take care!

Be_at_EZ · ‎01-26-2014

Hi Ethan44

Am I looking in the right place? I can't seem to find the log file (vmware.log) you're asking for.

Here is what I see in /var/log/:

boot.gz	esxcli.log	smbios.bin	vmkdevmgr.log
configRP.log ipmi	sysboot.log	vmware

slahy · ‎04-17-2014

I had the same issue and managed to resolve it by running the following command against the partition of the drive while the datastore was unmounted from the host (my datastore was a local drive in the host).

(I removed the affected VMs from the inventory first; not sure if this is necessary.)

Unmount the datastore from the host.

SSH into the host as root.

voma -m vmfs -d /vmfs/devices/disks/disk_partition_name

More details can be found in the VMware KB article "Using vSphere On-disk Metadata Analyzer (VOMA) to check VMFS metadata consistency".

Remount the datastore.

Reboot the host.

Re-add the VM to the inventory and power it on.
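The steps above can be sketched as a command sequence. This is a dry run that only prints the commands, so nothing on the host is changed. Only the `voma` line comes from this post. The `esxcli storage filesystem` and `vim-cmd solo/registervm` lines are assumed command-line equivalents of unmounting, remounting, and re-registering, which could just as well be done in the vSphere Client. The function name and all three arguments are hypothetical placeholders.

```shell
#!/bin/sh
# Dry run: print the commands for the steps above without executing them.
# print_voma_steps is a hypothetical helper; its arguments are placeholders.
print_voma_steps() {
    label=$1      # datastore label as shown by `esxcli storage filesystem list`
    partition=$2  # partition device name under /vmfs/devices/disks/
    vmx=$3        # full path to the VM's .vmx file

    # Assumption: CLI equivalent of unmounting the datastore.
    echo "esxcli storage filesystem unmount -l $label"
    # From the post: check VMFS metadata while the datastore is unmounted.
    echo "voma -m vmfs -d /vmfs/devices/disks/$partition"
    # Assumption: CLI equivalent of remounting the datastore.
    echo "esxcli storage filesystem mount -l $label"
    echo "reboot"
    # Assumption: CLI equivalent of re-adding the VM to the inventory
    # (run after the host is back up).
    echo "vim-cmd solo/registervm $vmx"
}

# Example with placeholder values; substitute your own before running anything.
print_voma_steps datastore2 disk_partition_name /vmfs/volumes/datastore2/vm/vm.vmx
```

Printing first makes it easy to review the device name before running anything against the wrong disk.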