We had a power outage and one of our servers running ESXi 5.1.0 went down. When I powered it back up, all the VMs on that server powered on except for one, which is stuck at 95% of the "Power On virtual machine" task. I googled the problem, and the recommended solution is to kill that VM's processes, delete the .vswp files, remove the VM from inventory, and add it back to the inventory. I've tried that several times but it doesn't fix the issue.
To reset the host, I have to do a hard reboot, since ESXi won't reboot while that VM is stuck powering on at 95%. Each time I reboot the server and start up the VM, I get a new matched pair of <vmname>_vmdk.REDO_XXXXX and <vmname>vmdk-delta.REDO_XXXXX files. Every subsequent reboot creates another pair with a unique XXXXX identifier.
vmware.log for this VM is attached. Any ideas on what I should try next? Thanks.
UPDATE: The VM did eventually complete power on. Opening the vmdk took 45 minutes, and that was what stalled the power on. See the time gap between the first log entry below and the subsequent entries:
2018-11-03T03:29:42.578Z| Worker#0| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(2) fid = 134349, extentType = 0
2018-11-03T04:15:01.709Z| Worker#0| I120: DISKLIB-LIB : Opened "/vmfs/volumes/5646881e-ffd5861e-2a92-001999675f0e/Test CentOS 6.5/Test CentOS 6.5-000003.vmdk" (flags 0xa, type vmfs).
2018-11-03T04:15:01.709Z| Worker#0| I120: DISK: Disk '/vmfs/volumes/5646881e-ffd5861e-2a92-001999675f0e/Test CentOS 6.5/Test CentOS 6.5-000003.vmdk' has UUID '60 00 c2 9c 5c 96 e8 88-c8 be b0 fd 51 89 54 15'
2018-11-03T04:15:01.709Z| Worker#0| I120: DISK: OPEN '/vmfs/volumes/5646881e-ffd5861e-2a92-001999675f0e/Test CentOS 6.5/Test CentOS 6.5-000003.vmdk' Geo (133674/255/63) BIOS Geo (0/0/0)
2018-11-03T04:15:01.710Z| vmx| I120: DISK: Opening disks took 2719182 ms.
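For anyone who wants to spot a stall like this without reading the whole file, here is a minimal Python sketch that scans vmware.log lines for large gaps between consecutive timestamps. The function names and the 60-second threshold are my own choices, and it assumes every line of interest starts with a timestamp in the format shown above, followed by a "|".

```python
from datetime import datetime

def parse_ts(line):
    """Return the leading timestamp of a vmware.log line, or None if absent."""
    stamp = line.split("|", 1)[0].strip()
    try:
        return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None

def find_gaps(lines, threshold_s=60.0):
    """Return (gap_seconds, previous_line, next_line) for gaps above the threshold."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # skip continuation lines without a timestamp
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap > threshold_s:
                gaps.append((gap, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps

# Two of the entries quoted above, used as sample input.
sample = [
    "2018-11-03T03:29:42.578Z| Worker#0| I120: DISKLIB-CHAINESX : "
    "ChainESXOpenSubChain:(2) fid = 134349, extentType = 0",
    "2018-11-03T04:15:01.710Z| vmx| I120: DISK: Opening disks took 2719182 ms.",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap / 60:.1f} min gap after: {before[:60]}")
```

To run it against a real log, pass the open file instead of the sample list, e.g. find_gaps(open("vmware.log", errors="replace")). On the two sample lines it reports the roughly 45-minute stall, which matches the "Opening disks took 2719182 ms" entry.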
I rebooted the VM and it now powers up normally without getting stuck. The VM seems to be operating just fine. Apparently it just fixed itself?
FYI, the hardware RAID controller is showing normal status, although it did report loss of cached data when the power was lost; it does not have battery backup. The requested logs are attached in case you want to take a look. Thanks, guys, for looking into this.
Hi,
I can't find anything useful in the vmware.log. The only thing I can see is: Failed to load ~/.vmware/config
But nothing conclusive.
Could you attach the vmkernel.log and vmkwarning.log from the ESXi host that runs the VM?
Also, it seems that your REDO log wasn't saved (because of the power outage).
Before I say anything more, please attach the logs so we can see what's going on.
I would suggest modifying the VM config file, but you would lose some data that way.
This may be a symptom of another problem.
If the host has a hardware RAID controller, check whether the batteries are still good. Without working batteries, a power failure often corrupts the array, which then has to rebuild.
I had the same 95% problem after a power cut: power on wouldn't complete, and vSphere Client reported insufficient resources. Attempts to download the vmdk stalled, and attempts to image the disk stalled with a disk read error. The guest file system checked OK under a working VM, and removing the VM and adding it back to inventory didn't work. I was able to power on by creating a new VM and then adding the disks to it. A 2008 R2 winimage backup completes, but without System State and Bare Metal Recovery; disk errors are reported, but chkdsk doesn't find them. Thanks for your help.