Identifying an ESX host with a process holding a VM lock

One of the most troublesome situations for an ESX administrator is a host holding a lock on a VM, which prevents you from starting that VM on another host. This can occur, for example, after a host failure where the lock was not released correctly.

The following procedure walks through identifying which host is running the process that holds the lock, and then killing that process.

1. Log on to the console of the last known ESX host where the VM was running.

2. Run the following command:

vmkfstools -D /vmfs/volumes/path/to/vm.vmx

3. Run less /var/log/vmkernel and scroll to the bottom. You will see output like the following:

Dec 17 12:00:10 host5 vmkernel: 2:00:11:13.723 cpu6:1038)FS3: 130: <START vmware-6.log>

Dec 17 12:00:10 host5 vmkernel: 2:00:11:13.723 cpu6:1038)Lock [type 10c00001 offset 45823900 v 21, hb offset 6290651

Dec 17 12:00:10 host5 vmkernel: gen 26595, mode 1, owner 58422081-629bc75a-9826-00173b845ca9 mtime 10611865]

Dec 17 12:00:10 host5 vmkernel: 2:00:11:13.723 cpu6:1077)Addr <4, 332,6>, gen 20, links 1, type reg, flags 0x0, uid 0, gid 0, mode 644

Dec 17 12:00:10 host5 vmkernel: 2:00:11:13.723 cpu6:1077)len 23973, nb 1 tbz 0, zla 2, bs 65536

Dec 17 12:00:10 host5 vmkernel: 2:00:11:13.723 cpu6:1077)FS3: 132: <END vmware-6.log>

4. The host holding the lock is identified by the owner field in the output above. Only the last part of the ID is needed; in this example it is 00173b845ca9.
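
Reading the owner suffix out of the log by eye is error-prone, so a small helper may be useful. The following is a minimal sketch, assuming the owner field always has the four-part form shown in the output above; the function name lock_owner_suffix is hypothetical and not part of any VMware tooling.

```shell
#!/bin/sh
# Hypothetical helper: read vmkernel log text on stdin and print the last
# dash-separated segment of the most recent "owner" ID found in it.
# Assumes the owner ID has the four-part form seen in the sample output.
lock_owner_suffix() {
    sed -n 's/.*owner [0-9a-fA-F]*-[0-9a-fA-F]*-[0-9a-fA-F]*-\([0-9a-fA-F]*\).*/\1/p' |
        tail -n 1
}
```

It could be used right after the vmkfstools command, for example: tail -n 20 /var/log/vmkernel | lock_owner_suffix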

5. Run the following command on each ESX host in the cluster to get the last part of that host's system UUID:

esxcfg-info | grep -i 'system uuid' | awk -F '-' '{print $NF}'
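
The comparison against the lock owner can also be scripted instead of done by eye. This is a minimal sketch that reuses the same grep and awk pipeline as the command above; the function name is_lock_owner is hypothetical, and it assumes the last dash-separated field of the System UUID line is the value to compare.

```shell
#!/bin/sh
# Hypothetical helper: read esxcfg-info output on stdin and succeed (exit 0)
# only if the host's System UUID ends with the given lock-owner suffix.
# The comparison is case-insensitive.
is_lock_owner() {
    owner_suffix=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    host_suffix=$(grep -i 'system uuid' | awk -F '-' '{print $NF}' |
        tail -n 1 | tr 'A-Z' 'a-z')
    [ -n "$host_suffix" ] && [ "$host_suffix" = "$owner_suffix" ]
}
```

On each host it could be run as: esxcfg-info | is_lock_owner 00173b845ca9 && echo "this host holds the lock"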

6. When you have located the ESX host whose system UUID matches the lock owner, log on to that host and run the following command:

ps -elf|grep <vmname>

You should see something like:

4 S root 5650 1 0 65 -10 - 737 schedu Dec17 ? 10:44:52 /usr/lib/vmware/bin/vmkload_app /usr/lib/vmware/bin/vmware-vmx -ssched.group=host/user/pool2 -@ pipe=/tmp/vmhsdaemon-0/vmxf6fb22ef3b8a3222;vm=vmxf6fb46ef7b4b3223 /vmfs/volumes/440d24d6-47076d77-a7c3-021c98aed2d/testvm/testvm.vmx0

7. Kill the process with kill -9 <PID>. In the example above, the command is: kill -9 5650

8. The lock on the VM should now be released, and you should be able to power on the VM.
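
Steps 6 and 7 depend on picking the right PID out of the ps listing. The following is a minimal sketch of how that lookup might be scripted, assuming the ps -elf column layout shown above, where the PID is the fourth field. The function name vm_pids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -elf" output on stdin and print the PID
# (fourth column) of every vmware-vmx process whose command line mentions
# the given VM name. The [v] in the pattern keeps this awk process from
# matching its own command line in the ps listing.
vm_pids() {
    vm_name=$1
    awk -v vm="$vm_name" '$0 ~ /[v]mware-vmx/ && index($0, vm) { print $4 }'
}
```

For example, ps -elf | vm_pids testvm would print the PID to check before passing it to kill -9 as in step 7. Printing the PID rather than killing it directly leaves room to confirm that it belongs to the intended VM.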

Comments

These steps work for me.

Version history
Revision #: 1 of 1
Last update: 12-27-2009 01:01 AM
Updated by: