VMware Cloud Community
fusebox
Enthusiast

Not able to find the World ID for an orphaned VM in ESX 3.5 -- lock on the vmdk files

This issue is an offshoot of an existing thread; in short, the problem I am about to describe is quite similar to the one discussed there.

I have a VM whose data files I am not able to delete. I had a similar issue with another VM earlier and was able to fix it with the steps mentioned in the thread above. Now the same issue has come up with another VM, but the catch is that I cannot find the World ID of the VM that I need to kill to release the locks on the vmdk files, so I cannot delete them either from VirtualCenter or directly on the ESX host. I was able to delete the vmx, nvram and VM log files from the datastore, but not the vmdk files, which fail with a "Device or resource busy" error. I am pasting the relevant output for clarity:

[root@nyvpapmf02 nywpapap04-vm-staging]# rm -Rf *

rm: cannot remove `nywpapap04-vm-staging_1-flat.vmdk': Device or resource busy
rm: cannot remove `nywpapap04-vm-staging-3c09b096.vswp': Device or resource busy
rm: cannot remove `nywpapap04-vm-staging-flat.vmdk': Device or resource busy
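As a first check on "Device or resource busy", the service console's /proc can be scanned for console-side processes that still hold the file open. This is only a sketch (a poor man's fuser/lsof): on ESX 3.x a lock on a VMFS file is usually held by a VMkernel world, which will not show up in the console's /proc, so an empty result here just rules out console processes.

```shell
# find_holders: list service-console PIDs that have the given file
# open, by scanning each process's /proc/<pid>/fd symlinks.
# Hypothetical helper for illustration only -- VMkernel worlds that
# hold VMFS locks will NOT appear here.
find_holders() {
    target=$(readlink -f "$1")
    for fd in /proc/[0-9]*/fd/*; do
        # each fd entry is a symlink to the file the process has open
        if [ "$(readlink -f "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#/proc/}
            echo "${pid%%/*}"
        fi
    done | sort -u
}

# usage: find_holders /vmfs/volumes/<uuid>/nywpapap04-vm-staging/nywpapap04-vm-staging-flat.vmdk
```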

# cat /proc/vmware/vm/*/names | grep nywpapap04-vm-staging

vmid=1373 pid=-1 cfgFile="/vmfs/volumes/47e136dc-6ceb5aa0-fbf2-0017a43ba239/nywpapap04-vm-staging/nywpapap04-vm-staging.vmx" uuid="50 0f 4e 91 34 a4 c6 11-8f 0d 89 56 da 31 c0 38" displayName="nywpapap04-vm-staging"
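For anyone scripting this, the names pseudo-file can be parsed mechanically. A small sketch, assuming only the key=value / key="value" layout shown in the line above:

```shell
# parse_names_line: pull "vmid displayName" out of a
# /proc/vmware/vm/*/names line read from stdin. Assumes the
# key=value / key="value" layout shown in the output above.
parse_names_line() {
    sed -n 's/.*vmid=\([0-9]*\).*displayName="\([^"]*\)".*/\1 \2/p'
}

# usage: cat /proc/vmware/vm/*/names | parse_names_line
```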

# ls

alloc  disk  names  userRPC


> The cpu directory is missing here, hence I cannot obtain the World ID / PID to kill with the command "vmkload_app -k 9 <world id>".

Below is the output of the "vm-support -x" command for your reference:

# vm-support -x

VMware ESX Server Support Script 1.29

Available worlds to debug:

vmid=1080 nywpapww03

vmid=1206 nywpapdb01

vmid=1373 nywpapap04-vm-staging

vmid=1404 nywpapww02-vm-staging

vmid=1421 nywpapap03-vm-prod-staging

Then I tried the "vm-support -X" command to dump the hung vmid, in order to create a core dump that I could send to VMware support if needed. Apart from that, below is the output from /var/log/vmkernel for this VM:

# grep nywpapap04-vm-staging vmkernel

Jun 12 12:59:38 nyvpapmf02 vmkernel: 85:00:05:52.413 cpu2:1361)World: vm 1362: 895: Starting world vmm0:nywpapap04-vm-staging with flags 8

Jun 12 12:59:38 nyvpapmf02 vmkernel: 85:00:05:52.413 cpu2:1361)Sched: vm 1362: 5333: adding 'vmm0:nywpapap04-vm-staging': group 'host/user': cpu: shares=-1 min=-1 minLimit=-1 max=-1

Jun 12 12:59:38 nyvpapmf02 vmkernel: 85:00:05:52.465 cpu2:1361)World: vm 1363: 895: Starting world vmm1:nywpapap04-vm-staging with flags 8

Jun 12 12:59:38 nyvpapmf02 vmkernel: 85:00:05:52.466 cpu2:1361)World: vm 1364: 895: Starting world vmm2:nywpapap04-vm-staging with flags 8

Jun 12 12:59:38 nyvpapmf02 vmkernel: 85:00:05:52.466 cpu2:1361)World: vm 1365: 895: Starting world vmm3:nywpapap04-vm-staging with flags 8

Jun 12 13:04:13 nyvpapmf02 vmkernel: 85:00:10:27.363 cpu1:1361)World: vm 1373: 895: Starting world vmm0:nywpapap04-vm-staging with flags 8

Jun 12 13:04:13 nyvpapmf02 vmkernel: 85:00:10:27.364 cpu1:1361)Sched: vm 1373: 5333: adding 'vmm0:nywpapap04-vm-staging': group 'host/user': cpu: shares=-1 min=-1 minLimit=-1 max=-1

Jun 12 13:04:13 nyvpapmf02 vmkernel: 85:00:10:27.398 cpu2:1361)World: vm 1374: 895: Starting world vmm1:nywpapap04-vm-staging with flags 8

Jun 12 13:04:13 nyvpapmf02 vmkernel: 85:00:10:27.399 cpu2:1361)World: vm 1375: 895: Starting world vmm2:nywpapap04-vm-staging with flags 8

Jun 12 13:04:13 nyvpapmf02 vmkernel: 85:00:10:27.399 cpu2:1361)World: vm 1376: 895: Starting world vmm3:nywpapap04-vm-staging with flags 8

Jun 12 13:46:16 nyvpapmf02 vmkernel: 85:00:52:30.642 cpu0:1361)UserDump: 1410: Dumping cartel 1361 (from world 1361) to file /vmfs/volumes/47e136dc-6ceb5aa0-fbf2-0017a43ba239/nywpapap04-vm-staging/vmware-vmx-zdump.000 ...

Jun 12 13:48:16 nyvpapmf02 vmkernel: 85:00:54:30.252 cpu3:1361)UserDump: 1410: Dumping cartel 1361 (from world 1361) to file /vmfs/volumes/47e136dc-6ceb5aa0-fbf2-0017a43ba239/nywpapap04-vm-staging/vmware-vmx-zdump.001 ...
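Since each power-on spawns a set of vmm0..vmmN worlds, the log above can be mined for candidate world IDs. A small sketch that matches the "World: vm <id>: ...: Starting world vmmN:<name>" lines shown above (the most recent group of IDs would be the candidates still running):

```shell
# vm_world_ids: pull the world IDs that the vmkernel log reports when
# starting a VM's vmm0..vmmN worlds. Reads the log on stdin; the
# trailing space in the grep pattern anchors the exact VM name.
vm_world_ids() {
    vmname=$1
    grep "Starting world vmm[0-9]*:${vmname} " |
        sed -n 's/.*World: vm \([0-9]*\):.*/\1/p'
}

# usage: vm_world_ids nywpapap04-vm-staging < /var/log/vmkernel
```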

Then I go to the /proc/vmware/vm/1361/cpu directory and try to find out the world ID there; below is the output:

[root@nyvpapmf02 cpu]# pwd

/proc/vmware/vm/1361/cpu

# ls

run-state-histo status wait-state-histo wait-stats

# more status

vcpu vm   type name       uptime   status costatus usedsec syssec wait waitsec  idlesec readysec cpu affinity htsharing min max units shares group   emin extrasec
1361 1361 U    vmware-vmx 8989.681 WAIT   NONE     8.393   0.146  UJN  8978.596 0.000   2.739    2   0,1,2,3  any       0   100 pct   1000   vm.1361 100  3.254

# /usr/lib/vmware/bin/vmkload_app -k 9 1361

#
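Since killing 1361 (the vmware-vmx cartel) above had no visible effect, one option would be to try each of the vmm worlds from the vmkernel log in turn. A dry-run sketch, with the assumption on my part that the stale worlds are the vmm worlds 1373-1376 grep'd from the log above:

```shell
# kill_stale_worlds: dry-run sketch that only PRINTS each vmkload_app
# invocation (drop the echo to actually send the kills). The world IDs
# 1373-1376 are an assumption, taken from the vmkernel log grep above.
kill_stale_worlds() {
    for wid in 1373 1374 1375 1376; do
        echo /usr/lib/vmware/bin/vmkload_app -k 9 "$wid"
    done
}

# usage: kill_stale_worlds   # then retry the rm once the locks drop
```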

Any idea how to determine the stale world ID and then remove these files?

1 Reply
fusebox
Enthusiast

It looks like the only solution to such a problem is an ESX server reboot. I found this info in the KB Home.
