VMware Cloud Community
larstr
Champion

Device or resource busy

A recently upgraded ESX setup (3 hosts) had a SAN controller failover. This caused many VMs to die and left their files locked. It's not only the disk files that are locked for these VMs, but also the vmx files, config files and everything else. Other VMs on the same LUN (and hosts) are working fine.

# vmware -v

VMware ESX 4.0.0 build-164009

# more vmware.log

vmware.log: Device or resource busy

# cat notes.vmx

cat: notes.vmx: Device or resource busy

ls works fine, and I can even create a new file in this directory and then edit, read and delete it without problems. Unregistering and re-registering the VM is not possible since even the vmx file is not readable.

A power outage also caused everything to power down some time after this, so there are no hung processes.

I've read a couple of postings (in the 3.5 forum, and I wonder why I can't find them anymore) suggesting that one should connect a single ESX 3.0.2 server to the LUN to fix this.

Any other hints? I will try the 3.0.2 route now.

Lars

4 Replies
Banan
Contributor

Did you ever find out what caused this, or manage to fix it? We have the exact same scenario and I haven't been able to solve it yet. We're running the latest ESX 4.x, and all VMs have the latest VMware Tools and have been upgraded to 4.x virtual hardware. We never had this problem on 3.5.x; it appeared with ESX 4.x. I've tried a number of possible solutions after searching Google and the communities here, but nothing has solved the issue so far.

root@vmware1 ~# more "/vmfs/volumes/47c127d1-f682b3f0-b349-001c23c16715/Mail1/Mail1.vmx"

#!/usr/bin/vmware

...

root@vmware2 ~# cat "/vmfs/volumes/47c127d1-f682b3f0-b349-001c23c16715/Mail1/Mail1.vmx"

cat: /vmfs/volumes/47c127d1-f682b3f0-b349-001c23c16715/Mail1/Mail1.vmx: Device or resource busy

So only the ESX host the VM is currently running on can cat its log files, vmx or anything else; every other host gets "Device or resource busy". VMotion is working perfectly and no host is having SAN issues. All our LUNs are available on each and every host. It's just not possible to access a VM's files from any host other than the one it's actually running on.

If you have anything on this issue, please respond :). Thanks a lot!

More info:

# grep "#" -B 1 /vmfs/volumes/SAN_System08/Subversion/Subversion.vmx

#!/usr/bin/vmware

# grep "#" -B 1 /vmfs/volumes/SAN_System08/Subversion/Subversion.vmx

grep: /vmfs/volumes/SAN_System08/Subversion/Subversion.vmx: Device or resource busy

# tail /var/log/vmkernel

Sep 10 13:21:25 vmware4 vmkernel: 5:15:33:44.339 cpu2:4217)MigrateNet: vm 4217: 1130: Accepted connection from <172.20.0.89>

Sep 10 13:21:25 vmware4 vmkernel: 5:15:33:44.339 cpu2:4217)MigrateNet: vm 4217: 1144: dataSocket 0x4100b80229c0 send buffer size is 263536

Sep 10 13:21:34 vmware4 vmkernel: 5:15:33:53.481 cpu1:5094)VMotion: 2606: 1252581680483893 S: Stopping pre-copy: only 1086 pages were modified, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~133.043 MB/s)

Sep 10 13:21:34 vmware4 vmkernel: 5:15:33:53.494 cpu2:5093)VSCSI: 5850: handle 8294(vscsi0:0):Destroying Device for world 5094 (pendCom 0)

Sep 10 13:21:34 vmware4 vmkernel: 5:15:33:53.528 cpu3:5093)VSCSI: 5850: handle 8295(vscsi0:1):Destroying Device for world 5094 (pendCom 0)

Sep 10 13:21:35 vmware4 vmkernel: 5:15:33:54.153 cpu0:5173)VMotionSend: 2909: 1252581680483893 S: Sent all modified pages to destination (network bandwidth ~114.176 MB/s)

Sep 10 13:51:51 vmware4 vmkernel: 5:16:04:09.953 cpu3:4105)FS3: 2762: Checking liveness of lock holders [type 10c00001 offset 65437696 v 566, hb offset 3723264

Sep 10 13:51:51 vmware4 vmkernel: gen 821, mode 1, owner 4a9698cf-23fdbb28-ef6f-001c23c16715 mtime 81114]on volume 'SAN_System08'.

Sep 10 13:51:55 vmware4 vmkernel: 5:16:04:13.955 cpu3:4105)FS3: 2854: Lock [type 10c00001 offset 65437696 v 566, hb offset 3723264

Sep 10 13:51:55 vmware4 vmkernel: gen 821, mode 1, owner 4a9698cf-23fdbb28-ef6f-001c23c16715 mtime 81114] is not free on volume 'SAN_System08'

#
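The lock messages above name an owner, and that field can help identify which host holds the lock. The following is a minimal sketch, not an official tool: it assumes that the last 12 hex digits of the owner value are the MAC address of a NIC on the lock-holding host, and the function name `owner_mac` is made up for illustration.

```shell
#!/bin/sh
# Read vmkernel log text on stdin and print the last segment of each
# distinct lock owner value, formatted as a MAC address.
# Assumption: the final 12 hex digits of the owner value are the MAC
# address of a NIC on the host that holds the lock.
owner_mac() {
    sed -n 's/.*owner [0-9a-f]*-[0-9a-f]*-[0-9a-f]*-\([0-9a-f]\{12\}\).*/\1/p' |
        sort -u |
        sed 's/\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)/\1:\2:\3:\4:\5:\6/'
}

# Example using the continuation line from the log above.
echo "gen 821, mode 1, owner 4a9698cf-23fdbb28-ef6f-001c23c16715 mtime 81114] is not free on volume 'SAN_System08'" |
    owner_mac
```

On a live host the input would be something like `grep "owner" /var/log/vmkernel | owner_mac`. The resulting address can then be compared with the NIC list on each host (for example the output of `esxcfg-nics -l`) to see which one matches.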

bagers32
Contributor

Hello,

I have the same problem: I cannot read the .vmx on any ESX host except the one where the VM is running.

Banan
Contributor

Got reports back from VMware eventually, and they said this is expected behaviour in ESX 4.0. :(

nmisabcn
Contributor

Hi, I opened a case with Support and they told me that this is the expected behaviour in ESX 4.0, as Banan said in this post.

The solution is to open or read these files from the host in the cluster that is running the virtual machine.

For example:

# vmware-cmd -l
/vmfs/volumes/46ebd8ee-db5cfe51-936f-0010181c3955/SERVER1/SERVER1.vmx
/vmfs/volumes/45ef19dd-2770078f-05c9-0010181c3955/SERVER2/SERVER2.vmx
/vmfs/volumes/48256a14-3410341c-7d60-0010181c399f/SERVER3/SERVER3.vmx

This host (ESX01) can only access the files of the servers in this list: SERVER1, SERVER2 and SERVER3.
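To check quickly whether a given host is the right one before trying to read a VM's files, the listing can be filtered by VM name. This is only a sketch: the helper name `find_vmx` is hypothetical, and it assumes the VM's directory and .vmx file share the VM's name, as they do in the listing above.

```shell
#!/bin/sh
# Filter a `vmware-cmd -l` listing (read from stdin) down to the .vmx
# path of the named VM. Prints nothing if the VM is not in the list.
# Assumption: the directory and the .vmx file are both named after the VM.
find_vmx() {
    grep -F "/$1/$1.vmx"
}

# On a real host this would be: vmware-cmd -l | find_vmx SERVER2
# Here the listing from the post above is fed in directly.
printf '%s\n' \
    /vmfs/volumes/46ebd8ee-db5cfe51-936f-0010181c3955/SERVER1/SERVER1.vmx \
    /vmfs/volumes/45ef19dd-2770078f-05c9-0010181c3955/SERVER2/SERVER2.vmx \
    /vmfs/volumes/48256a14-3410341c-7d60-0010181c399f/SERVER3/SERVER3.vmx |
    find_vmx SERVER2
```

If the function prints a path, the VM is in this host's list and its files should be readable from here; if it prints nothing, try the next host in the cluster.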

I hope that this information can help you.
