Hi there,
last week one of our datastores (an internal SATA HDD) on our ESXi server suddenly disappeared and could no longer be accessed. The machines on that datastore were still shown as Powered On in the vSphere Client but were not responding. We then rebooted the server and the datastore was back to normal. But now the machines residing on that datastore were marked as "Unknown" and
couldn't be started. We tried to add them to the inventory again, but that failed because the files couldn't be accessed.
We therefore ran touch on the files of one of the machines, which returns "Invalid Argument".
vmkfstools -D <path to one of the locked files>
gives the following response:
Lock [type 10c00001 offset 6375424 v 245, hb offset 3776512
gen 233, mode 1, owner 4dd7cd9e-a5826779-5fd1-003048c30e8c mtime 848]
Addr <4, 5, 17>, gen 244, links 1, type reg, flags 0, uid 0, gid 0, mode 644
len 303353, nb 1 tbz 0, cow 0, zla 1, bs 4194304
We have only one ESXi host, which is why we think that this host is the lock holder (the MAC address at the end of the owner field also matches).
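For anyone who wants to do the same check, here is a minimal sketch in Python that pulls the lock mode and the owner field out of the output above and prints the last owner segment as a MAC address. The function name parse_lock and the regular expressions are my own assumptions based on the output pasted in this post, not an official VMware format description.

```python
import re

# The two lock lines exactly as pasted in this post.
SAMPLE = """Lock [type 10c00001 offset 6375424 v 245, hb offset 3776512
gen 233, mode 1, owner 4dd7cd9e-a5826779-5fd1-003048c30e8c mtime 848]"""


def parse_lock(text):
    """Return (mode, owner, mac) from `vmkfstools -D` lock output.

    Hypothetical helper: the patterns are assumptions derived from
    the sample output in this thread.
    """
    mode = re.search(r"\bmode (\d+), owner", text)
    owner = re.search(r"\bowner ([0-9a-f]+(?:-[0-9a-f]+){3})", text)
    if mode is None or owner is None:
        raise ValueError("no lock record found")
    # The last dash-separated segment of the owner is compared to the MAC.
    last = owner.group(1).rsplit("-", 1)[1]
    if len(last) != 12:
        raise ValueError("unexpected owner format")
    mac = ":".join(last[i:i + 2] for i in range(0, 12, 2))
    return int(mode.group(1)), owner.group(1), mac


if __name__ == "__main__":
    mode, owner, mac = parse_lock(SAMPLE)
    print("mode:", mode)
    print("owner:", owner)
    print("mac:", mac)
```

The printed MAC can then be compared by eye with the MAC addresses of the host's network adapters.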
We tried solving the issue by following the VMware KB Article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10051
Unfortunately that didn't help: because we had already rebooted the server, vm-support -x yields "There are no Worlds to Debug".
We then tried the following:
vmkfstools -L release / lunreset /vmfs/devices/disks/....
Because we didn't really know what to input there, we tried some of the disks in the devices/disks Directory.
For the datastore, the vSphere Client shows: Local AMCC Disk (t10.AMCC9QJ0ZCFYD87EFE007FC4):1
There is a matching file under devices/disks, but a release / lunreset on it didn't release the lock.
The vSphere Client also shows the following (under Path): vmhba1:C0:T1:L0
But we couldn't find such a file under devices/disks.
The last thing we tried was vmkfstools -B, but that didn't help either.
vmkfstools -D shows the same message as above.
Is there another way to release the lock?
Our current workaround is to boot an Ubuntu live CD and use the vmfs-tools package to copy the files from the datastore.
Unfortunately the VMFS driver currently only supports reading, so it is of no use when it comes to deleting the inaccessible files/machines...
Thanks in advance for replies, asgaroth
Well, I hope you didn't have any critical data on the VMFS.
I would do a resignature of the VMFS volume, assuming the host sees the disk and you haven't modified it.
I have a blog post from '08 talking about my experience.
http://itblog.rogerlund.net/2008/11/datastore-missing-but-disk-array.html
Here is a good KB: Resignaturing VMFS3 volumes on VMware ESX 3.x via the VMware Infrastructure Client
Give that a shot, and let us know.
Roger Lund
Minnesota VMUG leader
Blogger
VMware and IT Evangelist
My Blog: http://itblog.rogerlund.net & http://www.vbrainstorm.com
1. If your vSphere Client is showing the disk as vmhba1:C0:T1:L0, then you can use "esxcfg-scsidevs -c" to get the device "/vmfs/devices/disks/mpx.vmhba1:C0:T0:L0"
2. Then you can use "vmkfstools -L lunreset /vmfs/devices/disks/mpx.vmhba1:C0:T0:L0" to reset your LUN
That should bring your datastore back online, and you should be able to power on your VMs
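The names above follow the adapter:channel:target:LUN pattern, so they can be compared field by field. Here is a minimal sketch in Python that does this. The names RuntimeName and parse_runtime_name are hypothetical, and stripping a leading "mpx." is an assumption based only on the device path quoted in step 1.

```python
import re
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: str   # e.g. "vmhba1"
    channel: int   # the C field
    target: int    # the T field
    lun: int       # the L field


# Assumed pattern: optional "mpx." prefix, then vmhbaN:C#:T#:L#.
_PATTERN = re.compile(r"^(?:mpx\.)?(vmhba\d+):C(\d+):T(\d+):L(\d+)$")


def parse_runtime_name(name):
    """Split a name like 'vmhba1:C0:T1:L0' into its four fields."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError("not a runtime name: %r" % name)
    adapter, channel, target, lun = match.groups()
    return RuntimeName(adapter, int(channel), int(target), int(lun))


if __name__ == "__main__":
    shown_in_client = parse_runtime_name("vmhba1:C0:T1:L0")
    used_in_step_1 = parse_runtime_name("mpx.vmhba1:C0:T0:L0")
    print(shown_in_client)
    print(used_in_step_1)
    print("same device fields:", shown_in_client == used_in_step_1)
```

Note that the name shown in the vSphere Client (T1) and the device path in step 1 (T0) differ in the target field, so it is worth checking which of the two actually exists on the host before running a lunreset.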
Hey there,
thanks for your replies and sorry that I didn't reply earlier.
We did indeed have some fairly critical data on that datastore, because our backup strategy wasn't fully deployed yet.
But we were able to recover the machines with the live CD and the VMFS driver.
The problem still exists: we don't need the machines anymore (because we now have a working copy),
but we can't delete them because of the locks, which still exist.
The fix for now will be formatting the datastore, but that is not an acceptable solution if this problem recurs.
So we're still searching.
@rlund
About the resignaturing... I didn't understand all of it, but the KB article indicates that it only applies to ESXi 3.x
and not to ESXi 4, is that right?
Also, we only have the vSphere Client and not the Infrastructure Client.
@LucyLi
As we stated in the first post, we don't have that vmhba1 device file. We also already tried the lunreset on most of the files in /vmfs/devices/disks.
As I understand it, what you are suggesting only works on ESXi 3.x, because the way the disks are named has changed since then.
I think the thing I saw named "vmhba1" is the RAID controller that the local disks are plugged into.
Cheers, asgaroth