My ESXi 3.5 machine normally runs 8-10 VMs (Win2k3 and WinXP). At the moment, 5 of them will not power on. They seem to start and then fail with "Could not power on VM: no swap file". I had a look with the datastore browser. It's a small installation, so the vswp files ought to be in the same directory as the vmx file (I did not intentionally put them anywhere else). Of course I don't see a vswp file there because the machine is not running. I don't know enough about the vmx file structure to tell whether anything is wrong in the specifications. I have downloaded one of the vmx files and attached it here. Please either tell me what to change in that vmx file, or suggest another approach to get the machines to start.
Regards - Glen
Have you changed anything on the host that might cause an error? Or to the VMs?
The last thing I changed was a day earlier. I increased the memory allocated to one of the VMs from 256 MB to 512 MB. The machine that I made the change to is running OK.
The host machine has 8 GB and the total memory allocated to all the VMs is 5256 MB. The datastore is 1.81 TB and the free space is 1.08 TB.
Regards - Glen
Take a look at this post. It might help.
I selected some potential solutions from that post - sorry that I hadn't located it in my search before I posted my question.
There is only my one host machine involved in this. I am able to reboot that host server whenever I want so I did that. Now I have only 3 WinXP machines that power on. All other machines (Win2k3 and WinXP) have the "no swap file" error when I try to start them. So terminating all the processes and restarting did not help any. I am under the impression that a reboot would have cleared all process locks or file locks in the system, right?
Then I selected one of the servers and removed it from the inventory, then added it back in to the inventory. I did not move, change or rename any files in the data store. That did not resolve the issue either.
There are no VMs identified as orphaned, so no help there.
The post suggested migrating the troubled VM, so I did that, but with 1 host and 1 datastore that may not do much. There is a choice of moving the files or not, and I tried both ways. That did not resolve the issue either.
At that point I ran out of things from the post to try. But it was a couple of good attempts. Any more ideas?
Regards - Glen
I have attached the vmware.log for the most recent attempt to power on one of the servers. Nothing in there jumps out at me, and there's not too much at the end of it to identify what the error might have been. Is there a diagnostic/verbose mode that will put more details in the log file?
Regards - Glen
That error usually indicates that the VM is still running in memory. Try restarting the management agents to clear it out of memory.
Here is a recipe that can help you:
Dan - I rebooted the one and only host machine so that ought to clear up any running VMs, right?
Regards - Glen
Paul - thanks. That article talks about a case with multiple host machines in a HA cluster. I only have one host machine, and I rebooted it, so there should be no processes running that own any files or locks.
Regards - Glen
Have you tried to "recreate" one of these faulty VMs? That is, add a new VM and select "use existing disks"?
Just to make sure that a fresh home dir and .vmx file are created. (The MAC address is also recreated with a new value, but I'm sure you know this.)
/Rubeck
That was interesting! Yes, I created a new VM (named DC3) and "used existing disk" and it powered on and ran just fine. I think I need to fiddle with the registration in the domain and check some access permission associated with the machine name, but it basically looked good. The new directory has in it just the vmx, vmxf, vmsd, nvram and log files and the vswp was created in that new directory and disappeared when it was shutdown. The vmx file contains these 2 lines (and more):
scsi0:0.fileName = "/vmfs/volumes/4ab6c4ff-de1e6ea7-d316-0024e8734364/DC2/DC2.vmdk"
(this points to the disk over on one of the VMs that won't power on)
sched.swap.derivedName = "/vmfs/volumes/4ab6c4ff-de1e6ea7-d316-0024e8734364/DC3/DC3-ad447193.vswp"
(which tells it where to create the new swap file when it runs)
Looks fine. Now question #1 is - what's the best way to use this technique to "recreate" all of my damaged VMs?
And question #2 is - why did this happen in the first place and what do I do to prevent it from happening again?
Regards - Glen
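To compare the two .vmx entries quoted above across several VMs, a small parser can help. This is only a sketch: it assumes the .vmx file is plain `key = "value"` text (as the quoted lines suggest), and `parse_vmx` is a made-up helper name, not a VMware tool.

```python
import posixpath


def parse_vmx(text):
    """Parse 'key = "value"' lines from .vmx-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comment lines and anything that is not a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


# The two lines quoted from the new DC3 .vmx file.
sample = '''
scsi0:0.fileName = "/vmfs/volumes/4ab6c4ff-de1e6ea7-d316-0024e8734364/DC2/DC2.vmdk"
sched.swap.derivedName = "/vmfs/volumes/4ab6c4ff-de1e6ea7-d316-0024e8734364/DC3/DC3-ad447193.vswp"
'''

settings = parse_vmx(sample)
disk_dir = posixpath.dirname(settings["scsi0:0.fileName"])
swap_dir = posixpath.dirname(settings["sched.swap.derivedName"])

print("disk directory:", disk_dir)
print("swap directory:", swap_dir)
print("disk lives outside the VM's own directory:", disk_dir != swap_dir)
```

Running the same check against the .vmx of a VM that will not power on would show whether its swap path points somewhere unexpected.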
It seems there may be a faulty entry in the .vmx file of the faulty VM. Does deleting the "sched.swap.derivedName" entry from the file help?
If so, you could do this for all of the faulty VMs and do a shut down / power on when wanted, which would force the .vmx file to be re-read.
In theory the issue might happen if the hostd process died while altering .vmx files (any hostd dumps in your /var/core?), but I truly don't know.
/Rubeck
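For anyone who wants to script the suggestion above rather than edit each file by hand, here is a minimal sketch. It assumes the .vmx file has been downloaded to a workstation, that a backup copy is kept, and that the VM is powered off. The function name `drop_vmx_key` is made up for illustration, and the `memsize` line in the sample is only there as a second entry to show that other settings are left alone.

```python
def drop_vmx_key(text, key):
    """Return .vmx-style text with every 'key = ...' line for `key` removed."""
    kept = []
    for line in text.splitlines(keepends=True):
        # The setting name is whatever sits to the left of the first '='.
        name = line.partition("=")[0].strip()
        if name == key:
            continue  # skip the entry we want to delete
        kept.append(line)
    return "".join(kept)


# Illustrative sample; the swap path is the one quoted earlier in the thread.
sample = (
    'memsize = "512"\n'
    'sched.swap.derivedName = "/vmfs/volumes/4ab6c4ff-de1e6ea7-d316-0024e8734364/DC3/DC3-ad447193.vswp"\n'
)

cleaned = drop_vmx_key(sample, "sched.swap.derivedName")
print(cleaned)
```

The cleaned text would then be uploaded back over the original file through the datastore browser. Whether this actually clears the error is a separate question.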
I deleted the sched.swap.derivedName line from the vmx file, tried to power it on, and got the same error message. Checking the vmx file now ... no derivedName line was added back to it. If I edit the settings for that VM and check Options, it says swapfile location = use default settings.
Regards - Glen
Any chance you could provide a "ls -lh" from the faulty VM volume and dir?
/Rubeck
I can do anything! Sorry to be ignorant, but you're going to have to lead me by the hand a little. I haven't had to get into the guts of my ESX before and I haven't touched a Unix of any sort in over 10 years. First step is to get an open console session into the Unix host, I guess. I have admin credentials for it, so that's no problem. I have ESX 3.5i, so how do I get that session going?
Regards - Glen
No prob, Glen..
Do you have access to the host using SSH? (unsupported)
If so print the output from:
ls -lh /vmfs/volumes/
/Rubeck
Just downloaded a copy of PuTTY. Tried to connect to the Host over SSH and got "Network Error: connection refused". It hadn't even asked me for an ID and password yet. Sorry to step back to something so fundamental.
Regards - Glen
In the meantime, in case this is enough to answer your question, here's an HTTP view of the directory:
Index of Web on datastore DiskPool in datacenter ha-datacenter
Name   Last modified   Size
Parent Directory -
Web-000001-delta.vmdk 12-Feb-2010 03:14 2113947648
Web-000001.vmdk 12-Feb-2010 02:41 241
Web-93b0ea0b.vswp 31-Dec-1969 23:59 -1
Web-Snapshot1.vmsn 12-Oct-2009 17:54 273786746
Web-flat.vmdk 12-Oct-2009 17:53 8589934592
Web.nvram 12-Feb-2010 03:14 8684
Web.vmdk 12-Oct-2009 03:53 396
Web.vmsd 12-Oct-2009 17:54 482
Web.vmx 12-Feb-2010 16:13 2447
Web.vmxf 12-Feb-2010 03:19 258
Web_1-000001-delta.vmdk 12-Feb-2010 03:14 2583709696
Web_1-000001.vmdk 12-Feb-2010 02:41 245
Web_1-flat.vmdk 12-Oct-2009 17:52 8589934592
Web_1.vmdk 12-Oct-2009 03:53 398
vmware-45.log 12-Feb-2010 04:00 16074
vmware-46.log 12-Feb-2010 04:48 16092
vmware-47.log 12-Feb-2010 05:25 16090
vmware-48.log 12-Feb-2010 05:33 16090
vmware-49.log 12-Feb-2010 05:34 16090
vmware-50.log 12-Feb-2010 05:36 16090
vmware.log 12-Feb-2010 16:13 15963
Regards - Glen
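A listing like the one above can be scanned for entries with odd metadata. The sketch below assumes each row is `name date time size` in the format shown, and that month abbreviations are parsed in an English locale. `parse_listing` and `suspicious` are made-up helper names, and only a few rows are copied in to keep it short. It flags anything with a negative size or a date before 1970, which is consistent with metadata that could not be read. Whether that is related to the power-on error is not established here.

```python
from datetime import datetime

# A few rows copied from the directory listing above.
listing = """\
Web-000001-delta.vmdk 12-Feb-2010 03:14 2113947648
Web-93b0ea0b.vswp 31-Dec-1969 23:59 -1
Web-flat.vmdk 12-Oct-2009 17:53 8589934592
Web.vmx 12-Feb-2010 16:13 2447
"""


def parse_listing(text):
    """Turn 'name date time size' rows into (name, modified, size) tuples."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # header rows, blank lines, etc.
        name, date, time, size = parts
        try:
            modified = datetime.strptime(f"{date} {time}", "%d-%b-%Y %H:%M")
            size = int(size)
        except ValueError:
            continue  # row does not match the expected format
        entries.append((name, modified, size))
    return entries


def suspicious(entries):
    """Names whose size is negative or whose date predates 1970."""
    return [name for name, modified, size in entries
            if size < 0 or modified.year < 1970]


flagged = suspicious(parse_listing(listing))
print(flagged)  # ['Web-93b0ea0b.vswp']
```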