VMware Cloud Community
dimsys
Contributor

VMs not powering on after ESX restart - possible license server issue?

Hi all,

Our co-lo recently experienced a power outage that outlasted our UPSes. I have two ESX 3.0.1 servers, each with direct-attached storage.

When ESX came back up after power was restored, none of the VMs came up. If I recall correctly, they had an ! on the icon in VC.

My VC and license server run on a small Windows VM on one of the ESX boxes. I connected the VI Client to that ESX box directly and restarted that VM. Then I was able to log in to VC and restart the rest of the VMs.

Given my config (not VMotion-capable right now), would I be better off putting my license server/VC on a physical box? I have a Windows machine down there that I use as a utility machine for backups, and I was thinking it might work. It's a pretty small environment (only about 5 VMs on each ESX box).

ALSO: I was doing maintenance on one of the Red Hat (RHAS 4 Update 3) VMs the other night. It was having time drift issues, so I made some changes and restarted the VM. When I tried to power it back on, it said "The operation is not allowed in the current state". I tried a couple more times with the same error and began getting that slightly warm panic feeling ;) Thanks to a message here, I ssh'd to the ESX box that machine was on, issued 'service mgmt-vmware restart', then tried to power it on again, and it worked. I am wondering if this could also be related to some kind of lack of communication with the license server?

Thanks for any feedback...

2 Replies
enorthcraft
Enthusiast

Hey dimsys,

Sounds like your co-lo needs to make some investments! If you need a data center with some redundancy, give me a ring :)

Seriously though, you are in a bit of a pickle hosting your VC as a VM. I realize that it's supported by VMware, but it really needs to be "on the outside looking in". Budgets don't always allow that though, so at least make sure that your VC VM is set to start automatically when ESX powers up. By default, VMs are not set to power on when the host restarts.

I would absolutely recommend moving VC to your standalone box. With your small environment, you shouldn't need much of a box. Make sure that you use the MSDE local to the box or another dedicated standalone SQL server for the db. You don't want to point VC back into your VM environment for db access.

The license server is your call. It is easier initially to install the licenses on each machine, but the license server makes the overall management easier. They've made some changes to the license server process that may make it easier to set up than when I had to do it at the first release.

service mgmt-vmware restart is your friend. Learn to love it. I find myself using it more than I think I should have to. I imagine the fact that your VC is hosted as a VM might be causing some of your problem. Time skew can really affect an OS, although I've never seen it cause this particular issue. I would recommend that you make NTP configuration part of your standard build process. VMware Tools will also help keep the clock synced.
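As a rough illustration of baking NTP into a build, something like the following could go into a guest setup script. This is only a sketch under assumptions: it expects a Red Hat-style guest with the ntp package installed and servers already listed in /etc/ntp.conf, and the `DRY_RUN` switch and the `enable_ntp` function name are made up for the example.

```shell
#!/bin/sh
# Sketch: enable ntpd at boot on a Red Hat-style guest and show its peers.
# Assumptions: ntp package installed, /etc/ntp.conf already lists servers.
# DRY_RUN, run and enable_ntp are illustrative names, not part of any tool.

DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_ntp() {
    run chkconfig ntpd on       # start ntpd on every boot
    run service ntpd restart    # pick up the current /etc/ntp.conf
    run ntpq -p                 # list the peers ntpd is talking to
}
```

Calling `enable_ntp` with the default `DRY_RUN=1` only prints the commands; run it as root with `DRY_RUN=0` to actually apply them.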

Your issue shouldn't have anything to do with a license server. Hosts can "survive" for several days (I think it is seven) without communicating with a license server.

Hope this helps.

Eric Northcraft enorthcraft at gmail.com
jonhutchings
Hot Shot

To address your second question first: needing to restart your management agent doesn't usually indicate a problem contacting the license server; it's more likely an issue with the link between VC and ESX. To be honest, that link can be a bit flaky at times anyway, and a search of these forums will reveal that restarting the management agent "fixes" all sorts of issues. In general, the issues come down to the VC server, the client, and the ESX server falling out of sync. For example, one issue you may see is that operations complete on the ESX server but hang in VC. Often a restart of the management agent is the only way to clear this.
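For reference, the restart-and-retry sequence from the original post looks roughly like this from the ESX service console. Treat it as a sketch rather than an official procedure: the 30-second wait is a guess, the .vmx path in the usage line is hypothetical, and `DRY_RUN` and `retry_power_on` are names invented here so the commands can be reviewed before anything runs.

```shell
#!/bin/sh
# Sketch: restart the ESX management agent, then retry a VM power-on.
# Assumption: run as root on the ESX 3.x service console.
# DRY_RUN, run and retry_power_on are illustrative names, not VMware tooling.

DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

retry_power_on() {
    vmx="$1"   # full path to the VM's .vmx file
    if [ -z "$vmx" ]; then
        echo "usage: retry_power_on /path/to/vm.vmx" >&2
        return 2
    fi
    run service mgmt-vmware restart || return 1
    run sleep 30                      # arbitrary pause while the agent comes back
    run vmware-cmd "$vmx" getstate    # confirm the host can see the VM again
    run vmware-cmd "$vmx" start       # retry the power-on
}
```

Example: `retry_power_on /vmfs/volumes/storage1/myvm/myvm.vmx` prints the four commands; set `DRY_RUN=0` to execute them. Restarting the agent should not touch running VMs, but it will briefly disconnect the host from VC.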

I don't think that having a physical server will necessarily improve this; after all, you may not get the same problem again, or only rarely. Of course, if you consistently get the problem, then there is something up :)

As for starting up, without HA I'm not sure what your options will be (I *think* all the power-on policy is tied to HA). You could have a physical server configured to power on when the AC returns, but whether this is worth it (another physical box to host/power/cool/manage, etc.) depends on how often you are likely to get these kinds of outages.
