A week ago I was unable to start my VMs.
The problem was that my two ESX servers couldn't reach the license server.
It took me half an hour to find and fix the problem. In the meantime nobody could reach the VM servers. Is there an option to get an alert message (email) when the ESX servers can't reach the license server?
I didn't see any built-in way to do this,
but all the license log entries are written to /var/log/vmware/hostd.log.
You can have a script grep that file for the pattern that shows up when the license isn't found, and send an email when it matches.
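Something along these lines could do the grep part. This is only a rough sketch: the thread doesn't say what the failure lines in hostd.log actually look like, so the regular expression below is a guess and should be adjusted to what your log really contains. It also assumes Python 3 is available wherever the log can be read.

```python
import os
import re

LOG_PATH = "/var/log/vmware/hostd.log"

# Assumed pattern: the real wording of license failures in hostd.log is
# not given in this thread, so change this to match your own log.
LICENSE_ERROR = re.compile(
    r"licen[sc]e.*(fail|error|not found|unavailable)", re.IGNORECASE
)


def find_license_errors(lines):
    """Return every line that looks like a license failure."""
    return [line.rstrip("\n") for line in lines if LICENSE_ERROR.search(line)]


if __name__ == "__main__" and os.path.isfile(LOG_PATH):
    # errors="replace" keeps odd bytes in the log from stopping the scan.
    with open(LOG_PATH, errors="replace") as log:
        for hit in find_license_errors(log):
            print(hit)
```

Anything it prints is a candidate for the alert mail.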
Nautilus
The system is supposed to cache licensing for 2 weeks, I believe. That way, if the connection to the license server is lost, you can still manage the servers and the VMs running on them.
What exactly was the issue? Was the licensing service stopped or was the server offline or firewalled accidentally? We use Solarwinds Orion monitoring and there is an add-on for it that can monitor individual services and applications to verify they are up and running. I imagine other monitoring solutions have similar options.
The license server was working, but we use a non-standard port number for the license check. That port was closed by VMware with one of the updates. So the ESX servers ran fine for 2 weeks. After that period the problems started and we found out about the closed port. To make sure nothing similar happens in the future, we need to monitor this.
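Since the root cause was a closed port, one way to catch it well before the 2 weeks run out would be to test the TCP connection to the license server directly. A minimal sketch, assuming Python 3 is available on the machine doing the check; the host name and port in the usage comment are placeholders, not our real values.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


# Hypothetical usage; replace host and port with your own:
# if not port_is_open("license.example.com", 12345):
#     print("license port unreachable")
```

Note that this only shows the port is reachable from wherever the script runs, so it would have to run on (or behind the same firewall rules as) the ESX hosts to be meaningful.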
Ah and ouch ...
The only real suggestion I would have at this point would be to check each host with the VC Client, under Configuration -> Security Profile (probably in conjunction with some editing on the ESX server to show the appropriate port there), after upgrades to verify that the port is still open. Though I would think that if you have it explicitly opened there, an update wouldn't overwrite those changes.
The other option would be as mentioned above. Get a cron job set up to grep the file for problems and fire off an e-mail if it happens.
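For the e-mail half of that, a rough sketch using only the Python 3 standard library might look like the following. The addresses, the host name, and the SMTP relay on localhost are all assumptions for illustration, and how you collect the problem lines is up to you.

```python
import smtplib
from email.message import EmailMessage


def build_alert(host, problems, sender, recipient):
    """Build a plain-text alert mail listing the problem lines."""
    msg = EmailMessage()
    msg["Subject"] = f"License problem on {host}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(problems))
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Hand the message to an SMTP relay (assumed to accept local mail)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


# Hypothetical usage from a script that cron runs:
# msg = build_alert("esx01", ["license server unreachable"],
#                   "esx-alerts@example.com", "admin@example.com")
# send_alert(msg)
```

A crontab entry such as `*/15 * * * * /usr/bin/python3 /usr/local/bin/check_license.py` (the script path is made up) would run the check every 15 minutes.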
Look at the eG VM Monitor - http://www.eginnovations.com/web/virtualization-monitoring.htm