I have experienced a problem with the Virtual Center server. Sometimes all my VMs are shut down, and all that I can see is that the VC service has shut down. Today the VC server lost connectivity (link), and after the link came up again the VC service had shut down and so had all my VMs (on my 4 ESX servers). The VC service has no problem starting up again and then I can start all the VMs, but I can't really see why the VMs should shut down when the VC service shuts down. (I have the license server on the same machine.)
VC going down should not in any way cause your VMs to be shut down. What version of ESX are you running? Are you using auto startups/shutdowns? Can you ping your VMs when VC is down? You should be able to, since they are running on ESX and are not dependent on VC. Does anyone else have access to VC/ESX who may be shutting them down? I'd start with the log files and see what they say.
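A quick way to run the ping check suggested above is a small script kept on a workstation. This is only a sketch: the host names in the usage note are made up, and it assumes Python 3 and the standard `ping` command (`-c` on Linux, `-n` on Windows) are available on the machine you run it from.

```python
import os
import platform
import subprocess

def ping_command(host, count=1):
    """Build a ping command line for the local operating system."""
    flag = "-n" if platform.system().lower() == "windows" else "-c"
    return ["ping", flag, str(count), host]

def is_reachable(host, run=subprocess.call):
    """Return True if ping exits with status 0 for this host."""
    with open(os.devnull, "w") as devnull:
        return run(ping_command(host), stdout=devnull, stderr=devnull) == 0

def check_hosts(hosts, run=subprocess.call):
    """Map each host name to True (answered) or False (no answer)."""
    return {host: is_reachable(host, run) for host in hosts}
```

For example, `check_hosts(["vm01.example.local", "vm02.example.local"])` (hypothetical names) returns a dictionary you can print while the VC service is stopped. If the VMs still answer, they are running independently of VC.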
How do I troubleshoot ESX server issues?
You can check several log files on the ESX server, depending on the problem you are experiencing. These include:
o Vmkernel - /var/log/vmkernel records activities related to the virtual machines and ESX server
o Vmkernel Warnings - /var/log/vmkwarning records activities with the virtual machines
o Vmkernel Summary - /var/log/vmksummary - Used to determine uptime and availability statistics for ESX Server; human-readable summary found in /var/log/vmksummary.txt
o ESX Server host agent log - /var/log/vmware/hostd.log - Contains information on the agent that manages and configures the ESX Server host and its virtual machines (Search the file date/time stamps to find the log file it is currently outputting to.)
o Service Console - /var/log/messages - Contains all general log messages used to troubleshoot virtual machines or ESX Server
o Web Access - /var/log/vmware/webAccess - Records information on Web-based access to ESX Server
o Authentication log - /var/log/secure - Contains records of connections that require authentication, such as VMware daemons and actions initiated by the xinetd daemon.
o VirtualCenter agent - /var/log/vmware/vpx - Contains information on the agent that communicates with VirtualCenter
o Virtual Machines - The same directory as the affected virtual machine's configuration files; named vmware.log - Contains information recorded when a virtual machine crashes or ends abnormally
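To avoid opening each of these files by hand, something like the following can search them for a keyword. It is a rough sketch, assuming Python 3 is available (for example on a workstation holding copies of the logs); the paths come from the list above, and the search term is whatever you choose, not a known log message.

```python
import os
from collections import deque

# Log locations taken from the list above.
ESX_LOGS = [
    "/var/log/vmkernel",
    "/var/log/vmkwarning",
    "/var/log/vmksummary",
    "/var/log/vmware/hostd.log",
    "/var/log/messages",
    "/var/log/secure",
]

def find_matches(path, keyword, max_hits=20):
    """Return the last max_hits (line number, text) pairs containing keyword."""
    hits = deque(maxlen=max_hits)
    needle = keyword.lower()
    with open(path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line.lower():
                hits.append((number, line.rstrip("\n")))
    return list(hits)

def scan_logs(paths, keyword, max_hits=20):
    """Search every readable file in paths; skip missing or unreadable ones."""
    results = {}
    for path in paths:
        if not os.path.isfile(path):
            continue
        try:
            results[path] = find_matches(path, keyword, max_hits)
        except OSError:
            continue
    return results
```

Calling `scan_logs(ESX_LOGS, "power")` returns, per file, the most recent lines mentioning the term, which you can then compare against the time the VMs went down.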
How do I troubleshoot VirtualCenter server issues?
o To troubleshoot Virtual Center problems click on the Admin button in Virtual Center and then click the System Logs tab
o To view the log files directly open Windows Explorer and go to the C:\Windows\Temp\VPX directory on the Virtual Center server
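Both the hostd note above (check the date/time stamps to find the file currently being written) and the VPX directory come down to finding the most recently modified files in a folder. A minimal sketch, assuming Python 3 is installed on the machine that holds the logs:

```python
import os

def newest_files(directory, limit=5):
    """Return up to limit (path, mtime) pairs for regular files, newest first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((path, os.path.getmtime(path)))
    entries.sort(key=lambda item: item[1], reverse=True)
    return entries[:limit]
```

For example, `newest_files(r"C:\Windows\Temp\VPX")` on the Virtual Center server, or `newest_files("/var/log/vmware")` on an ESX host, lists the files that were written to last.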
I have had these problems for a couple of versions. Currently I'm running ESX 3.0.1 32039 and VC 2.0.1 33643. I tried to reach my VMs through SSH, ping, and the web, but they were dead. I also logged on to the ESX machines (using the web console) and all the VMs had been powered down. I'm not using auto shutdowns, and there are only two administrators of the systems (me and my partner), and neither of us has been powering VMs down.
I'm also running HA and DRS, so I thought that would help, but no.
The only thing I can see in the VC log is:
SSLVerifyCertAgainSystemStore: Subject mismatch: esx3.dnlab.se vs 126.96.36.199
SSLVerifyCertAgainSystemStore: The Remote Host certificate has these problems:
SSLVerifyIsEnabled: failed to read registry value. Assuming verification is disabled. LastError=0
SSLVerifyCertAgainSystemStore: Certificate verification is disabled so connection will proceed despite...
There is a 3.0.1 bug that will power off all your VMs if the mgmt-vmware service is restarted, but this only happens if you have auto startups enabled. Those errors in the VC log will not cause that behavior. I'd start by patching VC to 2.0.2 and ESX to 3.0.2 and see if the problem still happens.