VMware Cloud Community
athemiya1
Contributor

ESXi host reboot without permission

Ladies and Gents,

I came into the office one morning a few days ago to find that vCenter was not responding and a member of staff was complaining that they couldn't sign on to their Horizon View desktop. The Horizon View admin console reported that vCenter was down, and I confirmed this: logging on to vCenter with the vSphere Client returned errors saying it could not be found.

Some moments later I eventually got into vCenter and noticed that one of our three hosts appeared to have vMotioned all of its VMs (thank goodness!) to another host that was fine and operational. From inspecting the logs, it turns out that this host had actually rebooted without any authorisation or command! Take a look at this:

2013-08-16T06:00:02Z heartbeat: up 120d15h17m50s, 8 VMs; [[5877137 vmx 3145728kB] [19382 vmx 9356556kB] [20834 vmx 10485312kB]] [[17111 lsassd 10%max] [17190 netlogond 10%max] [18365 sfcb-pycim 20%max]]

2013-08-16T07:00:01Z heartbeat: up 120d16h17m49s, 8 VMs; [[5877137 vmx 3145728kB] [19382 vmx 9356400kB] [20834 vmx 10485232kB]] [[17111 lsassd 10%max] [17190 netlogond 10%max] [18365 sfcb-pycim 20%max]]

2013-08-16T08:00:02Z heartbeat: up 120d17h17m49s, 8 VMs; [[5877137 vmx 3145728kB] [19382 vmx 9356576kB] [20834 vmx 10485240kB]] [[17111 lsassd 10%max] [17190 netlogond 10%max] [18365 sfcb-pycim 20%max]]

2013-08-16T08:48:04Z bootstop: Host has booted

2013-08-16T09:00:02Z heartbeat: up 0d0h14m24s, 0 VMs; [[18433 sfcb-pycim 15108kB] [17808 vpxa-worker 18648kB] [17504 hostd-worker 56848kB]] [[17185 netlogond 8%max] [18439 sfcb-vmware_bas 9%max] [18433 sfcb-pycim 19%max]]

2013-08-16T10:00:02Z heartbeat: up 0d1h14m24s, 15 VMs; [[21757 vmx 4170248kB] [21741 vmx 7256728kB] [21738 vmx 10485032kB]] [[17185 netlogond 8%max] [18439 sfcb-vmware_bas 9%max] [18433 sfcb-pycim 19%max]]

2013-08-16T11:00:01Z heartbeat: up 0d2h14m23s, 15 VMs; [[21757 vmx 4170832kB] [21741 vmx 7257972kB] [21738 vmx 10484944kB]] [[17185 netlogond 8%max] [18439 sfcb-vmware_bas 9%max] [18433 sfcb-pycim 19%max]]

2013-08-16T12:00:02Z heartbeat: up 0d3h14m23s, 15 VMs; [[21757 vmx 4173592kB] [21741 vmx 7259312kB] [21738 vmx 10484296kB]] [[17185 netlogond 8%max] [18439 sfcb-vmware_bas 9%max] [18433 sfcb-pycim 19%max]]

2013-08-16T13:00:02Z heartbeat: up 0d4h14m23s, 15 VMs; [[21757 vmx 4173688kB] [21741 vmx 7259232kB] [21738 vmx 10484300kB]] [[17185 netlogond 8%max] [18439 sfcb-vmware_bas 9%max] [18433 sfcb-pycim 19%max]]

2013-08-16T14:00:01Z heartbeat: up 0d5h14m22s, 15 VMs; [[21757 vmx 4173388kB] [21741 vmx 7259212kB] [21738 vmx 10484248kB]] [[17185 netlogond 8%max] [18439 sfcb-vmware_bas 8%max] [18433 sfcb-pycim 19%max]]

I can confirm that the other relevant logs also show that the host rebooted.

My question to my esteemed colleagues here: what specifically would you advise I look at next to find the root cause of this?

Thanks in advance!

Hesan

1 Reply
weinstein5
Immortal

When a host reboots unexpectedly there are two likely causes: human error or equipment failure. I would investigate the logs, including those in vCenter, to see if there are any indications of hardware failure.

When the host crashed, VMware HA engaged and restarted the VMs on the remaining hosts in the cluster.
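
To narrow down which part of the other logs to read, it may help to work out the reboot window from the heartbeat lines in the original post. The following is only a rough Python sketch, not a VMware tool: the function names are made up for illustration, and it assumes the hourly heartbeat lines keep the exact format shown above (on ESXi 5.x they normally come from the host's vmksummary.log, though the post does not name the file). It flags any point where the reported uptime drops and estimates the boot time as heartbeat timestamp minus uptime.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a heartbeat line as quoted in the post, e.g.
# "2013-08-16T09:00:02Z heartbeat: up 0d0h14m24s, 0 VMs; ..."
HEARTBEAT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z heartbeat: "
    r"up (?P<d>\d+)d(?P<h>\d+)h(?P<m>\d+)m(?P<s>\d+)s, (?P<vms>\d+) VMs"
)


def parse_heartbeat(line):
    """Return (timestamp, uptime, vm_count), or None for non-heartbeat lines."""
    match = HEARTBEAT.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
    uptime = timedelta(
        days=int(match.group("d")),
        hours=int(match.group("h")),
        minutes=int(match.group("m")),
        seconds=int(match.group("s")),
    )
    return stamp, uptime, int(match.group("vms"))


def find_reboots(lines):
    """Return (last_heartbeat_before, estimated_boot_time) for each uptime reset."""
    reboots = []
    previous = None
    for line in lines:
        parsed = parse_heartbeat(line)
        if parsed is None:
            continue  # e.g. the "bootstop: Host has booted" line
        stamp, uptime, _vms = parsed
        # Uptime going backwards means the host restarted in between.
        if previous is not None and uptime < previous[1]:
            reboots.append((previous[0], stamp - uptime))
        previous = parsed
    return reboots


# Three lines from the post, with the per-process detail trimmed off.
SAMPLE = [
    "2013-08-16T08:00:02Z heartbeat: up 120d17h17m49s, 8 VMs;",
    "2013-08-16T08:48:04Z bootstop: Host has booted",
    "2013-08-16T09:00:02Z heartbeat: up 0d0h14m24s, 0 VMs;",
]

for last_seen, booted in find_reboots(SAMPLE):
    print("last heartbeat before reboot:", last_seen.isoformat())
    print("estimated boot time:", booted.isoformat())
```

For the lines quoted in the post this puts the failure somewhere between the 08:00:02Z heartbeat and an estimated boot at about 08:45:38Z, which fits the bootstop entry at 08:48:04Z. That roughly 45 minute window is the stretch to look at in the vCenter events and any hardware event logs.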

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful