rnwaterhouse
Contributor

Two ESXi Servers Spontaneously Reboot Within Seconds of Each Other

Hi everyone,

I have two ESXi hosts. One is an older Lenovo ThinkServer running the free version of ESXi 5.1 (yes, I am aware how old that is). The other is a brand-new Dell R640 running the current version of Essentials. Both are connected to an Eaton inline UPS. I am in the process of moving the guests from the old server to the new one. Last Thursday, in the early hours of the morning, both hosts rebooted within seconds of each other. The UPS logged no power events, and other equipment (Cisco switches, WatchGuard firewall) did not reboot. The Dell has redundant power supplies, one of which failed after the reboot. Dell support was contacted and the power supply is being replaced. The only indication in the Dell hardware logs is the message "System CPU Resetting", with the description "System is performing a CPU reset because of system power off, power on or a warm reset like Ctrl-Alt-Del". As I stated, there was no power event, and the unit is in a locked server room. Dell support also pointed out that a critical BIOS update was outstanding; it has since been applied.

I can just barely accept that some kind of BIOS bug or a fault with the power supply caused the Dell to reboot. What I cannot fathom is why the Lenovo host rebooted at almost exactly the same time. Any suggestions would be welcome, as I am losing sleep over this.

 

4 Replies
mbufkin
Enthusiast

Are both servers on the same UPS, and are both of the Dell's power supplies on that UPS as well? It honestly sounds like the issue is the UPS. Could it be overloaded? I'm just trying to gather more information. Two servers rebooting at the same time sounds strange unless the UPS is the cause.

e_espinel
Commander

Hello.
It is extremely strange for two servers of different brands to reboot at almost the same time.

The first thing to do would be to perform a complete check of the UPS and the entire electrical circuit.
Certain voltage variations that the UPS does not detect can cause failures or shutdowns in the equipment's power supplies.

If no problem is found in the UPS, one option is to install a power monitoring device on the circuit to rule out electrical problems.
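
If the monitoring device can export its readings, a short script could flag voltage excursions for review. The following is only a rough Python sketch: the 120 V nominal value, the 10% tolerance, and the `find_excursions` helper are assumptions made for illustration, not anything provided by Eaton or VMware. Keep in mind that a device sampling once per second can still miss sub-second sags.

```python
# Hypothetical helper: flag voltage readings outside a tolerance band.
# NOMINAL_VOLTS and TOLERANCE are assumptions; set them for your own circuit.
NOMINAL_VOLTS = 120.0
TOLERANCE = 0.10  # fraction of nominal, i.e. +/-10%


def find_excursions(samples, nominal=NOMINAL_VOLTS, tolerance=TOLERANCE):
    """Return the (timestamp, volts) pairs that fall outside the band.

    samples: iterable of (timestamp, volts) pairs, in any order.
    """
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return [(ts, volts) for ts, volts in samples if volts < low or volts > high]


if __name__ == "__main__":
    # Made-up readings purely to show the call; not real measurements.
    readings = [
        ("02:14:07", 119.6),
        ("02:14:08", 94.2),
        ("02:14:09", 120.1),
    ]
    for ts, volts in find_excursions(readings):
        print(f"{ts}: {volts} V is outside the expected range")
```

Comparing the flagged timestamps against the reboot times of both hosts would show whether a disturbance on the UPS output lines up with the event.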


Enrique Espinel
Senior Technical Support IBM, Lenovo, VMware vSphere and Veeam Backup.
VMware VSP-SV, VTSP-SV, VTSP-HCI, VTSP 5
Please mark my comment as Correct Answer or assign Kudos if my answer was helpful to you, Thank you.
rnwaterhouse
Contributor

Yes, they are, and I don't disagree with your logic. This was also my first thought. However, this is a fairly high-end, new Eaton inline UPS with well over 30 minutes of run time at this load, and it saw nothing. Additionally, none of the other equipment connected to this UPS power cycled.

rnwaterhouse
Contributor

Thanks for the reply. I will contact Eaton tech support and see what they have to say. I also like the idea of monitoring the UPS output and will make that happen.
