VMware Cloud Community
bellocarico
Enthusiast

ESXi 6.5 standalone rebooting automatically

The VMs running on my single standalone ESXi 6.5 host all show uptimes of less than 24 hours.

Essentially the host "boots" daily.

This started all of a sudden. Is there a suggested approach for troubleshooting this?

Can I find out whether the host was "soft" rebooted or whether it crashed?

Anything else to keep in mind?

Thanks!

4 Replies
scott28tt
VMware Employee

Hardware diagnostic checks?


-------------------------------------------------------------------------------------------------------------------------------------------------------------

Although I am a VMware employee I contribute to VMware Communities voluntarily (ie. not in any official capacity)
VMware Training & Certification blog
NathanosBlightc
Commander

Check the vmkernel.log and vmksummary.log files in the /var/log directory (via the ESXi Shell or SSH). If they contain events for a clean shutdown or reboot of the host, the restart happened normally.
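A minimal sketch of that log check, assuming the boot and uptime entries in vmksummary.log carry the tags `bootstop` and `heartbeat` (that is what I would expect on ESXi, but verify against your own log). The function name `boot_events` is made up for this example.

```shell
# Print the most recent boot-related entries from a vmksummary-style log.
# Usage: boot_events [logfile]   (defaults to /var/log/vmksummary.log)
boot_events() {
    log="${1:-/var/log/vmksummary.log}"
    # Assumption: 'bootstop' lines record boots and shutdowns,
    # 'heartbeat' lines record the uptime at regular intervals.
    grep -E 'bootstop|heartbeat' "$log" | tail -n 20
}
```

Run `boot_events` on the host and look at what precedes each boot entry. A boot entry that follows a heartbeat line directly, with no shutdown or reboot entry in between, would suggest a crash or power loss rather than a clean restart.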

But if the startup events appear abruptly in the middle of normal operations, the host most likely crashed. You can also configure a core dump location to collect dump information about PSOD events on your host. The following links explain how to configure it:

Undercity of Virtualization: What is the VMKernel Core Dump - Part I

Undercity of Virtualization: VMKernel Core Dump - Part II: How to add & remove coredump file
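As a rough sketch of what checking the coredump setup looks like from the ESXi Shell (not taken from the linked articles; the commented-out commands change host configuration, so treat them as an illustration only):

```shell
# Show which coredump partition, if any, is configured and active.
esxcli system coredump partition get

# List the partitions that could be used as a coredump target.
esxcli system coredump partition list

# Show any file-based coredump targets on a datastore.
esxcli system coredump file list

# If nothing is configured, a dump file can be created and activated.
# Uncomment only if you want to change the host configuration:
# esxcli system coredump file add
# esxcli system coredump file set --smart --enable true
```

If a PSOD is behind the reboots, a configured dump target should hold a dump after the next crash. If the host simply resets without writing one, that points more toward hardware or power.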

Please mark my comment as the Correct Answer if this solution resolved your problem
RickVerstegen
Expert

Check hostd.log and vmkernel.log.

Also execute this command to check if ESXi is configured to automatically reboot after a Purple Screen of Death (PSOD):

esxcfg-advcfg -g /Misc/BlueScreenTimeout

If the value listed is anything other than 0, then ESXi automatically reboots after the PSOD.
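While troubleshooting, it may help to turn the automatic reboot off so that a PSOD stays on the console until someone has seen it. A small sketch, assuming you have console or remote-management access to read the screen afterwards:

```shell
# Read the current timeout; a non-zero value means reboot automatically after a PSOD.
esxcfg-advcfg -g /Misc/BlueScreenTimeout

# Set it to 0 so the host halts on the purple screen instead of rebooting.
esxcfg-advcfg -s 0 /Misc/BlueScreenTimeout
```

If the host still restarts by itself with the timeout at 0, the restarts are probably not PSOD-driven.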

Was I helpful? Give a kudo for appreciation!
Blog: https://rickverstegen84.wordpress.com/
Twitter: https://twitter.com/verstegenrick
bellocarico
Enthusiast

I have checked the logs but don't see anything relevant.

So I booted from memtest86+ ... and the server reset during the scan.

I did a bit of reading, and a few people suggested that the RAM is not necessarily the cause and that I should try replacing the PSU first.

I haven't done this yet, but since I have 40GB of RAM installed (5x8GB), I have reduced the total RAM allocated to the VMs to 12GB, and things seem OK for now (no reboot in the last 24 hours at least).

Assuming the RAM really is the problem, can memtest86+ test one bank of memory at a time, so that I can spot which module is faulty?
