I have a host that has been running well for about a year. The other day it started to reboot randomly, sometimes every ten minutes, sometimes after 24 hours. I believe it is hardware related. Which logs should I be looking into?
Hello makman26,
Hosts don't just reboot themselves - either it is crashing or something/someone is restarting it (e.g. HPE AMS or a similar utility from another vendor).
First port of call would be to check whether there is a core dump, which should be present in /var/core/ (provided you have core dumping configured, there is sufficient space for it, and the host had enough time to write the dump before restarting).
If there is a dump in /var/core/ then please open a Support Request with my colleagues in GSS.
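As a rough illustration of the check above, here is a minimal Python sketch that lists whatever files sit in the dump directory, newest first, so the timestamps can be compared against the reboot times. The function name list_core_dumps is hypothetical, the default path is simply the /var/core/ location mentioned above, and it assumes a Python interpreter is available where you run it. It makes no assumption about how dump files are named.

```python
import time
from pathlib import Path


def list_core_dumps(core_dir="/var/core"):
    """Return (name, size_bytes, mtime) for each regular file in core_dir, newest first.

    Hypothetical helper: core_dir defaults to the path mentioned in the thread.
    Returns an empty list if the directory does not exist.
    """
    path = Path(core_dir)
    if not path.is_dir():
        return []
    entries = []
    for item in path.iterdir():
        if item.is_file():
            st = item.stat()
            entries.append((item.name, st.st_size, st.st_mtime))
    # Sort by modification time, most recent first.
    entries.sort(key=lambda e: e[2], reverse=True)
    return entries


if __name__ == "__main__":
    dumps = list_core_dumps()
    if not dumps:
        print("No files found in /var/core")
    for name, size, mtime in dumps:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(f"{stamp}  {size:>12}  {name}")
```

An empty result is consistent with a hard power loss or reset, where the host never had the chance to write a dump, but it can also mean core dumping is not configured.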
Bob
Welcome to the Community,
in case of hard reboots - i.e. no PSOD - you should check the server/hardware logs, e.g. iLO, iDRAC, or other server management logs.
André
I found the problem over the weekend. It was the UPS switching on and off, but I did not notice until the UPS started sounding an alarm on Sunday afternoon.
TheBobkin, so you were correct: it was something else.
Thank you for the responses.
Thanks for the reminder that electricity is the one thing servers need more than anything else. It's been so long since I have seen this on a case (other than obvious outages).
It's hard to PSOD with no power.
Bob