russjar
Enthusiast

Unexpected reboot

Hi all,

One of our ESX 3.5 hosts rebooted itself unexpectedly, and I'm looking for a little help on what to check to find the reason why. Being a Windows person, I'm not exactly familiar with Linux logs and which ones to check...

Thanks in advance

VCP,MCSE NT4/W2k/W2k3, MCSA W2k3

12 Replies
sbeaver
Leadership

A very high-level place to start is the /var/log folder on that ESX host, where you can find all the logs. In my experience, when a machine reboots out of the blue, it is usually one of two things:

1. Someone rebooted the server by accident.

2. You have had some kind of hardware error, and I would start with memory first.

If you have any hardware agents like IBM Director or HP SIM, etc., take a look and see if there are any alerts or problems, and check the physical server for any yellow lights that point to a problem.

Steve Beaver

VMware Communities User Moderator

====

Co-Author of "VMware ESX Essentials in the Virtual Data Center"

(ISBN:1420070274) from Auerbach

*Virtualization is a journey, not a project.*

russjar
Enthusiast

Thanks, sbeaver, for taking the time to respond, but it still doesn't help me. I know the log files are in /var/log, but which ones do I need to look at? There is a plethora of logs. Also, there are no agents installed, so I can discount that right off the bat...

Thanks again

mcowger
Immortal

Have to agree with Steve here.

--Matt VCDX #52 blog.cowger.us
russjar
Enthusiast

Hi, I found these entries in /var/log/messages, if someone with a little more experience in interpreting these log files wishes to have a gander... see attached file.

thanks in advance

sbeaver
Leadership

Also take a look at the vmkernel logs.

russjar
Enthusiast

Yeah, I had a look at the vmkernel logs and, conspicuously, there are entries missing around the time of the event.
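
One way to pin down exactly where the entries stop and resume is to scan for large time gaps between consecutive lines. Here is a minimal Python sketch, assuming the log uses the usual syslog "Mon DD HH:MM:SS" prefix seen in /var/log/messages. The function names, the 10-minute threshold, the year (syslog stamps do not carry one), and the sample lines are all my own assumptions for illustration, not anything taken from the host.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' syslog stamp, or return None."""
    try:
        # Syslog stamps carry no year, so the caller has to supply one.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, year, threshold=timedelta(minutes=10)):
    """Return (previous, current, gap) for consecutive stamps further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        current = parse_stamp(line, year)
        if current is None:
            continue  # skip lines without a recognisable stamp
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


if __name__ == "__main__":
    # Made-up sample lines; on the host you would read the real file instead,
    # for example: lines = open("/var/log/vmkernel").
    sample = [
        "Dec 12 09:20:00 eduvm108 vmkernel: placeholder entry",
        "Dec 12 09:21:00 eduvm108 vmkernel: placeholder entry",
        "Dec 12 09:40:00 eduvm108 vmkernel: placeholder entry",
    ]
    for before, after, gap in find_gaps(sample, year=2008):
        print(f"No entries between {before} and {after} ({gap})")
```

A gap on its own does not say whether the cause was hardware or an operator; it only narrows the time window to compare against the other logs.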

sbeaver
Leadership

That makes me think it's a hardware issue.

russjar
Enthusiast

OK then, thanks for your help. I appreciate it.

mcowger
Immortal

Don't agree here.

This entry:

Dec 12 09:26:28 eduvm108 shutdown: shutting down for system halt

Dec 12 09:26:28 eduvm108 init: Switching to runlevel: 0

This is a controlled reboot request. Someone in your organization requested a controlled reboot of this host.
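
For anyone wanting to pull these markers out of a long /var/log/messages, here is a small Python sketch. The two patterns are based on the entries quoted above. Matching runlevel 6 as well as 0 is my assumption, following the usual SysV convention (0 = halt, 6 = reboot), and the looser "system <word>" match is likewise a guess at how other shutdown modes might be worded. The function name is illustrative only.

```python
import re

# Patterns derived from the two quoted entries; other hosts may log
# different wording, so treat this list as a starting point only.
SHUTDOWN_PATTERNS = [
    re.compile(r"shutdown: shutting down for system \w+"),
    re.compile(r"init: Switching to runlevel: [06]\b"),
]


def find_controlled_shutdowns(lines):
    """Return the log lines that look like a requested halt or reboot."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in SHUTDOWN_PATTERNS)
    ]


if __name__ == "__main__":
    # The two entries quoted in this thread; on the host you would pass
    # open("/var/log/messages") instead.
    sample = [
        "Dec 12 09:26:28 eduvm108 shutdown: shutting down for system halt",
        "Dec 12 09:26:28 eduvm108 init: Switching to runlevel: 0",
    ]
    for hit in find_controlled_shutdowns(sample):
        print(hit)
```

If lines like these show up just before the outage, the shutdown was requested through the OS. A power loss or hardware fault would normally leave no such entries.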

russjar
Enthusiast

Well, this is a bit frightening indeed, because the root password is tightly controlled and only a few people should know it.

That would explain the "...root...127.0.0.1", which would mean someone had done it from iLO/RSA/console..?

We may have to look at changing the root password.

mcowger
Immortal

Dec 12 09:26:50 eduvm108 vmware-hostd[2088]: Accepted password for user root from 127.0.0.1

^^^ that bit is normal on an ESX host when hostd is doing stuff.

It could have been initiated from the console or via the VIC.
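
To separate hostd's own loopback logins from ones arriving from another address, something like the Python sketch below could help. It only matches the "Accepted password for user ... from ..." wording of the hostd line quoted above, so other services that phrase their login messages differently would be missed. The function name and the 192.0.2.10 sample address (a documentation-only address) are made up for illustration.

```python
import ipaddress
import re

# Wording copied from the hostd entry quoted above.
ACCEPTED = re.compile(r"Accepted password for user (\S+) from (\S+)")


def non_loopback_logins(lines):
    """Return (user, source) pairs for accepted logins that did not come from loopback."""
    results = []
    for line in lines:
        match = ACCEPTED.search(line)
        if not match:
            continue
        user, source = match.groups()
        try:
            if ipaddress.ip_address(source).is_loopback:
                continue  # hostd talking to itself, which is normal
        except ValueError:
            pass  # not a literal IP (for example a hostname); keep it for review
        results.append((user, source))
    return results


if __name__ == "__main__":
    sample = [
        # Entry quoted in this thread.
        "Dec 12 09:26:50 eduvm108 vmware-hostd[2088]: Accepted password for user root from 127.0.0.1",
        # Hypothetical remote login, invented for the example.
        "Dec 12 09:25:10 eduvm108 vmware-hostd[2088]: Accepted password for user root from 192.0.2.10",
    ]
    for user, source in non_loopback_logins(sample):
        print(f"{user} logged in from {source}")
```

An empty result does not rule out a console or VIC-initiated action; it only shows that no matching non-loopback login was logged in this format.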

russjar
Enthusiast

Thanks to both of you for your help. I think I will be recommending a password change for our hosts.
