VMware Cloud Community
Sware07
Contributor

VMware ESXi server reboot

I have a server (HP ProLiant DL380 Gen9, Intel Xeon E5-2609 v3 CPU, with 2.04 TB) running VMware ESXi 5.5.0 with 2 VMs. It reboots suddenly every once in a while. I've gone through the VMware Knowledge Base article "Determining why an ESXi/ESX host was powered off or restarted" (1019238), but nothing comes up.

I guess my question is: can the vmksummary or vmkernel log files give me some hint about the cause? I've gone through those files, but nothing out of the ordinary shows up before the reboot happens.

Where in the log files should I look?
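One way to narrow the search is to pull only the vmkernel log lines from the few minutes before a known boot time. The following is a minimal sketch, not a VMware tool: it assumes each relevant line starts with an ISO 8601 UTC timestamp (as the vmksummary entries in this thread do), and the function names `parse_ts` and `lines_before`, the sample messages, and the 10-minute window are all hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta


def parse_ts(line):
    """Return the leading UTC timestamp of a log line, or None if absent."""
    token = line.split(" ", 1)[0]
    # Accept timestamps with or without fractional seconds.
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(token, fmt)
        except ValueError:
            pass
    return None


def lines_before(lines, boot_time, minutes=10):
    """Keep lines stamped within `minutes` before boot_time (exclusive)."""
    start = boot_time - timedelta(minutes=minutes)
    selected = []
    for line in lines:
        ts = parse_ts(line)
        # Lines without a parseable timestamp are skipped.
        if ts is not None and start <= ts < boot_time:
            selected.append(line)
    return selected


# Hypothetical sample lines; the message text is made up for illustration.
SAMPLE_LOG = [
    "2018-01-29T16:20:00.000Z cpu0:32789)hypothetical message A",
    "2018-01-29T16:30:12.500Z cpu1:32790)hypothetical message B",
    "2018-01-29T16:34:37.000Z cpu0:32768)hypothetical boot message",
    "line without a timestamp",
]

if __name__ == "__main__":
    boot = datetime(2018, 1, 29, 16, 34, 37)
    for entry in lines_before(SAMPLE_LOG, boot):
        print(entry)
```

In practice the lines would come from a copy of the log file read on a workstation, and the boot time from the "Host has booted" entry in vmksummary. If the window is empty or contains nothing unusual, that by itself suggests the host stopped without the kernel getting a chance to log anything.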

18 Replies
daphnissov
Immortal

What build number of 5.5 are you running?

Sware07
Contributor

Build No. 2403361

daphnissov
Immortal

You are 3 years behind on patches with that build. Update to the latest available build first before continuing troubleshooting efforts.

msripada
Virtuoso

Can you attach the hostd and vmksummary logs?

Thanks,

MS

Sware07
Contributor

I do need to update to the latest build. Meanwhile, I'm just poking around to see if I can find something.

I've attached hostd.log and vmksummary.log.

Thanks

a_p_
Leadership

Log in to the host's iLO interface and check the logs there to find out whether they contain entries related to the reboots.

André

msripada
Virtuoso

2018-01-29T16:00:01Z heartbeat: up 0d5h22m12s, 2 VMs; [[33912 hostd-worker 33948kB] [85466 vmx 1841152kB] [85453 vmx 4079044kB]] [[35267 sfcb-smx 6%max] [35325 sfcb-vmware_bas 13%max] [35319 sfcb-pycim 22%max]]

2018-01-29T16:34:37Z bootstop: Host has booted

According to the vmksummary log, the host booted and coredumped, with no shutdown entry between the last heartbeat and "Host has booted". This indicates a hardware-specific issue or an ASR (Automatic Server Recovery) reset.

If the reboot were user-initiated, the log would list "Host is rebooting" first and then "Host has booted".
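The check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `find_unexpected_boots` is hypothetical, the first two sample lines are taken from the log excerpt above (with the heartbeat details shortened), and the last two are a made-up graceful reboot. It also assumes a graceful shutdown is logged as a bootstop entry containing "rebooting" or "powering off".

```python
import re

# Matches vmksummary lines such as
# "2018-01-29T16:34:37Z bootstop: Host has booted"
LINE_RE = re.compile(r"^(\S+Z)\s+(heartbeat|bootstop):\s+(.*)$")


def find_unexpected_boots(lines):
    """Return (previous_entry_time, boot_time) for boots with no shutdown entry.

    A boot counts as expected only when the entry right before it is a
    bootstop message that mentions rebooting or powering off (assumption).
    """
    unexpected = []
    previous = None  # (timestamp, kind, message) of the last matched line
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        ts, kind, msg = match.groups()
        if kind == "bootstop" and msg.startswith("Host has booted"):
            graceful = (
                previous is not None
                and previous[1] == "bootstop"
                and ("rebooting" in previous[2] or "powering off" in previous[2])
            )
            # A boot at the very start of the log cannot be judged; skip it.
            if previous is not None and not graceful:
                unexpected.append((previous[0], ts))
        previous = (ts, kind, msg)
    return unexpected


SAMPLE = [
    "2018-01-29T16:00:01Z heartbeat: up 0d5h22m12s, 2 VMs; [[...]] [[...]]",
    "2018-01-29T16:34:37Z bootstop: Host has booted",
    # Hypothetical graceful reboot, added only for contrast.
    "2018-01-30T09:00:00Z bootstop: Host is rebooting",
    "2018-01-30T09:05:00Z bootstop: Host has booted",
]

if __name__ == "__main__":
    for last_seen, booted in find_unexpected_boots(SAMPLE):
        print(f"unexpected boot at {booted}, last entry before it at {last_seen}")
```

With the sample above, only the boot on 2018-01-29 is reported, because the hypothetical second boot is preceded by a "Host is rebooting" entry. The gap between the two reported timestamps gives a rough window in which the host went down.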

Thanks,

MS

Sware07
Contributor

Thanks a lot. And I guess to find the exact hardware issue, I should go through the IML logs?

msripada
Virtuoso

Run this command on the ESXi host and share the output: esxcli hardware ipmi sel list. If the output is blank or not valid, you will need to check the iLO logs.

Thanks,

MS

Sware07
Contributor

Yeah...there's one record from 2015. Nothing else.

msripada
Virtuoso

That record is not valid for this issue. You need to look at the SEL logs, or any hardware events, in the iLO.

Thanks,

MS

Sware07
Contributor

I only have access to the server through PuTTY. Can I get the SEL logs from the command line? Sorry, I'm quite new to all of this. Thanks a lot for the help.

msripada
Virtuoso

I forgot to ask earlier whether this is HP hardware. If it is, you can try the command below; if not, you will need to find the equivalent documentation from the respective hardware vendor.

SHOW LOG ILO

Thanks,

MS

Sware07
Contributor

It is HP hardware. When I try that command, it says the "show" command is not found.

msripada
Virtuoso

Where did you open the SSH session? That command will not work on the ESXi host. It is meant for the HP hardware.

Thanks,

MS

Sware07
Contributor

Can you please guide me through how to do it? Thanks.

sandu
Enthusiast

Hi,

Do you see the problem even when you have both PSUs connected?

I had a similar problem (faulty hardware), and the issue was resolved once both PSUs were connected.

Thanks

Sandeep

Sware07
Contributor

Hi Sandeep, yes, both PSUs are connected.
