VMware Cloud Community
peposimo
Contributor

vSphere 4.1 U3 reboots

Hello,

I have 2 Supermicro servers running vSphere 4.1 Update 3 (build 800380) with the driver vmware-esx-drivers-net-ixgbe_400.3.9.13-1vmw.2.17.249663.697530 injected. The ISO file was customized with ESXi-Customizer-v2.7.

The servers work very well except for one thing: they reboot by themselves at random, roughly once a week. I looked through the logs for any relevant messages, but I don't see anything useful.

The messages file shows:

Feb  1 05:47:41 sfcb-CIMXML-Processor[1183833]: -#- HTTP header command is not expected
Feb  1 05:56:16 vmklogger: Successfully daemonized.   (this is the time when the ESXi host starts booting)

Can anybody point me in the right direction? Where should I dig, and which logs should I check to find the reason for the crashes?

I have a bunch of logs, which are of course hard to read, but I would like to know which one shows the state of ESXi at that point in time. I do not mean the logs between vCenter and the ESXi host, but logs that would record a purple screen, a driver crash or a hardware failure.

Thank you.


Accepted Solutions
zXi_Gamer
Virtuoso

As mentioned earlier, ESX would PSOD in case of PFs (page faults) or CPU lockups, but it won't restart. Needless to say, none of the agents in ESXi would restart the server by itself, which shifts our attention to the hardware itself, leaving aside the possibility of a person hard-resetting the server.

Check the BIOS/management software for any alerts about temperature, failed memory modules, etc.

Also, in the vmkernel logs [messages in ESXi 4.1], check for "Booted successfully.", which marks the start time of the kernel.

Again, if the server is brought down by hard methods, or by hardware faults handled by the BIOS, it can't be logged in vmkernel.
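The log check described above can be sketched in a few lines of Python. This is only an illustration, not a VMware tool: it assumes the messages file has been copied off the host, that every line starts with a classic syslog stamp like the ones quoted in this thread, and that "vmklogger: Successfully daemonized" marks a boot (pass "Booted successfully." as the marker instead if that is what your build logs). The function names are made up for this example. For each boot marker it reports the last line written before it and the length of the silent gap, which is where a PSOD or driver message would be if the kernel had had a chance to log one.

```python
import re
from datetime import datetime

# Classic syslog prefix, e.g. "Feb  1 05:47:41 ..." (the stamp carries no year).
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s")


def parse_stamp(line, year=2000):
    """Return the timestamp of a syslog line, or None if it has no stamp.

    The year is an assumption because the log does not record it; 2000 is
    used as a default only because it is a leap year (Feb 29 parses).
    """
    match = STAMP.match(line)
    if match is None:
        return None
    month, day, clock = match.groups()
    text = f"{year} {month} {int(day):02d} {clock}"
    return datetime.strptime(text, "%Y %b %d %H:%M:%S")


def find_reboots(lines, marker="vmklogger: Successfully daemonized", year=2000):
    """Yield (last_line_before_boot, boot_line, silent_gap) per boot marker.

    A gap that spans New Year comes out negative, since the year is assumed.
    """
    previous = None
    for raw in lines:
        line = raw.rstrip("\n")
        if parse_stamp(line, year) is None:
            continue  # skip continuation lines without a timestamp
        if marker in line and previous is not None:
            gap = parse_stamp(line, year) - parse_stamp(previous, year)
            yield previous, line, gap
        previous = line
```

Typical use would be `for before, boot, gap in find_reboots(open("messages")): print(gap, before)`. If the last line before every boot is routine chatter, as in the two lines quoted in the question, that supports the point made above: a hard reset or a hardware fault leaves nothing in the vmkernel log.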

5 Replies
a_p_
Leadership

Do these hosts have a management interface to access the server (hardware) log files? ESXi usually stops with a PSOD if it detects an issue and does not reboot automatically.

André

LucD
Leadership

Thread moved to the ESXi 4 community.


peposimo
Contributor

Yes, they do. I looked into the messages, hostd and vpxd files located on the ESXi host around that point in time. After the connection loss shown in vSphere Client (Tasks & Events), I see services starting and other information being written to the log files, but I don't see any failure messages before the crash.

It looks as if somebody did a hard reset of the server.

peposimo
Contributor

So...

On vSphere 4.1 U3 I did a "Reset to Factory Defaults" in the DCUI, which removed the two 10 Gbit network cards from the system and restored it to a clean state. I then reconfigured the system and attached the vSSes to the 2 available 1 Gbit network cards. Then I restored to the inventory a VM located on the local datastore. I installed network stress test software in which 4 clients make 100 connections (random-size packets, from 1Kb to 10Mb, at 10 ms intervals) from one pNIC to a Windows Server 2008 machine connected to another pNIC. The host has now been running for more than a week without any restart.

So what does it mean? The server reboots itself when it has trouble with a physical network card.

Why this conclusion? Simple: I have an HP Microserver N36 at home with an Intel PRO/1000 network card installed (I have no idea whether this card is free of issues). The server works well, except when there is traffic on that network card. Every time there is traffic, the server reboots by itself.
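For anyone who wants to reproduce a similar load without a dedicated tool, here is a rough Python sketch of the traffic pattern described above: many TCP connections, each sending payloads of random size with a short pause in between. It is not the stress test software I used; all names are made up for the example, and I am assuming the sizes are in bytes (1 KB to 10 MB). The helper at the bottom runs sink and clients on 127.0.0.1 only to show that the pieces fit together. To load a real pNIC, the sink would have to run on the machine behind the other pNIC and the clients would have to connect to its address.

```python
import random
import socket
import threading
import time


def sink(server_sock, n_connections, counter):
    """Accept n_connections and count every byte received (stand-in target)."""
    def drain(conn):
        received = 0
        with conn:
            while chunk := conn.recv(65536):
                received += len(chunk)
        with counter["lock"]:
            counter["bytes"] += received

    workers = []
    for _ in range(n_connections):
        conn, _addr = server_sock.accept()
        worker = threading.Thread(target=drain, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()


def client(address, messages, min_size, max_size, interval, seed):
    """Send `messages` random-size payloads over one connection; return bytes sent."""
    rng = random.Random(seed)
    sent = 0
    with socket.create_connection(address) as sock:
        for _ in range(messages):
            payload = bytes(rng.randint(min_size, max_size))  # zero-filled
            sock.sendall(payload)
            sent += len(payload)
            time.sleep(interval)  # e.g. 0.010 for the 10 ms pause
    return sent


def run_loopback(n_connections, messages, min_size, max_size, interval=0.010):
    """Run sink and clients on 127.0.0.1; return (bytes_sent, bytes_received)."""
    counter = {"bytes": 0, "lock": threading.Lock()}
    sent = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(n_connections)
        address = server_sock.getsockname()
        sink_thread = threading.Thread(
            target=sink, args=(server_sock, n_connections, counter))
        sink_thread.start()

        def worker(seed):
            sent.append(
                client(address, messages, min_size, max_size, interval, seed))

        clients = [threading.Thread(target=worker, args=(i,))
                   for i in range(n_connections)]
        for thread in clients:
            thread.start()
        for thread in clients:
            thread.join()
        sink_thread.join()
    return sum(sent), counter["bytes"]
```

A small smoke run such as `run_loopback(4, 3, 1024, 4096)` finishes quickly. The sizes from my test would be `min_size=1024, max_size=10 * 1024 * 1024` with 100 connections, which moves far more data and takes correspondingly longer.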
