VMware Cloud Community
eitght
Contributor
Contributor

ESXi 5.5 unexpected reboots

Hello, today my team reported that one of our PE1950 servers with ESXi 5.5 started to reboot unexpectedly 4 times, and after the 5th boot, it stopped rebooting.

I got the following summary of logs from the vmware-support tar file , from today (Nov212016 UTC -5):

1) vmwarning:

2016-11-21T17:35:24.738Z cpu0:33294)WARNING: LinuxSignal: 538: ignored unexpected signal flags 0x2 (sig 17)

2016-11-21T17:35:27.152Z cpu1:33176)WARNING: Team.etherswitch: TeamES_Activate:668: Failed to initialize beaconing on portset 'pps': Not implemented.

2016-11-21T17:35:41.264Z cpu1:33416)WARNING: ScsiScan: 1408: Failed to add path vmhba1:C0:T0:L0 : Not found

2016-11-21T17:35:41.268Z cpu1:33416)WARNING: ScsiScan: 1408: Failed to add path vmhba1:C0:T1:L0 : Not found

2016-11-21T17:35:45.708Z cpu3:33176)WARNING: NetDVS: 547: portAlias is NULL

2) vmkernel:

0:00:00:05.226 cpu0:32768)Device: 220: Registered driver 'chardevlayer' from 0

0:00:00:05.227 cpu0:32768)Init: 881: Vmkernel initialization done.

0:00:00:05.227 cpu0:32768)VMKernel loaded successfully.

2016-11-21T20:32:14.139Z cpu1:32791)ScsiCore: 130: Starting taskMgmt watchdog world 32791

2016-11-21T20:32:14.139Z cpu2:32792)ScsiCore: 130: Starting taskMgmt watchdog world 32792

2016-11-21T20:32:14.140Z cpu3:32938)ScsiCore: 63: Starting taskmgmt handler world 32938/1

2016-11-21T20:32:14.140Z cpu3:32879)VSCSI: 2606: Starting reset handler world 32879/1

2016-11-21T20:32:14.140Z cpu2:32880)VSCSI: 2800: Starting reset watchdog

3)syslog:

2016-11-21T21:17:59Z localcli: IpmiIfcSensorGetReading: Skipping System Software sensor 0x7

2016-11-21T21:17:59Z localcli: IpmiIfcSensorGetReading: Skipping System Software sensor 0x6

2016-11-21T21:17:59Z localcli: IpmiIfcSensorGetReading: Skipping System Software sensor 0x5

2016-11-21T21:17:59Z localcli: IpmiIfcSensorGetReading: Skipping System Software sensor 0x4

Searching in google, i suspected that the issue could be related to this bug: ESXi 5.5 host server reboots unexpectedly followed by Uncorrectable Machine Check Exception error (2..., however, I didnt find the specific UMCE error. So, I dont know which could be the cause of this behavior.

I also suspect that the server could shutdown because of high temperature on room, or error 1215 on server screen, but this error only refers to a voltage level.

Any idea of this behavior' root cause  is greatly appreciated, since often the esxi logs are not conclusive.

Regards.

Tags (1)
Reply
0 Kudos
0 Replies