VMware Cloud Community
Nikeshck
Contributor

ESXi 5.1.0 host restarts automatically every 10-15 days

Hi, I am facing an issue with ESXi 5.1.0: the host restarts automatically every 10-15 days. Some errors are generated in the logs. Could you please help me find out whether there is a problem in ESXi?

The logs are attached. ESXi is installed on an HP BL 680c server.

Alistar
Expert

Hello,

I am sorry, but this log doesn't tell us anything - can you please post vmkernel.log and vmkwarning.log, or ideally the whole vm-support bundle?

Stop by my blog if you'd like 🙂 I dabble in vSphere troubleshooting, PowerCLI scripting and NetApp storage - and I share my journeys at http://vmxp.wordpress.com/
FritzBrause
Enthusiast

An ESXi host does not normally restart by itself.

If there is a problem, it either PSODs (software bug or hardware fault) or hangs (for example, because of a memory leak).

Please follow http://kb.vmware.com/kb/1019238 to gather more details.


So the restart could be triggered from "outside" ESXi, for example by hardware monitoring (temperature) or perhaps by iLO (I'm not sure).

The syslog.log you supplied does not contain any errors or warnings (only one line mentions an error, and it is unrelated).

For the hp-ams bug (version 9.60), you would normally see "cannot allocate memory" errors in syslog.log, hostd.log, vmkernel.log and vmkwarning.log.

Check those logs for any issues before the restart.
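If you want to search the logs for that message without reading them by hand, here is a minimal Python sketch. It assumes the logs have been copied off the host (for example from a vm-support bundle) to wherever you run Python. The helper names `find_alloc_errors` and `scan_files` are made up for this example, and the case-insensitive match is my assumption.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Phrase quoted in the post above; matching case-insensitively is an assumption.
PATTERN = re.compile(r"cannot allocate memory", re.IGNORECASE)


def find_alloc_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains the phrase."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


def scan_files(paths: Iterable[str]) -> Dict[str, Optional[List[Tuple[int, str]]]]:
    """Scan each file; a value of None means the file could not be read."""
    results: Dict[str, Optional[List[Tuple[int, str]]]] = {}
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                results[path] = find_alloc_errors(handle)
        except OSError:
            results[path] = None
    return results
```

For example, `scan_files(["syslog.log", "hostd.log", "vmkernel.log", "vmkwarning.log"])` returns a dictionary keyed by file name. Hits that appear shortly before a restart are the ones worth looking at.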



Nikeshck
Contributor

Hi,

Thanks for your reply. Please find the logs attached. An iLO event log entry was also generated at the same time:

iLO 3","12/27/2014 16:26","12/27/2014 16:26","1","Server power loss caused by: voltage regulator. Attempt to restore server power in 8

But iLO shows no other hardware-related error. I have already reviewed all the hardware-related logs with an HP engineer, who confirmed that they show no hardware issue.
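To line up iLO events with the restart times, the event log can be filtered for power-related entries. The sketch below is only an illustration: it assumes the iLO event log was exported as CSV, and that the columns are source, initial update, last update, count and description, which is a guess based on the single fragment quoted above. The function name `power_events` is hypothetical.

```python
import csv
import io
from datetime import datetime
from typing import List, Tuple


def power_events(csv_text: str, keyword: str = "power loss") -> List[Tuple[datetime, str]]:
    """Return (last_update, description) for rows whose description mentions keyword."""
    events = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 5:
            continue  # not a full event row under the assumed layout
        last_update, description = row[2], row[4]
        if keyword not in description.lower():
            continue
        try:
            # Timestamp format taken from the quoted fragment: MM/DD/YYYY HH:MM
            when = datetime.strptime(last_update, "%m/%d/%Y %H:%M")
        except ValueError:
            continue  # skip rows whose timestamp does not match the assumed format
        events.append((when, description))
    return events
```

Comparing the returned timestamps with the boot times recorded in the ESXi logs shows whether each restart coincides with a power event reported by iLO.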

FritzBrause
Enthusiast

Upgrade the BIOS to the latest available version.

And change the iLO "Power Regulator Settings" from "HP Dynamic Power Savings Mode" to "HP Static High Performance Mode".

Nikeshck
Contributor

Yes, I have already done both, but the same issue keeps recurring.

Alistar
Expert

Hi again,

Unfortunately, the logs are flooded with

2015-01-06T11:58:56.334Z cpu6:563375)WARNING: VMW_SATP_ALUA: satp_alua_getTargetPortInfo:91:Could not find target port group ID for path "vmhba5:C0:T0:L1" - Not found (195887107)

2015-01-06T11:58:56.334Z cpu6:563375)WARNING: NMP: nmp_SatpClaimPath:2093:SATP "VMW_SATP_ALUA" could not add  path "vmhba5:C0:T0:L1" for device "Unregistered Device". Error Not found

2015-01-06T11:58:56.334Z cpu6:563375)WARNING: NMP: nmp_DeviceAlloc:1228:nmp_AddPathToDevice failed Not found (195887107).

2015-01-06T11:58:56.334Z cpu6:563375)WARNING: NMP: nmp_DeviceAlloc:1237:Could not allocate NMP device.

and

2015-01-06T08:59:50.693Z cpu25:16409)WARNING: ScsiDeviceIO: 1211: Device naa.600508b1001cfe5b829392c4873ecc83 performance has deteriorated. I/O latency increased from average value of x microseconds to y microseconds.

so I suggest checking your storage configuration - there is really nothing else in the logs besides these warnings.
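To see which component is producing most of the noise, the warnings can be tallied per component. This is a rough Python sketch, not an official parser: the line layout is inferred only from the excerpts quoted above, and `count_warnings` is a name made up for this example.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line shape, inferred only from the excerpts quoted above:
#   <timestamp> cpu<N>:<world id>)WARNING: <component>: <message>
WARNING_RE = re.compile(r"^\S+ cpu\d+:\d+\)WARNING: ([^:]+):")


def count_warnings(lines: Iterable[str]) -> Counter:
    """Count WARNING lines per component (for example NMP or ScsiDeviceIO)."""
    counts: Counter = Counter()
    for line in lines:
        match = WARNING_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Running it over a copy of the log, for instance `count_warnings(open("vmkernel.log", errors="replace")).most_common()`, lists the components from most to least frequent, which makes it easier to spot anything unusual hidden behind the storage warnings.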

Anyway, if iLO reports a problem with the voltage regulator, that points to a faulty motherboard. The Voltage Regulator Module supplies the CPU with stable voltage, so the CPU cannot run reliably without it. We have had several cases with HP in the past where they were unresponsive to proven hardware errors that their useless diagnostics software didn't reveal. Chase them to replace your motherboard and, if possible, get higher management or your Technical Account Manager involved.

Nikeshck
Contributor

Thank you for your valuable reply 🙂
