For some reason I have warnings on my 4 ESXi hosts.
On 3 of them, when I go to Alarms, Warnings, or Hardware Status, I don't see anything, but they all show the yellow exclamation warning sign on the host. We have a 90-day evaluation license with 60 days left. The hosts are 3 ProLiant Gen8 blades and 1 Gen7 blade. ESXi and vSphere are all 5.5.
Right now the one Gen8 that does show something in the Hardware Status monitor is driving me crazy. I go to the HP Onboard Administrator and run health checks and hardware tests, and it shows everything is good and all lights are green (perfect health).
Please see the screenshot below and let me know if you can help me out.
Exactly the same problem here, gents.
HP 460c Gen8 blades with the latest SPP (February 2014) and ESXi 5.5 U1.
Onboard Administrator 4.21 is not reporting any errors.
I have tried resetting the sensors but nothing happens.
VMware, please advise.
Try restarting the host. I have seen this a few times and a reboot has solved it.
Well, I had HP support USA connect to my blades yesterday. They restarted almost all the watchdogs, vpxa, hpsum and agents, and the notification went away (on that blade and on one more blade where another sensor was reporting an issue).
I guess restarting the blade would probably have the same effect, but this way we avoided having to vMotion the VMs to other blades and crowd them (we especially wanted to avoid moving the ones using RDMs).
Anyway, it looks OK for now. I hope the notification never comes back.
We are experiencing the same issues across multiple clusters and data centers. We have put in calls with VMware and HP support, and so far VMware has only told us to try the /etc/init.d/sfcbd-watchdog restart that Marcus mentioned. Although this cleared the alert, it was only temporary. HP ran hardware tests, found no issues with the hardware, and had nothing else to offer as a solution, pointing the blame back at VMware. We are receiving not only the same temperature alerts, but also alerts about Storage, Logs and System Chassis 3 Enclosure Asserts. We have reopened the case with VMware and will update if we get anywhere past this. Has anyone else had any success? This is pretty ridiculous.
We are experiencing the same issue on our blade infrastructure with HP firmware 4.01, even after updating ESXi to 5.5.0, build 1892794.
This issue is not occurring on our “older” blades with HP firmware 3.71.
Try this: # localcli hardware ipmi sel clear (from "my techdirt: false GPU overheating warning on esxi 5.5").
Or in the vSphere Client, select the host, open Configuration -> Security Profile, click the Properties... link in the Services section, and restart CIM Server. You can try Update in the Hardware Status pane, but it is much faster to log out of the client and log in again.
Or open an SSH session and run the command /etc/init.d/sfcbd-watchdog restart.
It worked for me.
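For anyone who wants the two SSH commands from this thread in one place, here is a minimal sketch, assuming it is run in the ESXi shell on the affected host. The DRY_RUN variable and the run and reset_hw_alerts helpers are made up for illustration and are not part of ESXi. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: clear the IPMI System Event Log and restart the CIM
# watchdog, the two commands quoted in this thread.
# DRY_RUN, run() and reset_hw_alerts() are illustrative helpers, not ESXi tools.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_hw_alerts() {
    # Clear stale sensor events from the IPMI SEL
    run localcli hardware ipmi sel clear
    # Restart the CIM server watchdog so Hardware Status is re-read
    run /etc/init.d/sfcbd-watchdog restart
}

reset_hw_alerts
```

Run it as-is to see the commands, then set DRY_RUN=0 to execute them. As others have reported here, the alert may come back, so treat this as a workaround rather than a fix.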
I am also facing the same problem on multiple ESXi 5.5.0 (build 1746018) hosts, all installed on HP ProLiant BL460c Gen8 blades. Clearing the IPMI logs with localcli hardware ipmi sel clear or resetting the sensors no longer helps here. From what I have found, this may be a false alert that can occur with the above hardware and OS combination:
I understand that restarting the watchdog/management service or the ESXi host might fix it temporarily, but this is not the right approach for a production environment.
Can somebody shed some light on this?
I updated all my hosts from 5.5 (no update) to 5.5 Update 2 (HP image) about a week ago and have not seen any error messages since, and I had a LOT of error messages before.
We are running HP BL460c Gen 8 with a fresh install of HP ESXi 5.5 U2 (build 2068190). A service restart of Watchdog (CIM Server) resolved the alarm.
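If you want to check which hosts are still below the U2 build mentioned above, something like this rough sketch might help. It assumes the version string looks like "VMware ESXi 5.5.0 build-NNNNNNN" (the format printed by vmware -v), and the U2_BUILD, build_of and at_least_u2 names are made up for illustration. The build threshold is simply the number quoted in this thread, not an official cutoff for the fix.

```shell
#!/bin/sh
# Sketch only: compare a host's build number with the HP ESXi 5.5 U2
# build reported in this thread. All names here are illustrative.
U2_BUILD=2068190

build_of() {
    # Expects a string such as "VMware ESXi 5.5.0 build-1746018"
    echo "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

at_least_u2() {
    b="$(build_of "$1")"
    # Fails if no build number was found or if it is below U2_BUILD
    [ -n "$b" ] && [ "$b" -ge "$U2_BUILD" ]
}

# On a real host you would pass "$(vmware -v)"; the sample strings below
# use build numbers mentioned in this thread.
for v in "VMware ESXi 5.5.0 build-1746018" "VMware ESXi 5.5.0 build-2068190"; do
    if at_least_u2 "$v"; then
        echo "$v: at or above the U2 build"
    else
        echo "$v: below the U2 build"
    fi
done
```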