I am looking to disable a false-positive alarm on a single host.
I have one host in a constant Alarm state, caused by a "Hardware Fan" alert for a non-existent fan!
The server itself has 12 fan slots, but only 5 populated. For some reason, vCenter is alerting on the "missing" fans.
The Alarm is defined at the vCenter level, so I can't disable it on the specific host - nor do I want to disable the entire alarm since I'd then miss an alarm if one of the existing fans really does fail.
Any thoughts on how to address it?
The host in question is an Intel S2600GZ, and the RMM on the host itself is *NOT* complaining of any issues (i.e., it's not just VMware passing on an alarm from the host hardware).
Not sure of the following - so assuming ESX/ESXi builds etc but need to know
Try the following and see what works
I can clear the alarm and it comes back - not right away, but it always does.
This host has been reformatted and had BIOS updates, and even moved from one vCenter to another vCenter.
This pretty much rules out a configuration issue. The problem lies with the hardware, which is incorrectly reporting fan state to the ESX layer. Does the board documentation require that all fan slots be populated? If not, is there a jumper setting or something similar that needs to be changed? If not, the next logical step would be to ask the hardware vendor whether this is normal behaviour when not all fan slots are in use.
I would suggest checking with the hardware vendor to see whether they have a fix for it.
I have seen a similar issue with the IBM HS22V, where ESXi reports an incorrect BIOS battery voltage.
I will check with Intel, but as I mentioned, the board monitors are "all green" when I look at the RMM using the web interface.
I can see in the RMM which fans are present and which are not, and those in the system are all spinning at the same RPM.
Adding to the confusion, fans #1-6 all have a fan-A and a fan-B, and all 5 existing fans have only fan-A present. So again, I am confused as to why VMware would complain only about fan 5, when 1-4 are exactly the same (I could imagine it complaining that fan-B is missing on all 5 slots - that would make sense).
ESX essentially takes its hardware information from CIM providers, which in turn pull the information from the BMC. So the fault is in the motherboard, the sensors, the BMC, or the CIM providers, and you would need to isolate it in that order.
I don't see anyone else using the same hardware having run into this with ESX, and I haven't seen related discussions on the Intel forums, for ESX or otherwise, which again points to something in this specific device/hardware.
You could check the diagnostic lights on the system board to see if they give you a clue.
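To compare what ESX reports against what the RMM shows, it can help to dump only the fan sensors that vCenter sees for the host. The following is a minimal sketch, not a tested tool: the helper names `fan_sensors` and `suspect_fans` and the sample sensor names are made up for illustration. It assumes sensor objects shaped like the vSphere API's numeric sensor entries (`name`, `sensorType`, `healthState.key`, `currentReading`, `unitModifier`); the demo uses stand-in objects so it runs without a vCenter connection.

```python
from types import SimpleNamespace


def fan_sensors(numeric_sensors):
    """Return (name, health_key, reading) for every sensor typed as a fan."""
    rows = []
    for sensor in numeric_sensors:
        if (sensor.sensorType or "").lower() != "fan":
            continue
        # The raw reading is scaled by 10 ** unitModifier.
        reading = sensor.currentReading * 10 ** sensor.unitModifier
        rows.append((sensor.name, sensor.healthState.key, reading))
    return rows


def suspect_fans(rows):
    """Return the fan rows whose health state is anything other than green."""
    return [row for row in rows if row[1].lower() != "green"]


def _sensor(name, sensor_type, health, reading, modifier=0):
    # Stand-in for a real sensor object; all values here are invented.
    return SimpleNamespace(
        name=name,
        sensorType=sensor_type,
        healthState=SimpleNamespace(key=health),
        currentReading=reading,
        unitModifier=modifier,
    )


demo_sensors = [
    _sensor("System Fan 1A", "fan", "green", 4800),
    _sensor("System Fan 5B", "fan", "red", 0),
    _sensor("Baseboard Temp", "temperature", "green", 2900, -2),
]
demo_rows = fan_sensors(demo_sensors)
demo_suspects = suspect_fans(demo_rows)
```

With pyVmomi, the real list would presumably come from `host.runtime.healthSystemRuntime.systemHealthInfo.numericSensorInfo` on the affected host. If the sensor flagged there is one the RMM shows as absent, that points at the BMC-to-CIM translation rather than at the fan itself.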
The answer to this problem should be simple. Disable the fan alarm for this particular host.
Since it is not getting detected correctly, the solution is to disable the alarm, or your host will have the "Check Engine Light" on at all times, blinding the administrator to real issues that may have cropped up outside of the bogus fan alert.
I see no way to do this for a single host, although you can do it at the vCenter level. The attached screenshot shows how to edit and then disable the alarm. It would probably not be wise to disable this particular alarm for all the hosts in the vCenter.
In our case we have a bad sensor, so I have to live with the check engine light, unfortunately. I am guessing I could probably write a script to constantly look for and disable the warning.