VMware Cloud Community
andvm
Hot Shot

Host hardware sensor state Errors

Hi,

In Events I am noticing multiple repeating alarms/log entries on several identical servers in different clusters, such as:

type memory.Description. Memory Device...state deassert..

type voltage.Description Processor ..state deassert..

type systemBoard.Description System Board..state deassert..

type temperature.Description System Board...state deassert..

Could these be the cause, or part of the cause, of hosts starting to disconnect from vCenter because hostd becomes unresponsive?

ESXi is 6.7 U3 on Dell servers; iDRAC shows everything green/healthy.

3 Replies
bbalido9
Contributor

Hi,

If you are getting alerts related to hardware errors, please engage your hardware vendor to run hardware diagnostics.

Additionally, please check hostd.log for "IPMI SEL unavailable".

This condition can make the hostd service unresponsive, which in turn causes the host to show as disconnected from vCenter.
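
A rough sketch of that log check, run from an SSH session on the host. It assumes the log is at /var/log/hostd.log (the usual ESXi location); the function name check_ipmi_sel is made up for illustration, and you can pass a different path if you copied the log elsewhere.

```shell
#!/bin/sh
# Count and show "IPMI SEL unavailable" entries in a hostd log.
# Assumption: the log lives at /var/log/hostd.log unless a path is given as $1.
# check_ipmi_sel is a hypothetical helper name, not a VMware tool.
check_ipmi_sel() {
    log="${1:-/var/log/hostd.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # -i: case-insensitive, -c: print the number of matching lines
    count=$(grep -ic "IPMI SEL unavailable" "$log" || true)
    echo "$count matching line(s) in $log"
    # Show the most recent matches for context
    grep -i "IPMI SEL unavailable" "$log" | tail -n 5
    # Exit status 0 only if at least one match was found
    [ "$count" -gt 0 ]
}
```

Calling check_ipmi_sel with no argument checks the default log; a non-zero return just means the string was not found (or the file was unreadable).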

Hope this helps.

Balido

mk112
Contributor

We have the same issue on some of our Dell R740 servers.

[Screenshot: Host_Hardware_sensor_state.jpg]

As you can see in the screenshot, the "Host hardware sensor state" alarm is triggered for many different sensor types at the same time. The hardware itself has no problem, so these are false-positive alarms.

We are using VMware ESXi, 6.5.0 U3, Build 15256549 with vCenter 6.7 U3.

We have no idea what the root cause of this problem is.

Sukanyad
VMware Employee

Can you try resetting the sensors and restarting the sfcbd-watchdog service?

To clear warnings and errors:

  1. Click the Hardware Status tab.
  2. Click the System event log view.
  3. Click Reset event log.
  4. Click Update to clear the error.
  5. Click the Alerts and warnings view.
  6. Click Reset sensors.
  7. Click Update to clear the memory.

If the issue persists, restart the management agents:

  1. Connect to the ESX/ESXi host using SSH.
  2. Run this command to restart the sfcbd service:

    In ESX: /etc/init.d/sfcbd-watchdog restart
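
If it helps, here is a small wrapper around that restart, as a sketch only. It assumes the init script also accepts a status argument (not confirmed in this thread), and restart_agent is a made-up helper name.

```shell
#!/bin/sh
# Restart an init-script based service, then report its status.
# Default is the sfcbd-watchdog script from the step above; pass another script path as $1.
# Assumption: the script supports "status" as well as "restart".
# restart_agent is a hypothetical helper name.
restart_agent() {
    script="${1:-/etc/init.d/sfcbd-watchdog}"
    if [ ! -x "$script" ]; then
        echo "not found or not executable: $script" >&2
        return 1
    fi
    # Stop here if the restart itself fails
    "$script" restart || return 1
    "$script" status
}
```

Running restart_agent with no argument targets sfcbd-watchdog; check afterwards whether the sensor alarms come back.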