3 Replies Latest reply on Jun 29, 2020 4:22 AM by Sukanyad

    Host hardware sensor state Errors

    andvm Enthusiast

      Hi,

       

      In Events I am noticing multiple repeating alarms/log entries on different but identical servers, in different clusters, such as:

       

      type memory.Description. Memory Device...state deassert..

      type voltage.Description Processor ..state deassert..

      type systemBoard.Description System Board..state deassert..

      type temperature.Description System Board...state deassert..

       

      I am wondering whether these are the cause, or part of the cause, of hosts starting to disconnect from vCenter because hostd becomes unresponsive.

       

      ESXi is 6.7U3 on Dell Servers, iDRAC all green/Healthy

        • 1. Re: Host hardware sensor state Errors
          bbalido9 Novice

          Hi,

           

          If you are getting alerts related to hardware errors, please engage your hardware vendor to perform hardware diagnostics.

          Additionally, please check hostd.log for "IPMI SEL unavailable".

          This condition can make the hostd service unresponsive, which in turn causes the host to show as disconnected in vCenter.
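
          A minimal sketch of that log check, run from an SSH session on the host. It assumes hostd.log is at /var/log/hostd.log, the usual ESXi location; the function name count_sel_errors is made up for illustration, and a different path can be passed as the first argument.

```shell
#!/bin/sh
# Count the lines in hostd.log that mention the IPMI SEL message quoted above.
# Assumption: the log is at /var/log/hostd.log unless another path is given.
count_sel_errors() {
    log_file="${1:-/var/log/hostd.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # -c prints the number of matching lines, -i ignores case.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function's exit status at 0 for a clean log.
    grep -ci "IPMI SEL unavailable" "$log_file" || true
}
```

          Calling count_sel_errors with no argument prints the number of matching lines. A count of 0 suggests this particular message is not behind the disconnects; a non-zero count is worth raising with the hardware vendor or VMware support.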

           

          Hope this helps.

           

          Balido

          • 2. Re: Host hardware sensor state Errors
            mk112 Lurker

            We have the same issue on some of our Dell R740 servers.

             

            Host_Hardware_sensor_state.jpg

             

            As you can see in the screenshot, the "Host hardware sensor state" alarm is triggered for many different sensor types at the same time. The hardware itself has no problem, so these are false positive alarms.

             

            We are using VMware ESXi, 6.5.0 U3, Build 15256549 with vCenter 6.7 U3.

             

            We have no idea what the root cause of this problem is.

            • 3. Re: Host hardware sensor state Errors
              Sukanyad Enthusiast
              VMware Employees

              Can you try resetting the sensors and restarting the sfcbd-watchdog service?

               

              To clear warnings and errors:

              1. Click the Hardware Status tab.
              2. Click the System event log view.
              3. Click Reset event log.
              4. Click Update to clear the error.
              5. Click the Alerts and warnings view.
              6. Click Reset sensors.
              7. Click Update to clear the memory.

              If the issue persists, restart the management agents:

              1. Connect to the ESX/ESXi host using SSH.
              2. Run this command to restart the sfcbd service:

                In ESX: /etc/init.d/sfcbd-watchdog restart