4 Replies Latest reply on Jun 23, 2020 4:40 AM by spfma

    vSphere 7.0 : ESXI not responding

    spfma Lurker

      Hi,

       

      I am setting up a brand new cluster of four ESXi hosts with plenty of resources, a 10G network, ...

      All the hypervisors have been configured with a script, so there are no differences or missing items between them.

      But I have a problem with one of them: if I reboot it, it does not reconnect once the reboot completes. And when I reconnect it manually, I still get a warning, "Cannot synchronize host". After some time, it goes to the "not responding" state, although the ESXi host and its VMs are actually running fine.

       

      What does this error really mean?

       

      NTP seems to be working: all hosts and the VCSA show the same time.
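
      The time check above can be made quantitative. A minimal sketch, assuming the clock of each host and of the appliance has been read at (roughly) the same moment and collected by hand; the names `esx01`, `esx02` and `vcsa` are hypothetical:

```python
from datetime import datetime, timezone


def max_clock_skew(readings):
    """Return (skew_seconds, earliest_name, latest_name) for a dict
    mapping a machine name to the clock value read on that machine."""
    earliest = min(readings, key=readings.get)
    latest = max(readings, key=readings.get)
    skew = (readings[latest] - readings[earliest]).total_seconds()
    return skew, earliest, latest


if __name__ == "__main__":
    # Hypothetical readings, not taken from the cluster described here.
    readings = {
        "esx01": datetime(2020, 6, 23, 4, 40, 0, tzinfo=timezone.utc),
        "esx02": datetime(2020, 6, 23, 4, 40, 2, tzinfo=timezone.utc),
        "vcsa": datetime(2020, 6, 23, 4, 40, 1, tzinfo=timezone.utc),
    }
    print(max_clock_skew(readings))
```

      A skew of a second or two mostly reflects how the readings were collected; a skew of minutes would be worth investigating.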

      It is a dedicated infrastructure under test, so there is no network or compute overload.

       

      I have disconnected and reconnected the host, and removed it and added it again, but it never returns to a reliable state.

       

      Any ideas?

       

      Regards 

        • 1. Re: vSphere 7.0 : ESXI not responding
          bbalido9 Novice

          Hi,

           

          Please check the hostd.log, vpxa.log and vmkernel.log files to understand why the host is going into the "not responding" state.

          Otherwise, please upload the files, along with the timestamp of the issue, for review.

          • 2. Re: vSphere 7.0 : ESXI not responding
            spfma Lurker

            Hi,

             

            I still haven't found any clues, but it reminds me of problems I have read about here, although I can't find those posts again.

             

            Everything works fine until I reboot the server. Of course it gets disconnected, and it doesn't reconnect automatically (should it?) when it is back online. The reboot takes a long time, as the server has around 700 GB of RAM.

             

            If I connect it manually, it will disconnect soon after.

             

            But if I remove it from the inventory and add it again, it connects and stays connected (I left it like that for a couple of days).

             

            Of course, on the next reboot, same mess ... and only on this one machine.

             

            I might end up reinstalling it, but I would like to understand what's happening.

            • 3. Re: vSphere 7.0 : ESXI not responding
              bbalido9 Novice

              Hi,

               

              I would advise checking vpxa.log to see whether the host is losing its heartbeat to vCenter.

              Secondly, the fact that removing the host from the vCenter inventory and adding it back helps points to the vCenter database entry where the host is registered and given a unique ID. The issue might be a stale or duplicate ID pointing to that ESXi host, which would require validation or cleanup within the vCenter DB.

               

              PS: A log bundle from both the ESXi host and vCenter will be required to understand exactly what happens when the host is pushed out or gets disconnected from vCenter.

               

              I would ask you to collect the log bundle even if you end up reinstalling the host.
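
              As a rough way to look for heartbeat trouble in vpxa.log, the sketch below counts, per minute, the lines that mention a keyword. The keyword "heartbeat", the leading ISO 8601 timestamp and the function name `keyword_hits_per_minute` are all assumptions; what vpxa actually logs depends on the version and log level.

```python
from collections import Counter
import re

# Assumption: lines begin with a timestamp like 2020-06-23T04:40:12.345Z;
# only the part up to the minute is captured and used as the bucket key.
MINUTE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2})")


def keyword_hits_per_minute(lines, keyword="heartbeat"):
    """Count lines containing `keyword` (case-insensitive), per minute."""
    hits = Counter()
    needle = keyword.lower()
    for line in lines:
        if needle not in line.lower():
            continue
        m = MINUTE_RE.match(line)
        if m:
            hits[m.group(1)] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, not real vpxa.log output.
    sample = [
        "2020-06-23T04:40:01.000Z warning Heartbeat example A",
        "2020-06-23T04:40:30.000Z info heartbeat example B",
        "2020-06-23T04:41:00.000Z info unrelated line",
        "2020-06-23T04:42:05.000Z error HEARTBEAT example C",
    ]
    for minute, count in sorted(keyword_hits_per_minute(sample).items()):
        print(minute, count)
```

              A cluster of hits around the time the host drops to "not responding" would be a good place to start reading the surrounding lines.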

               

              Thanks,

               

              Balido

              • 4. Re: vSphere 7.0 : ESXI not responding
                spfma Lurker

                Hi,

                 

                For the time being, I give up.

                 

                I don't see anything helpful in the logs, mostly because I don't really know what the error messages mean (some seem to be harmless even though they are marked as errors).

                I have extracted the logs from the export bundle and anonymized them, but the content is genuine.

                 

                Just for science's sake, I decided to restart the vCenter Server appliance, and guess what: everything was all green once I was able to log in again. But after rebooting one ESXi host, back to reality!

                 

                Regards