3 Replies Latest reply on Feb 26, 2008 12:30 PM by DFATAnt

    Hosts Disconnecting in VirtualCenter

    DFATAnt Hot Shot

       

      I have a VirtualCenter server that manages ESX servers located all around the world.  Our WAN links vary in size (and the links out of Australia are generally smaller than those in other countries), but they are sufficient for the ESX servers to connect to VirtualCenter and work.

       

       

      The problem is that occasionally I arrive at work in the morning to find that an ESX server at one of the remote sites is disconnected from VirtualCenter.  I would blame the WAN link, except that we have several ESX servers at the site and only one is disconnected.  I know the ESX server is still up and working, and the guests on it are also up and working.  I am also able to log in to the ESX server directly using the VirtualCenter client.

       

       

      The only way to get the ESX server connected back in to VirtualCenter is to disconnect it (there is no option to connect) and then connect it again.  Both the disconnect and the connect take a long time.  On occasion the connect doesn't work, and we have to either restart the mgmt-vmware service or, worse still, reboot the ESX server.

       

       

      This problem happened on VirtualCenter 2.0 and still happens on 2.5.

       

       

      Does anyone have any suggestions for fine tuning, or is there anything I might be doing that is causing this issue?

       

       

      Cheers

       

       

      Ant

       

       

        • 1. Re: Hosts Disconnecting in VirtualCenter
          serracon.support Enthusiast

           

          I have no real solution to that, but reading the "reboot server" sentence made me post a reply.

           

           

          When you stop the mgmt-vmware service, also send a stop to vmware-vpxa.

           

           

          Additionally, do a "ps aux|grep hostd" and kill any such processes, beginning with the watchdog.  Then do a "ps aux|grep vpxa" and kill any hung processes the same way.

           

           

          You can then start mgmt-vmware and vmware-vpxa, in that order.

           

           

          This might save most of the reboots.
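The sequence above could be wrapped in a small script along the following lines. This is only a sketch: the function names and the DRY_RUN switch are made up for illustration, it assumes the "service" command is available in the ESX service console, and it deliberately leaves the killing of leftover processes to be done by hand.

```shell
#!/bin/sh
# Sketch of the stop / check / start sequence described above.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

stop_agents() {
    # Stop the host agent first, then the VirtualCenter agent.
    run service mgmt-vmware stop
    run service vmware-vpxa stop
}

list_leftovers() {
    # $1 = process name to look for (hostd or vpxa).
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: ps aux | grep $1"
    else
        # grep -v grep drops the grep process itself from the listing.
        ps aux | grep "$1" | grep -v grep
    fi
}

start_agents() {
    # Start in the same order: mgmt-vmware, then vmware-vpxa.
    run service mgmt-vmware start
    run service vmware-vpxa start
}

case "${1:-}" in
    stop)
        stop_agents
        list_leftovers hostd   # kill anything listed by hand, watchdog first
        list_leftovers vpxa    # same for any hung vpxa process
        ;;
    start)
        start_agents
        ;;
    *)
        echo "usage: $0 stop|start"
        ;;
esac
```

Run it with "stop", kill whatever is still listed, then run it with "start". Splitting the two halves keeps the script from restarting the agents before hung processes have been dealt with.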

           

           

          • 2. Re: Hosts Disconnecting in VirtualCenter
            RParker Guru

             

            Your problem is DNS.  I had this problem, and eventually figured out that DNS was the cause.

             

             

            The hosts file on that machine should have both the short name and the FQDN for that ESX host.  /etc/sysconfig/network should also be updated to match.  The VC should then be able to ping the ESX host by its FQDN.
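As a rough illustration of the hosts file check, the sketch below tests whether a single line of a hosts-format file carries both the FQDN and the short name. The host name esx01.example.com, the address 192.0.2.10 and the function name has_both_names are all made up for the example. On Red Hat style systems the same FQDN would normally also appear on the HOSTNAME= line of /etc/sysconfig/network, but check that against your own host.

```shell
#!/bin/sh
# Succeeds if one line of the hosts-format file $1 lists both
# the FQDN ($2) and the short name ($3) after the IP address.
has_both_names() {
    awk -v fqdn="$2" -v short="$3" '
        {
            sub(/#.*/, "")                 # ignore comments
            f = 0; s = 0
            for (i = 2; i <= NF; i++) {    # field 1 is the IP address
                if ($i == fqdn)  f = 1
                if ($i == short) s = 1
            }
            if (f && s) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Demonstration against a throwaway file with hypothetical entries.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
127.0.0.1   localhost.localdomain localhost
192.0.2.10  esx01.example.com esx01
EOF

if has_both_names "$sample" esx01.example.com esx01; then
    echo "hosts entry has both names"
else
    echo "hosts entry is missing the FQDN or the short name"
fi
rm -f "$sample"
```

To check a real host, call has_both_names with /etc/hosts and the host's own FQDN and short name instead of the sample file.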

             

             

            After doing those updates, delete the certificates and reissue new ones.

             

             

            Delete or rename the files in /etc/vmware/ssl and run "service mgmt-vmware restart" to reissue the certificates.  If you do the updates in this order, you can save yourself a reboot and only restart the service once.
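A cautious way to do the rename step is sketched below. It renames every regular file in the directory with a timestamp suffix rather than deleting anything, so the old certificates can be put back. The function name, the ".bak-" suffix and the DRY_RUN switch are assumptions made for this example, not part of any VMware tooling.

```shell
#!/bin/sh
# Sketch: rename the certificate files, then restart the management agent.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
SSL_DIR="${SSL_DIR:-/etc/vmware/ssl}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rename_ssl_files() {
    # $1 = directory holding the certificate files
    stamp="$(date +%Y%m%d%H%M%S)"
    for f in "$1"/*; do
        [ -f "$f" ] || continue                   # skip anything that is not a regular file
        case "$f" in *.bak-*) continue ;; esac    # skip earlier backups
        run mv "$f" "$f.bak-$stamp"
    done
}

rename_ssl_files "$SSL_DIR"
# Per the advice above, restarting the service reissues the certificates.
run service mgmt-vmware restart
```

Review the dry-run output first, then rerun with DRY_RUN=0 once the list of files looks right.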

             

             

            Verify that the host has the proper DNS entries for its name and IP, that VC can reach the ESX host by name, and that the ESX host can reach the VC by FQDN.
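From the ESX side, the name checks could be scripted roughly as follows. This sketch assumes "getent hosts" is available in the service console; if it is not, set RESOLVE_CMD to something else such as "ping -c 1". The names on the example command line are hypothetical. The check from the VC side has to be done separately on the VC machine, for example with ping or nslookup.

```shell
#!/bin/sh
# Sketch: report whether each name given on the command line resolves.
# RESOLVE_CMD is the lookup command; "getent hosts" is an assumed default.
RESOLVE_CMD="${RESOLVE_CMD:-getent hosts}"

check_name() {
    # $1 = host name (short or FQDN) to look up
    if $RESOLVE_CMD "$1" >/dev/null 2>&1; then
        echo "OK   $1"
    else
        echo "FAIL $1"
        return 1
    fi
}

# Example: ./check_names.sh esx01.example.com esx01 vc.example.com
for name in "$@"; do
    check_name "$name" || true   # keep going after a failed lookup
done
```

Passing the ESX host's own short name and FQDN plus the VC's FQDN covers the host's half of the checks listed above.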

             

             

            That should fix it.

             

             

            • 3. Re: Hosts Disconnecting in VirtualCenter
              DFATAnt Hot Shot

               

              Thanks RParker,

               

               

              I had DNS set up correctly.  The only thing that I didn't have was the short name of the ESX host in the hosts file.  I think the thing that got the ESX server to connect back in to VirtualCenter was deleting the ssl files and having the certificates reissued.  That was much easier than doing a reboot (which is always the last resort).

               

               

              I'm still at a loss as to why this happened in the first place.  I'm not convinced that the lack of a short name in the hosts file would cause this.  I had another ESX server from the same site disconnect yesterday, after I posted the original message, but it was able to connect back to VirtualCenter without any issue, and it doesn't have its short name in the hosts file.

               

               

              Thanks again for the help.

               

               

              Cheers

               

               

              Ant