12 Replies Latest reply on Dec 11, 2017 12:40 AM by iganchev

    Migration of a virtual machine when hostd and vpxa are not responding

    iganchev Lurker

      Hello,

      We have a host, part of a cluster, with production virtual machines on it, on which the hostd and vpxa services are not responding. They cannot be killed or restarted, even when logged in as root on the host.

      My question is: is there a clean way to migrate the VMs with minimal downtime, since they are used in a production environment? The VMs are stored on shared disks accessible from the other hosts in the cluster.

       

      Regards,

      --

        • 1. Re: Migration of a virtual machine when hostd and vpxa are not responding
          bhards4 Enthusiast

          Hi,

           

          Currently the ESXi host is unresponsive because the vpxa and hostd services are not responding. In that state it is difficult to migrate the VMs from this ESXi host to another one, because the vpxa agent is unable to communicate with vCenter Server.

           

          1) One way is to restart the management services of the ESXi host (the vpxa and hostd services). As you said, you don't want to go that route.

           

          2) Another way is to reboot the ESXi host, in which case HA will restart all the VMs on other available ESXi hosts (subject to available resources on those hosts).

           

          3) As a last option, you can log in to the ESXi host directly, power off all the VMs, unregister them from this host, and re-register them on another ESXi host in the cluster that vCenter can still manage.

           

          -Sachin
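A minimal sketch of option 3 above, written so that it only prints a plan and executes nothing. It assumes that `localcli` can still list and stop VM processes on the affected host (it is used here in place of `esxcli` on the assumption that `esxcli` needs a working hostd), and that the target host sees the same shared datastore. The world ID `123456` and the `.vmx` path are hypothetical placeholders. The unregister step is left out because `vim-cmd` talks to hostd, which is the service that is hung here.

```shell
#!/bin/sh
# Sketch only: prints the commands for "option 3" instead of running them.
# Assumption: the world ID and .vmx path passed in are placeholders that
# must be replaced with real values from the environment.

# Commands intended for the unresponsive host.
# The first lists running VM processes and their world IDs,
# the second requests a soft stop of one VM by world ID.
plan_source_host() {
    world_id="$1"
    echo "localcli vm process list"
    echo "localcli vm process kill --type=soft --world-id=${world_id}"
}

# Command intended for a healthy host that can see the same datastore.
# It registers the VM from its .vmx file on that host.
plan_target_host() {
    vmx_path="$1"
    echo "vim-cmd solo/registervm ${vmx_path}"
}

plan_source_host 123456
plan_target_host /vmfs/volumes/datastore1/vm01/vm01.vmx
```

Where it is possible, shutting the guest down from inside its own operating system before stopping the VM process is the gentler route. The VM is down from the moment it is stopped until it is powered on again on the target host.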

          • 2. Re: Migration of a virtual machine when hostd and vpxa are not responding
            iganchev Lurker

            Thank you for the answer. I will see tomorrow which option is the most appropriate.

            • 3. Re: Migration of a virtual machine when hostd and vpxa are not responding
              daphnissov Virtuoso
              vExpert

              I'm curious to know what you have tried thus far to make the determination that they are impossible to rectify.

              • 4. Re: Migration of a virtual machine when hostd and vpxa are not responding
                iganchev Lurker

                Currently the host is listed as disconnected in the vSphere interface, but the VMs are still running.

                 

                When connected to the host, I see vpxa and hostd processes running. When I try to kill or restart those processes, I get the following:

                 

                [root@host:~] /etc/init.d/vpxa restart
                watchdog-vpxa: Terminating watchdog process with PID 67719
                sh: can't kill pid 67719: No such process
                

                 

                But:

                 

                [root@host:~] ps |grep vpxa
                231007   67719  vpxa-worker                                      
                67719    67719  vpxa                                             
                67735    67719  vpxa-worker
                

                 

                Same thing for hostd.

                 

                [root@host:~] ps -s | grep hostd
                67292    67292  hostd-worker                                       WAIT    LOCK    0-39
                68101    67292  hostd-worker                                       WAIT    LOCK    0-39
                68102    67292  hostd-worker                                       WAIT    LOCK    0-39
                68113    67292  hostd-worker                                       WAIT    LOCK    0-39
                68119    67292  hostd-worker                                       WAIT    LOCK    0-39
                68122    67292  hostd-worker                                       WAIT    LOCK    0-39
                68970    67292  hostd-worker                                       WAIT    LOCK    0-39
                68971    67292  hostd-worker                                       WAIT    LOCK    0-39
                212913   67292  hostd-worker                                       WAIT    LOCK    0-39
                119831   67292  hostd-worker                                       WAIT    LOCK    0-39
                1497266  67292  hostd-worker                                       WAIT    FS      0-39
                1497269  67292  hostd-worker                                       WAIT    LOCK    0-39
                1497273  67292  hostd-worker                                       WAIT    LOCK    0-39
                2842190  2842190  hostd                                              WAIT    LOCK    0-39
                2842192  2842190  hostd-worker                                       WAIT    UFUTEX  0-39
                2842193  2842190  hostd-worker                                       WAIT    UPOL    0-39
                2842194  2842190  hostd-worker                                       WAIT    UPOL    0-39
                2842195  2842190  hostd-worker                                       WAIT    UFUTEX  0-39
                2842197  2842190  hostd-worker                                       WAIT    UFUTEX  0-39
                2843309  2843309  hostdCgiServer                                     WAIT    UFUTEX  0-39
                1179146  67292  hostd-worker                                       WAIT    FS      0-39
                

                 

                And:

                 

                [root@host:~] kill -9 67292
                sh: can't kill pid 67292: No such process
                

                 

                I think that the situation is not recoverable, but any advice would be welcome.

                 

                Regards

                --
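The listings above can be summarised by grouping the lines on their second column. This is a small sketch, assuming that the first column is the world ID, the second the cartel (parent) ID, and the third the world name, as the quoted `ps` output suggests. The function name `count_by_cartel` is made up for illustration, and the sample is a shortened copy of the hostd listing.

```shell
#!/bin/sh
# Count how many worlds share each value in the second column of
# ps-style output read from standard input.
# Assumption: column 2 is the cartel (parent) ID.
count_by_cartel() {
    awk 'NF >= 3 { n[$2]++ }
         END { for (c in n) print c, n[c] }' | sort -n
}

# Shortened sample from the "ps -s | grep hostd" output in this thread.
count_by_cartel <<'EOF'
67292    67292  hostd-worker     WAIT    LOCK    0-39
68101    67292  hostd-worker     WAIT    LOCK    0-39
1497266  67292  hostd-worker     WAIT    FS      0-39
2842190  2842190  hostd          WAIT    LOCK    0-39
2842192  2842190  hostd-worker   WAIT    UFUTEX  0-39
2843309  2843309  hostdCgiServer WAIT    UFUTEX  0-39
EOF
```

On the full hostd listing the same grouping shows two distinct parent values, 67292 and 2842190, plus the separate hostdCgiServer entry, which makes it easier to see which group of workers is stuck.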

                • 5. Re: Migration of a virtual machine when hostd and vpxa are not responding
                  daphnissov Virtuoso
                  vExpert

                  What version and build of ESXi are you running here?

                  • 7. Re: Migration of a virtual machine when hostd and vpxa are not responding
                    daphnissov Virtuoso
                    vExpert

                    That's the version, what is the build?

                    • 9. Re: Migration of a virtual machine when hostd and vpxa are not responding
                      daphnissov Virtuoso
                      vExpert

                      Try services.sh restart first and see if the watchdog brings them down.

                      • 10. Re: Migration of a virtual machine when hostd and vpxa are not responding
                        msripada Hot Shot

                        I don't recommend services.sh restart: vpxa is already reporting "No such process", so it is a zombie now and requires a reboot. If there are existing storage issues or LACP is configured, services.sh restart can lead to other, bigger problems.

                         

                        Thanks,

                        MS

                        • 11. Re: Migration of a virtual machine when hostd and vpxa are not responding
                          daphnissov Virtuoso
                          vExpert

                          There are other options. Calling watchdog.sh -r hostd, or the same command with vpxa substituted, should also be tried.

                          • 12. Re: Migration of a virtual machine when hostd and vpxa are not responding
                            iganchev Lurker

                            Hello,

                             

                            Thanks, everyone, for the advice. The situation was unsolvable, so we proceeded with a reboot.

                             

                            --

                            IG