3 Replies Latest reply on Oct 1, 2009 11:20 AM by beyondvm

    HA Algorithm used to restart VMs

    campbellabbey Lurker


      I have a question about HA: what algorithm (parallel or serial) does it use to redistribute VMs across the remaining hosts in a cluster when the host that previously housed them has gone down?



      For example, if a host fails and that host has 20 VMs on it, how does HA determine:


      • i) Which host(s) to restart the failed VMs on?

      • ii) Whether to restart VMs sequentially or in parallel, and are there any 'number rules'? For example, can it start only 5 VMs on the first host it selects for recovery, then another 5 on another host, and so on? How does the algorithm essentially work?








        • 1. Re: HA Algorithm used to restart VMs
          depping Champion
          User Moderators, VMware Employees

          HA starts VMs to the best of its knowledge and in the startup order specified in the HA cluster configuration. In other words, HA nodes share resource usage every once in a while, and this is how HA decides where to start VMs. It does not actively use DRS at all during this process. DRS kicks in as soon as the VM has started, though. I'm not sure what the parallel boot limit is, but from memory it starts them "serially" anyway.






          VMware Communities User Moderator | VCP | VCDX


          Blogging: http://www.yellow-bricks.com

          Twitter: http://www.twitter.com/DuncanYB (*NEW*)

          Available Soon: vSphere Quick Start Guide (http://www.yellow-bricks.com/2009/08/12/new-book-in-town-vsphere-quick-start-guide/)



          • 2. Re: HA Algorithm used to restart VMs


            HA uses a simple placement algorithm to decide which hosts to fail VMs over to. It sorts the hosts in order of percentage of unreserved capacity (memory and CPU). Then it sorts the VMs in order of priority and, for each VM, picks the first host in the list that has sufficient resources to satisfy the VM's memory and CPU reservations.



            Failover of VMs is done in parallel, not serially.
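To make the placement description in this reply concrete, here is a minimal sketch in Python. It is not VMware's code. The class names, field names, and units are hypothetical. It also assumes three things the post does not state: that the host with the most unreserved capacity comes first, that CPU and memory percentages are combined by taking the smaller of the two, and that a lower priority number means "restart first". It models only the choice of host, not the parallel restart itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    cpu_total: int          # MHz (hypothetical unit)
    mem_total: int          # MB (hypothetical unit)
    cpu_reserved: int = 0
    mem_reserved: int = 0

    def unreserved_pct(self) -> float:
        # Assumption: combine CPU and memory by taking the scarcer of the two.
        cpu = (self.cpu_total - self.cpu_reserved) / self.cpu_total
        mem = (self.mem_total - self.mem_reserved) / self.mem_total
        return min(cpu, mem)

    def can_fit(self, vm: "VM") -> bool:
        # Only reservations are checked, as described in the post.
        return (self.cpu_total - self.cpu_reserved >= vm.cpu_reservation
                and self.mem_total - self.mem_reserved >= vm.mem_reservation)


@dataclass
class VM:
    name: str
    priority: int           # assumption: lower number restarts first
    cpu_reservation: int = 0
    mem_reservation: int = 0


def place(vms: List[VM], hosts: List[Host]) -> Dict[str, Optional[str]]:
    """Map each VM name to a host name, or None if no host can satisfy it."""
    # Hosts are sorted once, most unreserved capacity first (assumed direction).
    ordered_hosts = sorted(hosts, key=lambda h: h.unreserved_pct(), reverse=True)
    placement: Dict[str, Optional[str]] = {}
    for vm in sorted(vms, key=lambda v: v.priority):
        placement[vm.name] = None
        for host in ordered_hosts:
            if host.can_fit(vm):
                # Book the reservation so later VMs see the reduced capacity.
                host.cpu_reserved += vm.cpu_reservation
                host.mem_reserved += vm.mem_reservation
                placement[vm.name] = host.name
                break
    return placement


hosts = [
    Host("A", cpu_total=10000, mem_total=10000, cpu_reserved=5000, mem_reserved=5000),
    Host("B", cpu_total=10000, mem_total=10000, cpu_reserved=1000, mem_reserved=1000),
]
vms = [
    VM("vm1", priority=1, cpu_reservation=8000, mem_reservation=2000),
    VM("vm2", priority=0, cpu_reservation=2000, mem_reservation=2000),
    VM("vm3", priority=2, cpu_reservation=6000, mem_reservation=1000),
]
result = place(vms, hosts)
print(result)
```

In this made-up example, host B is tried first because it has more unreserved capacity, and vm1 ends up unplaced because neither host can cover its CPU reservation once vm2 has been placed.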









            • 3. Re: HA Algorithm used to restart VMs
              beyondvm Hot Shot

              The first thing to remember is that HA's primary goal is to get all of your VMs back online. If it can do that, it will, as fast as possible. In my testing, my environment started the VMs in near parallel because the rest of the cluster had more than enough resources to support it. It's my understanding that the restart priority and HA's placement algorithms only kick in when there are not enough resources available to satisfy __reserved__ resources. That is, if you are not using reservations, HA will boot up everything even if that causes paging (I haven't verified that it will knowingly cause paging, but this is my understanding), and it will ignore the restart priority in the HA settings. The reason is that, without reservations, the software has no way to understand your intentions. As far as I know, there is no way to directly set the number of VMs that can boot onto a host in a failure scenario; the only lever is to use reservations, so that the admission process has to take place. Depping has the most in-depth guide to HA that I have seen, on his site, but in the end remember HA's primary goal.
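As a rough illustration of the point about reservations, the sketch below counts how many VMs a single host would admit when the only check is "does the reservation fit in what is still unreserved". It is a toy model of the understanding described in this reply, not VMware's admission logic; the function name, the memory-only check, and the MB figures are all assumptions.

```python
from typing import List


def admitted_count(reservations_mb: List[int], host_unreserved_mb: int) -> int:
    """Count VMs whose memory reservation fits in the host's remaining unreserved memory."""
    remaining = host_unreserved_mb
    admitted = 0
    for reservation in reservations_mb:
        if reservation <= remaining:
            remaining -= reservation
            admitted += 1
    return admitted


# 20 VMs with no reservation: nothing ever blocks a restart in this model.
no_reservations = admitted_count([0] * 20, host_unreserved_mb=4096)

# 20 VMs reserving 1024 MB each: the reservations cap how many the host takes.
with_reservations = admitted_count([1024] * 20, host_unreserved_mb=4096)

print(no_reservations, with_reservations)
```

In this model the per-host VM count is never set directly. It only falls out of the reservations, which matches the description above.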








