10 Replies Latest reply on Jun 1, 2011 9:30 AM by rickardnobel

    Is Isolation Response always das.failuredetectiontime - 1?

    rickardnobel Virtuoso

      From Duncan Epping's "HA/DRS Technical Deepdive" I can see that (with default settings) the following will happen:


      on 13 sec: a host which hears from none of the partners will ping the isolation address

      on 14 sec: if there is no reply from the isolation address, it will trigger the isolation response

      on 15 sec: the host will be declared dead by the remaining hosts; this will be confirmed by pinging the missing host

      on 16 sec: restarts of the VMs will begin


      My first question is: Do all these timings come from das.failuredetectiontime? That is, if das.failuredetectiontime is set to e.g. 30000 (30 sec), will a potentially isolated host try to ping the isolation address on the 28th second and perform the Isolation Response action on the 29th second?


      Or are the Isolation Response timings hardcoded, so that it always happens at 13 sec?


      My second question: if the answer to the above is yes, why is the recommendation to increase das.failuredetectiontime to 20000 when having multiple isolation addresses? If the above is correct, then this would make the potentially isolated host test its isolation addresses on the 18th second, and the restart of the VMs would begin on the 21st second, but what would really be gained from this?
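

      To make the arithmetic behind both questions concrete, here is a minimal sketch of the timeline under the assumption being asked about, namely that every step is a fixed offset from das.failuredetectiontime. That assumption is unconfirmed, and the function name ha_timeline and the dictionary keys are only illustrative, not anything from VMware HA itself.

```python
# Hypothetical model of the HA timeline. It ASSUMES each step is a fixed
# offset from das.failuredetectiontime, which is exactly what the question
# asks and is not confirmed behaviour.

DEFAULT_FAILURE_DETECTION_MS = 15000  # default das.failuredetectiontime


def ha_timeline(failure_detection_ms=DEFAULT_FAILURE_DETECTION_MS):
    """Return the second at which each step would occur under the assumption."""
    t = failure_detection_ms // 1000  # convert milliseconds to whole seconds
    return {
        "ping_isolation_address": t - 2,  # isolated host pings the isolation address
        "isolation_response": t - 1,      # isolation response is triggered
        "declared_dead": t,               # remaining hosts declare the host dead
        "vm_restarts_begin": t + 1,       # restarts of the VMs begin
    }


if __name__ == "__main__":
    for ms in (15000, 20000, 30000):
        print(ms, ha_timeline(ms))
```

      With 15000 this reproduces the 13/14/15/16 second sequence from the book. With 20000 and 30000 it gives the 18/21 and 28/29 second figures used in the questions above, but only if the offsets really do follow das.failuredetectiontime.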