3 Replies Latest reply on Apr 30, 2020 5:25 PM by santoshannie

    Issue with Windows Cluster communications dropping during Vmotion.

    BrianDougherty Lurker

      Good Morning,

       

      We recently received alerts from Microsoft SCOM that the nodes of a cluster were not able to communicate. As it turns out, the alerts correspond to the times when the VMs were being migrated from one host to another with vMotion. Communication was lost and then re-established. The communication being lost is our cluster heartbeat traffic. This is Windows Failover Clustering on Windows Server 2012 R2 guests. The hosts are ESXi 6.0 U1. VMware Tools is running and current on the guests. No cluster alerts are generated. I am wondering whether anyone has a similar configuration and, if so, has seen this type of behavior.

       

      I will also be reaching out to Microsoft about this. I wanted to approach it from all angles.

       

      Thank You

       

      Brian Dougherty

        • 1. Re: Issue with Windows Cluster communications dropping during Vmotion.
          Sureshkumar M Expert
          vExpert

          Please check the following guide for vMotion support. Some clustering configurations do not support vMotion, and migrating them can cause a failover.

           

          Microsoft Windows Server Failover Clustering on VMware vSphere 5.x: Guidelines for supported configurations (2147662) | …

          • 2. Re: Issue with Windows Cluster communications dropping during Vmotion.
            techguy129 Expert
            vExpert

            We had similar issues. The default heartbeat settings are too tight to ride out the period when the VM is stunned during the transfer to the new host. We therefore adjusted our clusters to the same settings that are recommended when the Hyper-V role is installed.

            Default Settings

             

            Windows Server 2012 and later (source: MSDN blog):

            Parameter             Fast Failover (Default)  Relaxed        Maximum
            SameSubnetDelay       1 second                 1 second       2 seconds
            SameSubnetThreshold   5 heartbeats             10 heartbeats  120 heartbeats
            CrossSubnetDelay      1 second                 1 second       4 seconds
            CrossSubnetThreshold  5 heartbeats             20 heartbeats  120 heartbeats

            The Fast Failover column lists the default values for the WSFC heartbeat. Whether the servers are on the same subnet or on different subnets, failover occurs after 5 missed heartbeats sent 1 second apart, for a total of 5 seconds.

            The Relaxed values are the recommended settings when the Hyper-V role is installed. If the servers are on the same subnet, failover occurs after 10 missed heartbeats sent 1 second apart, for a total of 10 seconds. If the servers are on different subnets, failover occurs after 20 missed heartbeats sent 1 second apart, for a total of 20 seconds.
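
            To make the arithmetic explicit, here is a rough sketch that multiplies delay by threshold for the Fast Failover and Relaxed values listed above. It needs no cluster; the `$settings` list and its property names are made up for illustration. The delays are written in milliseconds because that is the unit the cluster properties use.

```powershell
# Heartbeat tolerance = delay between heartbeats x number of missed heartbeats.
# The values below are copied from the settings listed in this post; the
# $settings structure itself is only an illustration, not a cluster API.
$settings = @(
    [pscustomobject]@{ Profile = 'Fast Failover'; Scope = 'SameSubnet';  DelayMs = 1000; Threshold = 5  }
    [pscustomobject]@{ Profile = 'Fast Failover'; Scope = 'CrossSubnet'; DelayMs = 1000; Threshold = 5  }
    [pscustomobject]@{ Profile = 'Relaxed';       Scope = 'SameSubnet';  DelayMs = 1000; Threshold = 10 }
    [pscustomobject]@{ Profile = 'Relaxed';       Scope = 'CrossSubnet'; DelayMs = 1000; Threshold = 20 }
)

foreach ($s in $settings) {
    # Convert the millisecond total to seconds for display.
    $toleranceSeconds = ($s.DelayMs * $s.Threshold) / 1000
    '{0,-14} {1,-12} {2,3} s' -f $s.Profile, $s.Scope, $toleranceSeconds
}
```

            If the stun time during a migration is longer than the number printed for your profile, the other nodes will consider the migrating node down.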

             

             

            These settings can be configured via PowerShell.

            Import the cmdlets:

            Import-Module FailoverClusters

            View the current settings:

            Get-Cluster | fl *subnet*

            Adjust the values:

            (Get-Cluster).SameSubnetThreshold = 10

            (Get-Cluster).CrossSubnetThreshold = 20
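
            Put together, the steps above might look like the following sketch. It assumes it is run in an elevated session on one of the cluster nodes, and it assumes the Relaxed thresholds are what you want; treat it as a starting point rather than a tested procedure. The last two lines simply multiply delay (stored in milliseconds) by threshold to show the resulting tolerance.

```powershell
Import-Module FailoverClusters

# Assumes the local node is a member of the cluster to be changed.
$cluster = Get-Cluster

# Show the current subnet-related values before changing anything.
$cluster | Format-List *Subnet*

# Apply the Relaxed thresholds (10 same-subnet, 20 cross-subnet).
$cluster.SameSubnetThreshold  = 10
$cluster.CrossSubnetThreshold = 20

# Re-read the cluster object and print the resulting tolerance in seconds.
$cluster = Get-Cluster
'Same subnet : {0} s' -f (($cluster.SameSubnetDelay  * $cluster.SameSubnetThreshold)  / 1000)
'Cross subnet: {0} s' -f (($cluster.CrossSubnetDelay * $cluster.CrossSubnetThreshold) / 1000)
```

            Noting the original values from the first listing makes it easy to revert if the change does not help.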

            • 3. Re: Issue with Windows Cluster communications dropping during Vmotion.
              santoshannie Novice

              Hi mate,

               

              Could you confirm whether changing the values below fixed your issue?

               

              Do we need to change to the given values?

              Adjust the values

              (get-cluster).SameSubnetThreshold = 10

              (get-cluster).CrossSubnetThreshold = 20

               

              In our environment, we have a multi-subnet failover cluster.

               

              It is an Always On availability cluster. After a vMotion, the cluster panics and starts a failover.

               

              Regards,
              Santosh