4 Replies Latest reply on Jul 26, 2011 8:52 AM by peter79

    ESXi host enters isolation mode after management NIC is reconnected

    peter79 Enthusiast

      Guys,

       

      I've discovered a very strange problem.

       

      I have 2 ESXi hosts (both running 4.0) in a DRS/HA cluster and am running vCenter 4.1.  The management network has 2 NICs.  We were testing the redundancy of the management network, so we unplugged one of the NICs.  The management network stayed up.  However, when we reconnected the NIC, the host went into isolation mode and migrated all the VMs to the remaining host in the cluster.

       

      We have a second cluster connected to a different switch (its hosts run ESXi 4.1), so we tried to recreate the situation there.  The same thing happened.  We tried setting the management NICs to 1000/Full as well as to auto-negotiate, but the result was the same.  I've attached a log file from one of the boxes in case that helps.

       

      I have no idea what's going on.  Any ideas?

        • 1. Re: ESXi host enters isolation mode after management NIC is reconnected
          MauroBonder Champion
          User Moderators, VMware Employees

          Change the HA isolation response to "Leave powered on".

           

          If you use NIC teaming, check the vSwitch configuration:

           

          Recommended:

          • Network failover detection = Link Status Only
          • Notify switches = Yes
          • Failback = No
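As a rough way to compare a vSwitch's teaming settings against the three recommended values above, something like the following could be used. This is only a sketch: the dictionary keys and the `teaming_deviations` helper are hypothetical names made up for illustration, not vSphere API properties, and the settings would have to be read off the vSwitch by hand.

```python
# Recommended values as listed in the reply above. The keys are hypothetical
# names chosen for this sketch; they are not vSphere API property names.
RECOMMENDED = {
    "network_failover_detection": "Link Status Only",
    "notify_switches": True,
    "failback": False,
}


def teaming_deviations(policy):
    """Return {setting: (actual, recommended)} for every setting that differs."""
    return {
        key: (policy.get(key), wanted)
        for key, wanted in RECOMMENDED.items()
        if policy.get(key) != wanted
    }


if __name__ == "__main__":
    # Example: a team that still has Failback enabled.
    current = {
        "network_failover_detection": "Link Status Only",
        "notify_switches": True,
        "failback": True,
    }
    print(teaming_deviations(current))
```

An empty result means the settings match the recommendation; any entry shows the current value next to the recommended one.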

           

           


          Please don't forget to award points for "helpful" and/or "correct" answers.

           

          Mauro Bonder - Moderator

          • 2. Re: ESXi host enters isolation mode after management NIC is reconnected
            peter79 Enthusiast
            We have done the following tests:
            Test 1
            We configured vNIC 4 and 5 as active/active (in that order) and disconnected vNIC 4.  When we reconnected it, isolation mode was triggered and the test VM on the host was vMotioned away.  HA was enabled.
            Test 2
            We configured vNIC 5 and 4 as active/active (in that order) and disconnected vNIC 5.  When we reconnected it, isolation mode was not triggered and the test VM remained on the host.  HA was enabled.
            Test 3
            We removed vNIC 4 from the vSwitch and replaced it with vNIC 6.  We configured vNIC 5 and 6 as active/active (in that order) and disconnected vNIC 5.  When we reconnected it, isolation mode was triggered and the test VM on the host was vMotioned away.  HA was enabled.
            Test 4
            We removed vNIC 4 from the vSwitch and replaced it with vNIC 6.  We configured vNIC 6 and 5 as active/active (in that order) and disconnected vNIC 6.  When we reconnected it, isolation mode was not triggered and the test VM remained on the host.  HA was enabled.
            Test 5
            We repeated test 1 with HA disabled.  Isolation mode was not triggered and the test VM remained on the host.
            Conclusion
            It appears that HA has an issue when the vNIC with the lowest ordering is reconnected.  Has anyone seen this type of behavior before?
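The five tests above can be tabulated and checked against two possible readings of "lowest ordering". The sketch below is only illustrative: the function names are made up, and both readings are assumptions about the pattern in the results, not confirmed HA behavior. The test data is copied from the descriptions above.

```python
# Each entry: (test number, NICs in configured order, NIC that was disconnected
# and reconnected, HA enabled, isolation observed). Taken from tests 1-5 above.
TESTS = [
    (1, (4, 5), 4, True, True),
    (2, (5, 4), 5, True, False),
    (3, (5, 6), 5, True, True),
    (4, (6, 5), 6, True, False),
    (5, (4, 5), 4, False, False),
]


def predicts_lowest_numbered(nic_order, reconnected, ha_enabled):
    """Reading A (assumption): the reconnected NIC has the lowest number in the team."""
    return ha_enabled and reconnected == min(nic_order)


def predicts_first_listed(nic_order, reconnected, ha_enabled):
    """Reading B (assumption): the reconnected NIC is listed first in the team order."""
    return ha_enabled and reconnected == nic_order[0]


def mismatches(predict, tests=TESTS):
    """Return the test numbers whose observed result disagrees with the prediction."""
    return [
        number
        for number, order, nic, ha, observed in tests
        if predict(order, nic, ha) != observed
    ]


if __name__ == "__main__":
    print("lowest-numbered NIC:", mismatches(predicts_lowest_numbered))
    print("first-listed NIC:   ", mismatches(predicts_first_listed))
```

Only the "lowest-numbered NIC" reading agrees with all five results; the "first-listed NIC" reading disagrees with tests 2 and 4, where the first-listed NIC was reconnected without triggering isolation.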
            • 3. Re: ESXi host enters isolation mode after management NIC is reconnected
              MauroBonder Champion
              User Moderators, VMware Employees

              Is there any difference in the VLAN configuration for vmnic4 on your core switch?

               



               

              Mauro Bonder - Moderator

              • 4. Re: ESXi host enters isolation mode after management NIC is reconnected
                peter79 Enthusiast

                The configuration of the physical switch ports is identical.  Also, the problem isn't vNIC 4.  As you can see from the tests, when I used vNICs 5 and 6 I had the same problem with vNIC 5.  It seems as if whichever vNIC has the lowest order causes the problem!