7 Replies Latest reply on Feb 1, 2010 5:59 AM by NilC

    VM heartbeat problem

    reinerw Novice

      I created a VM heartbeat alarm on a cluster with ESX 3.5U4 Servers and everything worked as expected.

       

      After an upgrade to ESXi 4.0 U1, the alarm now bounces constantly between green and yellow. This happens on all VMs in the cluster (Linux and Windows), with both old and new VMware Tools.

      When I move the VM back to a 3.5 host, the bouncing stops.

       

      The vCenter version is 4.0U1

       

      Since the status is also logged in hostd.log, it looks like a problem between the host and the VM, but where?

       

      Here is an example of a hostd.log from an ESX server:

       

      2010-01-20 13:47:04.320 27A36B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow

      2010-01-20 13:47:44.322 27AF9B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: green

      2010-01-20 13:48:04.322 27B7BB90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow

      2010-01-20 13:48:44.325 27AB8B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: green

      2010-01-20 13:49:04.327 27AF9B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow

      2010-01-20 13:49:44.327 27B7BB90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: green

      2010-01-20 13:50:04.329 27B7BB90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow

      2010-01-20 13:50:44.331 27A36B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: green

      2010-01-20 13:51:04.334 10799B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow

      2010-01-20 13:51:44.336 27BBCB90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: green

      2010-01-20 13:52:04.338 27A77B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow
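      The timing pattern in the excerpt above can be made explicit with a small script. This is only a sketch: it assumes the line layout shown in the excerpt (timestamp, thread ID, `verbose`, quoted VM path, status), and the function names `parse_heartbeat_lines` and `transition_intervals` are made up for illustration. It does not use any VMware API, and the sample lines are copied from the log above with the elided path left as-is.

```python
import re
from datetime import datetime

# Pattern for the hostd.log lines quoted in this thread (assumed layout).
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ verbose "
    r"'vm:(?P<vmx>[^']+)' Updating current heartbeatStatus: (?P<status>\w+)"
)


def parse_heartbeat_lines(lines):
    """Return (timestamp, vmx_path, status) tuples for matching lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, m.group("vmx"), m.group("status")))
    return events


def transition_intervals(events):
    """Return (old_status, new_status, seconds) for each status change.

    Assumes all events belong to the same VM and are in log order.
    """
    changes = []
    for (t0, _, s0), (t1, _, s1) in zip(events, events[1:]):
        if s0 != s1:
            changes.append((s0, s1, round((t1 - t0).total_seconds(), 3)))
    return changes


# First three lines of the excerpt above.
SAMPLE = [
    "2010-01-20 13:47:04.320 27A36B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow",
    "2010-01-20 13:47:44.322 27AF9B90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: green",
    "2010-01-20 13:48:04.322 27B7BB90 verbose 'vm:/vmfs/volumes/.../test/test.vmx' Updating current heartbeatStatus: yellow",
]

events = parse_heartbeat_lines(SAMPLE)
intervals = transition_intervals(events)
for old, new, seconds in intervals:
    print(f"{old} -> {new} after {seconds} s")
```

      Run against the full excerpt, this shows the status staying yellow for about 40 seconds and green for about 20 seconds in each one-minute cycle, which is the regular bouncing described in the post.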

        • 1. Re: VM heartbeat problem
          Jasemccarty Champion
          VMware Employees, vExpert

           

          I have seen this issue between 4.0 U1 and both 4.0 and 3.5 (all ESX, not ESXi).

           

           

          I opened an SR with VMware about the issue.

           

           

          I have almost 400 guests (which send a heartbeat back to vCenter every minute).  VMware suggested disabling the heartbeat alarm, given that we are not using guest monitoring in HA.  Not sure why that was the recommended fix.  They are still researching the issue.  For now our alarm is disabled.

           

           

           

           

           

          Jase McCarty

          http://www.jasemccarty.com

          Co-Author: VMware ESX Essentials in the Virtual Data Center (ISBN:1420070274) Auerbach

          Co-Author: VMware vSphere 4 Administration Instant Reference (ISBN:0470520728) Sybex

          Please consider awarding points if this post was helpful or correct

           

           

          • 2. Re: VM heartbeat problem
            Jasemccarty Champion
            vExpert, VMware Employees

             

            I got a confirmation from VMware that upgrading from vCenter 2.5 to 4.0 is where the problem lies.

             

             

            The issue will be resolved in a future update.

             

             


             

             

            • 3. Re: VM heartbeat problem
              reinerw Novice

              Thanks, good to know that I'm not alone with this problem.

              But I did not upgrade to vCenter 4.0. I did a new installation and moved the hosts from a 2.5 vCenter to a 4.0 vCenter.

              For now I have disabled the alarm for the green-to-yellow transition and am waiting for a fix.

              • 4. Re: VM heartbeat problem
                AnatolyVilchinsky Expert

                Can you give us a rough idea of the network setup - specifically what vSwitches and Port Groups have you got for Service Console\Kernel and VM's on the ESX 3.5 box(es) and what Management Port \ VM Port groups have you got on the ESXi box(es)?

                 

                from http://serverfault.com/questions/104687/vm-heartbeat-problem

                 

                 

                 

                 

                Starwind Software Developer

                www.starwindsoftware.com

                • 5. Re: VM heartbeat problem
                  krowczynski Master

                  Hi,

                  I have the same problem while monitoring our vSphere farm with nworks Monitor from Veeam.

                  So let's hope they fix it in an upcoming update.

                   

                   






                  MCP, VCP

                  • 6. Re: VM heartbeat problem
                    reinerw Novice

                    The network setup is the same for the ESXi 3.5 and 4.0 hosts:

                     

                    http://communities.vmware.com/servlet/JiveServlet/downloadImage/8278/network.jpg

                    • 7. Re: VM heartbeat problem
                      NilC Hot Shot

                       

                      We recently upgraded our test cluster to vCenter vSphere 4U1 and ESXi4.0U1 and see the same thing. Random virtual machines flash up the "VM change state" alarm. Looking at the SNMP traps sent out by the hosts, we see a lot of trap numbers 3 and 4, which according to the MIBs are vmwVmHBLost and vmwVmHBDetected respectively. I used to see these traps from ESXi3.5 as well, but of course there was no alarm defined in vCenter for VM Change State then, so I'm not sure whether this has always been going on and we only notice it now because there are a lot more alarms in vSphere vCenter.

                       

                       

                      But it does seem there are a lot more of these SNMP traps being sent out from ESXi4.0U1 than from ESXi3.5.
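                      To put numbers on that impression, the traps could be tallied per host. The following is a rough sketch only: the trap numbers and names are the ones mentioned in the post above, while the input format (a list of `(host, trap_number)` pairs), the host names, and the function name `tally_heartbeat_traps` are assumptions made up for illustration. How the pairs are extracted depends on the trap receiver in use.

```python
from collections import Counter

# Trap numbers and names as reported in the post above (from the MIBs).
TRAP_NAMES = {3: "vmwVmHBLost", 4: "vmwVmHBDetected"}


def tally_heartbeat_traps(records):
    """Count heartbeat traps per (host, trap name).

    `records` is an iterable of (host, trap_number) pairs; traps other
    than the two heartbeat traps are ignored.
    """
    counts = Counter()
    for host, trap_number in records:
        name = TRAP_NAMES.get(trap_number)
        if name is not None:
            counts[(host, name)] += 1
    return counts


# Hypothetical records, not real data from any host.
example_records = [
    ("host-esxi40u1", 3),
    ("host-esxi40u1", 4),
    ("host-esxi40u1", 3),
    ("host-esxi35", 3),
    ("host-esxi35", 7),  # unrelated trap number, ignored
]

counts = tally_heartbeat_traps(example_records)
for (host, name), n in sorted(counts.items()):
    print(f"{host}: {name} x{n}")
```

                      Comparing the counts for ESXi3.5 and ESXi4.0U1 hosts over the same time window would show whether the newer hosts really send more of these traps.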