VMware Cloud Community
tstumpf
Contributor

Lots of lost VM heartbeat SNMP alerts

I am new to VMware and to the VMware Communities, so please excuse me if this is the wrong place for this question, and let me know if there is a better place to post questions like this.

I recently set up a VMware test environment running ESXi 3.5 on two Dell R805 servers, connected to an EMC AX4-5 iSCSI SAN for storage and hosting about 15 VMs. The servers are set up in a DRS/HA cluster. The environment seems to be working well: performance is better than expected, and VMotion, DRS, and HA are all working. I have also configured it to send SNMP traps on errors.

The problem is that I frequently get messages saying that a particular VM lost its heartbeat, followed about 20 seconds later by a message for the same VM saying the heartbeat was regained. Here is an example:

Device:vmsrv-02.testlab.com 10.30.16.11, Service Tag:, Asset Tag:, Date:02/24/09, Time:08:02:33:000, Severity:Warning, Message:A VM detected a loss in guest heartbeat. 1, /vmfs/volumes/48ffab7b-0ec362b4-e8e8-0019b9dca540/cittest-01/cittest-01.vmx, cittest-01,

Device:vmsrv-02.testlab.com 10.30.16.11, Service Tag:, Asset Tag:, Date:02/24/09, Time:08:02:53:000, Severity:Warning, Message:A VM detected or regained the guest heartbeat. 1, /vmfs/volumes/48ffab7b-0ec362b4-e8e8-0019b9dca540/cittest-01/cittest-01.vmx, cittest-01,
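To check how long each outage lasts across many alerts, the lost/regained messages can be paired up with a short script. The following is only a minimal Python sketch. It assumes the comma-separated field layout of the two messages above, a MM/DD/YY date, and a time field of the form HH:MM:SS:mmm. The function names `parse_alert` and `outage_seconds` are hypothetical and not part of any VMware tool.

```python
from datetime import datetime

# The two example alerts from this post, copied verbatim.
LOST = ("Device:vmsrv-02.testlab.com 10.30.16.11, Service Tag:, Asset Tag:, "
        "Date:02/24/09, Time:08:02:33:000, Severity:Warning, "
        "Message:A VM detected a loss in guest heartbeat. 1, "
        "/vmfs/volumes/48ffab7b-0ec362b4-e8e8-0019b9dca540/cittest-01/cittest-01.vmx, "
        "cittest-01,")
REGAINED = ("Device:vmsrv-02.testlab.com 10.30.16.11, Service Tag:, Asset Tag:, "
            "Date:02/24/09, Time:08:02:53:000, Severity:Warning, "
            "Message:A VM detected or regained the guest heartbeat. 1, "
            "/vmfs/volumes/48ffab7b-0ec362b4-e8e8-0019b9dca540/cittest-01/cittest-01.vmx, "
            "cittest-01,")


def parse_alert(line):
    """Parse one alert line into a dict with 'time', 'vm', and 'lost' keys."""
    # Drop the trailing comma, then split on the ", " field separator.
    fields = [f.strip() for f in line.strip().rstrip(",").split(", ")]
    # Collect the "Key:value" fields; the .vmx path and VM name have no colon.
    keyed = dict(f.split(":", 1) for f in fields if ":" in f)
    # Assumption: Time is HH:MM:SS:mmm, so keep only the first 8 characters.
    stamp = datetime.strptime(
        keyed["Date"] + " " + keyed["Time"][:8], "%m/%d/%y %H:%M:%S"
    )
    return {
        "time": stamp,
        "vm": fields[-1],  # assumption: the VM name is the last field
        "lost": "loss in guest heartbeat" in keyed["Message"],
    }


def outage_seconds(lost_line, regained_line):
    """Return the seconds between a 'lost' alert and the matching 'regained' alert."""
    lost, regained = parse_alert(lost_line), parse_alert(regained_line)
    if lost["vm"] != regained["vm"] or not lost["lost"] or regained["lost"]:
        raise ValueError("alerts are not a lost/regained pair for the same VM")
    return (regained["time"] - lost["time"]).total_seconds()


print(outage_seconds(LOST, REGAINED))  # 20.0
```

Running this over a full day of alerts would show whether every outage is the same length, which may help tell a fixed timeout apart from a genuinely unresponsive guest.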

I have browsed the communities and have not found a solution; however, since I am new, I may be looking in the wrong places.

I verified the version of VMware Tools on all of the VMs and, in some cases, reinstalled VMware Tools, but that has not eliminated the problem. I also opened a support case to have someone check it out. Support had me upload the diagnostic logs, but eventually said that everything else looks good and that, since these are only warning messages, they can be ignored.

I would like a second and third opinion on whether ignoring these SNMP traps is the best solution, or whether there is something else I can try to eliminate these warning traps. Is this an uncommon problem? Does anyone have advice on how to proceed?

2 Replies
Ken_Cline
Champion

Moved to the VMware ESXi forum.

Ken Cline

Technical Director, Virtualization

Wells Landers

TVAR Solutions, A Wells Landers Group Company

VMware Communities User Moderator

Ken Cline VMware vExpert 2009 VMware Communities User Moderator Blogging at: http://KensVirtualReality.wordpress.com/
josephus
Contributor

I'm experiencing the same thing on the latest patched U4.

FreeBSD guest with open-vm-tools: heartbeat lost for exactly 20 seconds.

Linux guest with the factory VMware Tools: heartbeat lost for exactly 20 seconds.

DISMAN-EVENT-MIB::sysUpTimeInstance  13:8:16:17.77
SNMPv2-MIB::snmpTrapOID.0  VMWARE-TRAPS-MIB::vmHBLost 
DISMAN-EVENT-MIB::sysUpTimeInstance  13:8:16:37.77
SNMPv2-MIB::snmpTrapOID.0  VMWARE-TRAPS-MIB::vmHBDetected
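
The gap between the two traps can be confirmed from the sysUpTimeInstance values alone. This is a minimal Python sketch. It assumes the values are printed as days:hours:minutes:seconds.hundredths, and the helper name `uptime_to_ticks` is hypothetical.

```python
def uptime_to_ticks(text):
    """Convert 'D:H:M:S.hh' uptime text to hundredths of a second."""
    days, hours, minutes, seconds = text.strip().split(":")
    whole, _, frac = seconds.partition(".")
    # Pad or trim the fraction to exactly two digits (hundredths).
    hundredths = int(frac.ljust(2, "0")[:2])
    total_seconds = (
        (int(days) * 24 + int(hours)) * 60 + int(minutes)
    ) * 60 + int(whole)
    # Integer arithmetic avoids floating-point rounding in the subtraction.
    return total_seconds * 100 + hundredths


lost = uptime_to_ticks("13:8:16:17.77")      # vmHBLost trap
detected = uptime_to_ticks("13:8:16:37.77")  # vmHBDetected trap
gap_seconds = (detected - lost) / 100
print(gap_seconds)  # 20.0
```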

Has this issue already been addressed?
