Smoggy 139 posts since
Nov 9, 2005
1.
Re: OS heartbeat timeouts during test and actual recovery Dec 22, 2008 1:30 AM
I guess the first thing to say is that these timeouts are basically just warnings, not errors, so in nearly all cases they can be ignored. But let's face it, they don't look pretty, especially as red errors stand out.
I suspect you have just updated to ESX/VC U3? If so, I think a change was made that adjusts the frequency we use to check for vmtools heartbeats. With the new value, SRM recovery plans miss the first "check" and have to wait for a second chance, by which time the timeout warning has already been logged.
If the errors really are annoying then you can, if you wish (insert disclaimer here stating that this is just my suggestion and not a VMware suggestion), simply edit your ESX hostd config.xml (/etc/vmware/hostd/config.xml) on your recovery site ESX hosts so that the vmsvc section looks like this:
<vmsvc>
<enabled>true</enabled>
<heartbeatDelayInSecs>40</heartbeatDelayInSecs>
</vmsvc>
The value of 40 is just my choice; you can use whatever you want. I think the previous default was 20 seconds, but that has now been changed by the ESX U3 code. I've got a feeling the ESX folks will be issuing a KB article on this. Once that change is made, just run the following command on the service console (or restart ESX, whichever is easier):
- service mgmt-vmware restart
Then run the recovery plan again; there is no need to restart VC / SRM. When I did this, the plan ran through to completion without hitting any VMTools timeout errors/warnings. Note: in your recovery plan you should be able to leave the tools timeout values at their defaults (or old defaults) of 30 seconds / 300 seconds.
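If you have several recovery site hosts to change, the config.xml edit described above could be scripted rather than done by hand. The following is only a rough sketch, not a VMware-supplied tool: the function name set_heartbeat_delay is made up for illustration, it assumes the file contains a single <vmsvc> element, and it assumes you run it with Python 3 against a copy of the file rather than directly on the host. Note that ElementTree drops XML comments when it rewrites the document, so keep a backup of the original.

```python
import xml.etree.ElementTree as ET


def set_heartbeat_delay(xml_text, delay_secs=40):
    """Return xml_text with <vmsvc><heartbeatDelayInSecs> set to delay_secs.

    Hypothetical helper for illustration only. Assumes one <vmsvc>
    element exists somewhere in the document.
    """
    root = ET.fromstring(xml_text)

    # The <vmsvc> section may be the root itself or nested below it.
    vmsvc = root if root.tag == "vmsvc" else root.find(".//vmsvc")
    if vmsvc is None:
        raise ValueError("no <vmsvc> section found")

    # Reuse the existing setting if present, otherwise create it.
    node = vmsvc.find("heartbeatDelayInSecs")
    if node is None:
        node = ET.SubElement(vmsvc, "heartbeatDelayInSecs")
    node.text = str(delay_secs)

    # Comments from the original file are not preserved here.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Small stand-in document; the <config> wrapper is an assumption.
    sample = "<config><vmsvc><enabled>true</enabled></vmsvc></config>"
    print(set_heartbeat_delay(sample, 40))
```

After writing the result back to /etc/vmware/hostd/config.xml on each host, the mgmt-vmware restart above is still needed for hostd to pick up the new value.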
best regards,
Lee Dilworth