madman2501
Contributor

HA strange intermittent error

Hi all

I have a problem with the HA agent on one of our ESX hosts. That one host repeatedly reports an HA error, as follows:

error 03/12/2007 10:39:52 HA agent on esx06.icec.local in cluster Production in York has an error

error 03/12/2007 10:38:52 HA agent on esx06.icec.local in cluster Production in York has an error

This continues throughout the day, but nothing else appears to go wrong apart from the errors above.
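
To see whether the errors recur at a fixed interval, the event timestamps can be compared. This is only a minimal Python sketch. It assumes the event lines have been saved as text in the format shown above and that the dates are day/month/year. The function name `error_intervals` is made up for illustration.

```python
import re
from datetime import datetime

# Matches the timestamp in lines such as:
# "error 03/12/2007 10:39:52 HA agent on esx06.icec.local ... has an error"
STAMP = re.compile(r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})")


def error_intervals(lines, date_format="%d/%m/%Y %H:%M:%S"):
    """Return the gaps in seconds between consecutive HA error events.

    date_format assumes day/month/year (an assumption); change it if the
    client displays month/day/year.
    """
    stamps = []
    for line in lines:
        match = STAMP.search(line)
        if match and "HA agent" in line:
            stamps.append(datetime.strptime(match.group(1), date_format))
    stamps.sort()  # the event list is shown newest first
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


events = [
    "error 03/12/2007 10:39:52 HA agent on esx06.icec.local in cluster Production in York has an error",
    "error 03/12/2007 10:38:52 HA agent on esx06.icec.local in cluster Production in York has an error",
]
print(error_intervals(events))
```

For the two events quoted above this reports a single 60-second gap. A longer export of the event list would show whether the interval stays constant.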

A little history about the environment:

We have 6 ESX hosts running ESX Server 3.0.2, build 61618, in a cluster. HA is enabled and set to fail over if a maximum of 2 hosts fail. IP connectivity and DNS are both working.

Info:

I checked the host in question for the HA agent log files, but the /opt/LGTOaam512/log directory is not present. Could this be an indication of the problem?

I have checked all the other hosts and the directory is present on each of them. Can anyone tell me how I should proceed, as Linux is not really an area I am comfortable with yet?

Thanks

5 Replies
emmar
Hot Shot

Does this error appear on a regular basis, e.g. every 3 minutes, then resolve itself and then recur?

If so, the fix I have found to work is this: while the ESX host is not showing the error (i.e. it does not have the red alert icon), right-click it and try "Reconfigure for HA".

Let me know.

e

FERC_ESX1
Enthusiast

HA is very sensitive to DNS issues. Make sure that everything in the cluster is properly registered in DNS.
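
One way to check registration is to confirm that each host name resolves to an address and that the address resolves back to the same name. This is only a rough Python sketch using the standard `socket` module. The function name `check_dns` and the injectable `forward`/`reverse` parameters are assumptions added for illustration, and the check should be run from a machine that uses the same DNS servers as the cluster.

```python
import socket


def check_dns(fqdns, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check forward and reverse lookups for each fully qualified name.

    Returns a dict mapping each name to a list of problems (empty if none).
    forward/reverse can be swapped out, e.g. for testing without a network.
    """
    problems = {}
    for name in fqdns:
        issues = []
        try:
            ip = forward(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            issues.append(f"forward lookup failed: {exc}")
            problems[name] = issues
            continue
        try:
            reverse_name = reverse(ip)[0]  # (hostname, aliases, addresses)
        except OSError as exc:  # socket.herror is a subclass of OSError
            issues.append(f"reverse lookup of {ip} failed: {exc}")
        else:
            if reverse_name.lower().rstrip(".") != name.lower().rstrip("."):
                issues.append(f"{ip} resolves back to {reverse_name}, not {name}")
        problems[name] = issues
    return problems


# Example call (needs working name resolution):
# print(check_dns(["esx06.icec.local"]))
```

An empty list for every host means the forward and reverse records agree. Any other result points at a record worth fixing.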

Also, you may want to make sure your /etc/hosts file contains every ESX host with both its long and short name.

My hosts file is set up as follows:

1.2.3.4 host1.domain.tld host1

1.2.3.5 host2.domain.tld host2

etc
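
If there are many hosts, the entries can be generated rather than typed. This is a small Python sketch. It assumes you already have a list of IP address and FQDN pairs; the function name `hosts_lines` is made up, and the addresses and names are the placeholders from the example above.

```python
def hosts_lines(hosts):
    """Build /etc/hosts lines of the form 'IP FQDN shortname'.

    hosts is a list of (ip, fqdn) pairs; the short name is taken to be
    the first label of the FQDN.
    """
    lines = []
    for ip, fqdn in hosts:
        short = fqdn.split(".", 1)[0]
        lines.append(f"{ip} {fqdn} {short}")
    return lines


# Placeholder addresses and names, not real hosts.
cluster = [("1.2.3.4", "host1.domain.tld"), ("1.2.3.5", "host2.domain.tld")]
print("\n".join(hosts_lines(cluster)))
```

This prints the same two lines as my example, ready to paste below the existing localhost entry.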

madman2501
Contributor

Sorry for the late reply.

I have tried that and the error still occurs every so often. With regard to DNS, each ESX host currently has only itself in its hosts file. Are you suggesting that I need to add all the ESX hosts to the hosts file on every ESX host?

Thanks

jonathanp
Expert

Yes, you may want to try adding an entry for every host to the hosts file on each host.

90% of the time this is related to a DNS or time issue.

Check the time on each host as well, to make sure the clocks are set correctly and agree with one another.

Jon

FERC_ESX1
Enthusiast

Yes. Add both the FQDN and the short name. To minimize typing (and typos), you can copy the same hosts file to all your ESX servers.
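
Before copying the file around, it may be worth checking that it really lists both names for every host. This is a minimal Python sketch. It assumes the file contents have been read into a string; the function name `missing_names` and the sample text are hypothetical.

```python
def missing_names(hosts_text, fqdns):
    """Return the FQDNs and short names from fqdns that hosts_text lacks."""
    present = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        fields = line.split()
        present.update(name.lower() for name in fields[1:])  # skip the IP
    missing = []
    for fqdn in fqdns:
        short = fqdn.split(".", 1)[0]
        for name in (fqdn, short):
            if name.lower() not in present:
                missing.append(name)
    return missing


# Sample contents with placeholder names; host2 lacks its short name.
sample = """127.0.0.1 localhost
1.2.3.4 host1.domain.tld host1
1.2.3.5 host2.domain.tld
"""
print(missing_names(sample, ["host1.domain.tld", "host2.domain.tld"]))
```

An empty result means every host appears with both its FQDN and its short name.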
