VMware Cloud Community
AWahlert
Contributor

Host Errors / HA agent Errors

Hi List,

For a few weeks I have had a strange HA problem in a two-node ESX U1 / VI 2.5 cluster.

Once an hour, always at the same time, I see errors in the VI Client regarding the HA functionality:

"HA Agent on server1 in cluster cluster1 has an error"

"Insufficient resources to satisfy HA failover level in cluster1"

After approx. 30 seconds:

"Sufficient resources are available to satisfy HA failover level in cluster1"

I have double-checked all my DNS settings (the DNS servers are valid, nslookup works, and /etc/hosts contains the ESX servers and the VI Server).
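
The checks described above can be repeated in one pass with a small script. This is only a minimal sketch, assuming Python 3 on a machine that uses the same resolver configuration as the hosts; `check_resolution` is a hypothetical helper name, and the node names are the ones that appear in the log.

```python
import socket

def check_resolution(names, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return {name: (ip, reverse_name)}; a failed lookup is recorded as None."""
    results = {}
    for name in names:
        try:
            ip = resolve(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[name] = (None, None)
            continue
        try:
            reverse_name = reverse(ip)[0]
        except OSError:
            reverse_name = None
        results[name] = (ip, reverse_name)
    return results

# Example call (performs real lookups):
# print(check_resolution(["esx03", "esx04"]))
```

A name that resolves forward but comes back with a different or missing reverse name would be worth a closer look, since short names and fully qualified names can resolve differently.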

I found the aam_config_util_listnodes.log file. This is its content:

KEY: -z VAL: 1
KEY: domain VAL: vmware
KEY: cmd VAL: listnodes
CMD: Mon Sep 8 12:01:09 2008 hostname -s
RESULT:
-
esx04
main::verify_network_configuration:69: cmd status was 0
CMD: Mon Sep 8 12:01:09 2008 /opt/vmware/aam/bin/ft_gethostbyname esx04 |grep FAILED
RESULT:
-
main::verify_network_configuration:69: cmd status was 1
myexit: copying /etc/opt/vmware/aam/vmware-sites to /var/log/vmware/aam/aam_config_util_listnodes.log
FULLTIME_SITES_TID 00000006
+ 1:8042,8042,8043 esx04 vmware #FT_Agent_Port=8045
+ 2:8042,8042,8043 esx03 vmware
VMwareresult=failure
Total time for script to complete: 0 minute(s) and 10 second(s)
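
One detail in this log is easy to misread: grep exits with status 1 when no line matched, so status 1 for the `... |grep FAILED` command means the lookup output did not contain FAILED. The log does not show why the script still ends with VMwareresult=failure. To compare several hourly runs, the command/status pairs can be pulled out with a short script. This is a rough sketch, assuming the log keeps the "CMD:" and "cmd status was N" line format shown above; `command_statuses` is a hypothetical helper name.

```python
import re

# "CMD: Mon Sep 8 12:01:09 2008 <command>" as it appears in the log above
CMD_RE = re.compile(r"^CMD: \w{3} \w{3} +\d+ [\d:]{8} \d{4} (.*)$")
STATUS_RE = re.compile(r"cmd status was (\d+)$")

def command_statuses(log_text):
    """Pair each 'CMD:' line with the next 'cmd status was N' line."""
    pairs, current = [], None
    for line in log_text.splitlines():
        line = line.strip()
        m = CMD_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = STATUS_RE.search(line)
        if m and current is not None:
            pairs.append((current, int(m.group(1))))
            current = None
    return pairs
```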

ftcli shows the correct output:

/opt/vmware/aam/bin/ftcli -domain vmware -connect esx04 -port 8042 -timeout 60 -cmd listnodes

Node    Type      State
-----   -------   -------------
esx03   Primary   Agent Running
esx04   Primary   Agent Running
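
Since the errors only last about 30 seconds, it may help to capture this listnodes output repeatedly around the full hour and check it automatically. The following is only a sketch, assuming the three-column Node/Type/State layout shown above; `parse_listnodes` and `not_running` are hypothetical helper names, and the only state value taken from this thread is "Agent Running".

```python
def parse_listnodes(output):
    """Parse the Node/Type/State table printed by the listnodes command."""
    nodes = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Skip blank lines, the header row and dash separator lines.
        if len(parts) < 3 or parts[0] == "Node" or set(parts[0]) == {"-"}:
            continue
        nodes.append({"node": parts[0], "type": parts[1], "state": parts[2].strip()})
    return nodes

def not_running(nodes):
    """Return the names of nodes whose state is not 'Agent Running'."""
    return [n["node"] for n in nodes if n["state"] != "Agent Running"]
```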

I have been working on this for a few days and I don't have a clue how to solve it.

Can anyone help me with this, please?

Kind regards

Andreas

2 Replies
weinstein5
Immortal

I do not think this is a DNS issue. When this occurs, are you encountering any high resource spikes from your VMs or hosts?

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
AWahlert
Contributor

Hi Weinstein,

No, there isn't a resource spike.

This occurs every hour, shortly after the full hour (e.g. 10:01:00, 11:02:00 and so on).

After googling this issue I only found solutions for DNS problems. I guess it's a network problem. It seems like the vpxa agent on the ESX servers is not reachable. But only on every full hour?
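
To confirm the hourly pattern from the event timestamps, the offset of each error from the full hour can be computed. This is a minimal sketch, assuming timestamps copied by hand in HH:MM:SS form; `seconds_past_hour` and `looks_hourly` are hypothetical helper names, and the 300-second window is an arbitrary choice.

```python
from datetime import datetime

def seconds_past_hour(timestamps, fmt="%H:%M:%S"):
    """Return how many seconds after the full hour each timestamp falls."""
    offsets = []
    for ts in timestamps:
        t = datetime.strptime(ts, fmt)
        offsets.append(t.minute * 60 + t.second)
    return offsets

def looks_hourly(timestamps, window=300):
    """True if every event falls within `window` seconds after a full hour."""
    offsets = seconds_past_hour(timestamps)
    return bool(offsets) and all(o <= window for o in offsets)
```

If the offsets stay within a narrow band run after run, that would point toward something scheduled rather than a load spike, though this alone does not identify the cause.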
