vCenter

  • 1.  HA Error on one out of three hosts

    Posted May 18, 2009 02:42 PM

    Hello,

    I set up a cluster with three servers and enabled HA. Two servers work fine; however, one of the servers (esx1) shows the error "HA agent on esx1 in cluster CMC in CMC has an error". I checked the logs, and it looks like esx1 determines that esx3 is the primary agent but is unable to connect to it, so it concludes that no agent is running on esx3.

    I made sure I can ping esx3 from esx1. I also made sure esx3 resolves by both its short name and its FQDN, and it does. I'm not sure why it can't connect. Any ideas? The log is attached.



  • 2.  RE: HA Error on one out of three hosts

    Posted May 18, 2009 02:47 PM

    Are you using resource pools other than your DRS pool? If not, try removing the host from vCenter and then adding it back in.

    Also check /etc/opt/vmware/aam/FT_HOSTS to see if all nodes are listed correctly.
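
As a rough sketch of that FT_HOSTS check, the function below prints every expected node name that is missing from a file. The function name `missing_hosts` is made up for illustration, and it assumes each node's name appears in the file as a whole word, since the exact layout of FT_HOSTS is not shown in this thread.

```shell
# Print each expected host name that does not appear in FILE.
# Usage: missing_hosts FILE HOST...
# Assumption: a node counts as "listed" if its name occurs as a whole word.
missing_hosts() {
    file=$1
    shift
    for host in "$@"; do
        # -q: quiet, -w: whole-word match, -F: treat the name as a fixed string
        if ! grep -qwF -- "$host" "$file"; then
            echo "$host"
        fi
    done
}
```

On the affected host this could be run as `missing_hosts /etc/opt/vmware/aam/FT_HOSTS esx1 esx2 esx3`. No output means every name was found.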



  • 3.  RE: HA Error on one out of three hosts

    Posted May 18, 2009 02:57 PM

    It's worth double-checking /etc/hosts and /etc/resolv.conf to make sure all nodes have the same settings.

    Also, it looks like you are having trouble with port 8042, so it might be worth a quick squint over the firewall settings on each host.
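
A quick way to test the port from one host to another, as a minimal sketch: it assumes bash is available (it uses bash's /dev/tcp pseudo-device), it only tests TCP, and the function names `port_open` and `check_port` are hypothetical. A port that is silently dropped by a firewall may make the check hang until the connection times out.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds, non-zero otherwise.
# The subshell keeps the test descriptor from leaking into the caller.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Print a one-line verdict for HOST:PORT.
# Usage: check_port HOST PORT
check_port() {
    if port_open "$1" "$2"; then
        echo "$1: port $2 reachable"
    else
        echo "$1: port $2 NOT reachable"
    fi
}
```

For the case in this thread, that would be `check_port esx3 8042` run from esx1.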



  • 4.  RE: HA Error on one out of three hosts

    Posted May 18, 2009 03:05 PM

    To add: a common error I have seen is the host name being mistyped when configuring the DNS information for your ESX hosts.




  • 5.  RE: HA Error on one out of three hosts
    Best Answer

    Posted May 18, 2009 03:40 PM

    When the FQDN has been configured correctly, I sometimes simply uninstall vpxa from the problematic ESX host, disconnect and remove the host, and then add it back to the cluster.




  • 6.  RE: HA Error on one out of three hosts

    Posted May 18, 2009 04:11 PM

    @Troy: Yes, I am using two resource pools, one for production and one for development VMs. FT_HOSTS does not list esx3, but it does list esx2 correctly.

    @NWhiley and @weistein5: I checked the name server setting and the DNS entries on the domain servers. All are correct. I also checked that the servers can resolve and ping each other.

    There should not be any firewall blocking communication between the two servers; however, I will double-check this. I will also try athlon_crazy's suggestion.



  • 7.  RE: HA Error on one out of three hosts

    Posted May 18, 2009 04:26 PM

    Have you tried just right-clicking the host in question and choosing Reconfigure for HA?



  • 8.  RE: HA Error on one out of three hosts

    Posted May 18, 2009 05:34 PM

    @Troy: Yes, I have tried that as well and it does not fix the error.

    @athlon_crazy: Your suggestion fixed the problem, thank you.

    For future reference, here is what I did to fix the error.

    1. Disconnect and remove the host from the cluster and vCenter

    2. Run the following commands:

    /etc/init.d/mgmt-vmware stop

    /etc/init.d/vmware-vpxa stop

    rpm -qa | grep vpxa

    rpm -e VMware-vpxa-2.5.0-147633

    /etc/init.d/mgmt-vmware start

    3. Reconnect the host and add it back to the cluster. No errors!

    Strange, but all I care about is that it worked. Thanks.
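
For anyone repeating step 2, the commands can be wrapped in a small script, sketched below. It uses only the commands listed above, but looks up the installed vpxa package name instead of hard-coding the build number. The function names and the `DRY_RUN` switch are illustrative additions, not part of any VMware tool.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List installed packages whose name contains "vpxa".
list_vpxa_packages() {
    rpm -qa | grep vpxa
}

# Stop the management services, remove vpxa, restart the management agent.
remove_vpxa() {
    run /etc/init.d/mgmt-vmware stop
    run /etc/init.d/vmware-vpxa stop
    for pkg in $(list_vpxa_packages); do
        run rpm -e "$pkg"
    done
    run /etc/init.d/mgmt-vmware start
}
```

Running `DRY_RUN=1 remove_vpxa` first prints the commands without executing them. The host still has to be removed from vCenter beforehand and added back afterwards, as in steps 1 and 3.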