VMware Cloud Community
MAHC
Enthusiast

ESX 3.5 U2 HA Error

I have created a cluster for my View clients and have 2 ESX 3.5 U2 systems licensed with an ESX VDI license. I added both systems to the cluster and started getting errors saying "Configuration Issues: HA agent on SYSTEM NAME in cluster SYSTEM NAME in DOMAIN NAME has an error." What can I do to resolve this? I have removed the systems from the cluster, but they show as disconnected and I am unable to reconnect them. Any thoughts?

25 Replies
Troy_Clavell
Immortal

HA, in particular in 3.5, is heavily reliant on name resolution. Ensure you have all your names set up correctly. Make sure your /etc/hosts file has the IP, FQDN, and short name. Also check your FT_HOSTS file.

... and review this KB

http://kb.vmware.com/kb/1003691
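
One way to check the hosts-file advice above is a small helper that looks for a line carrying the IP, the FQDN, and the short name together. This is only a sketch: the function name check_hosts_entry is made up for illustration, and the address and names you pass in are whatever applies to your own hosts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE has a non-comment line whose first
# field is IP and whose remaining fields include both FQDN and SHORT.
# Usage: check_hosts_entry FILE IP FQDN SHORT
check_hosts_entry() {
    hosts_file=$1; ip=$2; fqdn=$3; short=$4
    awk -v ip="$ip" -v fqdn="$fqdn" -v short="$short" '
        { sub(/#.*/, "") }          # strip trailing comments
        $1 == ip {
            has_fqdn = 0; has_short = 0
            for (i = 2; i <= NF; i++) {
                if ($i == fqdn)  has_fqdn = 1
                if ($i == short) has_short = 1
            }
            if (has_fqdn && has_short) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$hosts_file"
}
```

For example, `check_hosts_entry /etc/hosts 10.0.0.11 esx01.example.com esx01 || echo "entry missing"` (placeholder values) could be run once per cluster member on each host.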

MAHC
Enthusiast

I went through and checked everything and found that I was supposed to delete the FT_HOSTS file, and that's what I forgot. But I still have the message "HA agent on HOST in cluster HOST in DOMAIN has an error" under the Summary tab in my Infrastructure Client. How do I remove that as well? By removing the FT_HOSTS file I was able to reconnect my systems to VirtualCenter and get them out of the cluster.

Thanks for the help.

Troy_Clavell
Immortal

Right-click on the host in question and choose "Reconfigure for VMware HA". There should be an FT_HOSTS file under /etc/opt/vmware/aam/.

MAHC
Enthusiast

Does the device have to be in an HA cluster for this option to appear? It is currently not in any cluster and I am not seeing that option when I right-click on the ESX host.

Troy_Clavell
Immortal

Yes, it needs to be added back into an HA cluster. If you are going to run a standalone machine, you can remove the HA software; that error should then go away.

http://vmwaretips.com/wp/2008/10/20/advanced-settings-for-vmware-ha/

Cruicer
Enthusiast

try this...

  • Disable HA/DRS from the cluster

  • Disconnect and remove the troubled ESX host from VC (this will not impact the VMs, only your ability to manage them via the VI Client)

  • Open a SSH session to the troubled ESX host (Putty)

  • Remove the VPXA Agent

    • Stop the service: type "service vmware-vpxa stop" from the console

    • Type "rpm -e vmware-vpxa" from the console

  • Re-add your ESX host to the cluster and re-enable HA/DRS

I had a similar issue, but with DRS... it could be something screwy with your vpxa agent, assuming you are 100% sure of all your DNS settings.

MAHC
Enthusiast

I got to this step and it tells me that the package is not installed: Type "rpm -e vmware-vpxa" from the console

Overall, I don't want this to be a standalone device; I am looking to get it into a cluster for VMware View clients. When I did remove the software the error did go away, but I still couldn't get it to add the HA software back properly...
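
rpm -e only works when it is given the name of a package that is actually installed, and RPM package names are case-sensitive, so a "package is not installed" message can simply mean the name typed does not match. A minimal sketch for looking the name up first, assuming the installed package has "vpxa" somewhere in its name (the helper name find_installed_pkg is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: list installed RPM packages whose name contains
# PATTERN, ignoring case. Returns non-zero if nothing matches.
# Usage: find_installed_pkg PATTERN
find_installed_pkg() {
    rpm -qa | grep -i -- "$1"
}
```

Running `find_installed_pkg vpxa` on the service console should print the exact package name, if one is installed, which can then be passed to rpm -e.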

Troy_Clavell
Immortal

Add your host back into the HA cluster; it should configure HA for you automatically.

MAHC
Enthusiast

Now I am getting the error: "Unable to contact the specified host. This might be because the host is not available on the network, there's a network configuration problem, or the management services on the host are not responding." I have tried restarting the host and starting the services manually with no success. Are there any other ways to re-add the host?

Troy_Clavell
Immortal

restart hostd one more time.

service mgmt-vmware restart
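
If more than one management service needs bouncing, the restart can be wrapped in a small loop. This is a sketch only: restart_services is a made-up helper name, and the service names used with it are the two that appear in this thread (mgmt-vmware and vmware-vpxa).

```shell
#!/bin/sh
# Hypothetical helper: restart each named service in the order given and
# stop at the first one that fails.
# Usage: restart_services NAME...
restart_services() {
    for svc in "$@"; do
        echo "Restarting $svc"
        service "$svc" restart || { echo "Failed: $svc" >&2; return 1; }
    done
}
```

For example: `restart_services mgmt-vmware vmware-vpxa`.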

MAHC
Enthusiast

I still get the same error.....

MAHC
Enthusiast

I added the host as a standalone host and it was re-added to VirtualCenter. I have re-added it to the DRS cluster and it is working fine. I want to make the same changes to my other ESX server, then enable HA on the cluster again and see if it works.

Troy_Clavell
Immortal

good stuff!!

Troy_Clavell
Immortal

let me know how it all turns out.

please consider awarding points for any "correct" or "helpful" answers.

MAHC
Enthusiast

DRS is still working fine, but I am still getting the yellow box in the Summary tab that says "Configuration Issues: HA agent on ESX SERVER in cluster NAME in DATA CENTER has an error." When I activate HA in the cluster, the Recent Tasks area in the Infrastructure Client shows the errors (ESX SERVER 1: An error occurred while communicating with the remote host.) (ESX SERVER 2: An error occurred during the configuration of the HA Agent on the host).

What do I do from here?

Troy_Clavell
Immortal

Have you checked your FT_HOSTS file? Now that the host is back in the cluster you should have the file, and it should contain all ESX hosts that are in the cluster.

Also, what happens when you right click on the host and reconfigure for HA?

MAHC
Enthusiast

On my second server I am unable to find the FT_HOSTS file in the 2 locations that are mentioned in the documentation (/etc/FTHOSTS and /etc/opt/vmware/aam/FT_HOSTS). Also, when I do the reconfigure I get the error "An error occurred during configuration of the HA Agent on the host" in the Tasks pane.
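
Checking several candidate locations by hand is easy to get wrong, so here is a minimal sketch that reports the first one that exists. The helper name first_existing_file is made up for illustration; the paths to pass in are the ones quoted in the post above.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that exists as a regular
# file. Returns non-zero if none of them exists.
# Usage: first_existing_file PATH...
first_existing_file() {
    for path in "$@"; do
        if [ -f "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    return 1
}
```

For example: `first_existing_file /etc/FTHOSTS /etc/opt/vmware/aam/FT_HOSTS || echo "FT_HOSTS not found"`.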

Troy_Clavell
Immortal

I don't know what the ramifications of just copying the FT_HOSTS file from another server in your cluster would be, but that file should be present if HA is running.

What happens if you do a service vmware-aam status? You should see it running. If not, try to start it by doing a service vmware-aam start.
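
The status-then-start check can be sketched as one helper. It assumes the init script's status action exits non-zero when the service is not running, which is a common convention but not something this thread confirms for vmware-aam; if that does not hold, read the status output instead. The name ensure_running is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: start a service only if its status check fails.
# Assumption: "service NAME status" returns non-zero when not running.
# Usage: ensure_running NAME
ensure_running() {
    svc=$1
    if service "$svc" status >/dev/null 2>&1; then
        echo "$svc is running"
    else
        echo "$svc is not running, starting it"
        service "$svc" start
    fi
}
```

For example: `ensure_running vmware-aam`.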

MAHC
Enthusiast

The FT_HOSTS file shows the management IP address and the iSCSI IP address of the host itself, and it is there on both servers. I must have missed it before... Also, when I check the status, the service shows as running both when the host is in the cluster with HA active and when it has been removed.
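
Since the file is expected to list every host in the cluster, a quick way to see which ones are absent is to compare it against the addresses or names you expect. This sketch assumes FT_HOSTS is plain text with whitespace-separated entries, which the thread does not spell out; missing_entries is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every expected entry that does not appear in
# FILE as a whitespace-separated field. Returns non-zero if any is missing.
# Assumption: FILE is plain text with whitespace-separated entries.
# Usage: missing_entries FILE ENTRY...
missing_entries() {
    file=$1; shift
    missing=0
    for entry in "$@"; do
        if ! awk -v e="$entry" '
                { for (i = 1; i <= NF; i++) if ($i == e) f = 1 }
                END { exit(f ? 0 : 1) }
            ' "$file"; then
            echo "$entry"
            missing=1
        fi
    done
    return $missing
}
```

For example, `missing_entries /etc/opt/vmware/aam/FT_HOSTS 10.0.0.11 10.0.0.12` (placeholder addresses) would print whichever of the two is not in the file.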
