VMware Cloud Community
dylanebner
Contributor

HA Agent error when adding ESX 3.0.1, 32039 host to VirtualCenter 2.5 84767 Cluster

I am trying to add an ESX 3.0.1 host to a cluster on VirtualCenter 2.5 and I am getting the "HA agent on xxxxxx in cluster xxxx in xxx had an error" message. I have checked DNS names and connectivity between the hosts, and everything appears to be functioning. I already have an ESX 3.5 server joined to the cluster and it is working fine. Funny thing is, when I went to check the /opt/LGTOaam512 folder for the logs on the ESX host, I could not find the folder. Am I looking in the right place? I cannot find the folder on the working ESX 3.5 server either. All I have in my /opt folder is a dell folder, a navisphere folder and a vmware folder.

I know that VirtualCenter can manage the ESX 3.0.1 host because I can connect it outside a cluster.

Any help would be appreciated.

Thanks

29 Replies
dylanebner
Contributor

Well, not really. At least this is my experience so far. The compatibility guide says it can be done with a couple of patches. I have two of the three patches applied but I still cannot get HA working. Everything else works fine: DRS, clusters, resource pools.

dylanebner
Contributor

OK, well I bit the bullet and upgraded the old 3.0.1 host to ESX 3.5 Update 1. Guess what, HA still does not work! I am going crazy here. I can get HA working on one host, but not both. If I disconnect "hostA" and then enable HA, "hostB" will start HA. If I disconnect "hostB" and then configure HA, "hostA" will start HA. Only when I try both at the same time do I have problems. Where should I go from here?

kjb007
Immortal

Check your names and IP addresses. There is some mismatch here.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
dylanebner
Contributor

I have checked many times. I can ping every hostname (lower and upper case spellings) from each of the ESX servers and from the VC server. It just doesn't make sense.

Should hostnames be all lower case?
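
Ping only shows that a name resolves to something. A rough sketch of a stricter check, in Python, is below: it resolves each name as given, in lower case and in upper case, and flags names whose variants fail or disagree. The function names and the `resolve` parameter are made up for illustration, and the host names in the usage line are placeholders.

```python
import socket


def check_names(names, resolve=socket.gethostbyname):
    """Resolve each name as given, in lower case and in upper case.

    Returns {name: {variant: ip_or_None}} so mismatches are easy to spot.
    `resolve` is injectable so the check can be tried without a live DNS.
    """
    results = {}
    for name in names:
        variants = {}
        for variant in (name, name.lower(), name.upper()):
            try:
                variants[variant] = resolve(variant)
            except OSError:  # socket.gaierror is a subclass of OSError
                variants[variant] = None
        results[name] = variants
    return results


def inconsistent(results):
    """Names where a variant failed or the variants gave different addresses."""
    return sorted(
        name
        for name, variants in results.items()
        if None in variants.values() or len(set(variants.values())) != 1
    )
```

Running something like `inconsistent(check_names(["hosta.example.local", "hostb.example.local"]))` on each ESX host and on the VC server would list any name that does not resolve the same way in every spelling.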

kjb007
Immortal

Check the FT_HOSTS file on the problem server (in /etc/opt/vmware/aam). It should contain all the hostnames. Make sure it is correct.
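
If it helps, here is a minimal Python sketch of that comparison. It assumes FT_HOSTS is plain text with host names or addresses separated by whitespace, which is a guess about the format rather than something documented here. The function names are hypothetical, and names are compared by their short (pre-dot) form.

```python
def short_name(name):
    """Lower-case a name and drop the domain part, leaving IPv4 addresses whole."""
    name = name.lower()
    if name.replace(".", "").isdigit():
        return name  # dotted IPv4 address: do not cut at the first dot
    return name.split(".")[0]


def missing_from_ft_hosts(ft_hosts_text, expected_hosts):
    """Return the expected hosts that never appear in the file text."""
    tokens = set()
    for line in ft_hosts_text.splitlines():
        for token in line.split():
            tokens.add(short_name(token))
    return [host for host in expected_hosts if short_name(host) not in tokens]
```

The text would come from reading the FT_HOSTS file in /etc/opt/vmware/aam on each host, and `expected_hosts` would be every host in the cluster. An empty list back means every cluster member is mentioned somewhere in the file.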

-KjB

dylanebner
Contributor

On the problem host I do not have an FT_HOSTS file. I only have: backup.def.run ftbb.prm.bck NicInfo. On the host that works correctly, I do have an FT_HOSTS file, but the host that does not work is not listed.

When I check the logs it looks like the HA services are running, but maybe I am checking the wrong log files.

kjb007
Immortal

I'd hate to have you do this again, but can you try the manual procedure to remove and re-add the host again?

One other thing to check is the /etc/sysconfig/network-scripts/ifcfg-vswif0 file: make sure the MAC address is different on both hosts.
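
A small Python sketch of that comparison follows, in case you want to check more than two hosts. It assumes the file stores the address on a `MACADDR=` or `HWADDR=` line in the usual KEY=value style, so adjust the pattern if your files look different. The function names are made up for illustration.

```python
import re

# Assumed line shape: MACADDR=00:50:56:xx:xx:xx (HWADDR and quotes also accepted).
MAC_LINE = re.compile(r'^\s*(?:MACADDR|HWADDR)\s*=\s*"?([0-9A-Fa-f:]{17})"?\s*$')


def read_mac(ifcfg_text):
    """Return the MAC address from ifcfg file text, lower-cased, or None."""
    for line in ifcfg_text.splitlines():
        match = MAC_LINE.match(line)
        if match:
            return match.group(1).lower()
    return None


def duplicate_macs(configs):
    """configs maps host name -> ifcfg text; returns {mac: [hosts]} for shared MACs."""
    seen = {}
    for host, text in configs.items():
        mac = read_mac(text)
        if mac is not None:
            seen.setdefault(mac, []).append(host)
    return {mac: hosts for mac, hosts in seen.items() if len(hosts) > 1}
```

Feed it the contents of ifcfg-vswif0 copied from each host. Any entry in the result means two service consoles claim the same MAC address.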

-KjB

rbugles
Enthusiast

All of your hosts are added to the VC inventory by name, and not by IP, correct? I am trying to track down the deployment docs, but I seem to recall that in 2.5 HA requires DNS resolution to work; it no longer supports just IP addressing.

This can also be overcome by putting the correct info into the hosts file.
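
For anyone going the hosts file route, here is a rough Python sketch that checks whether every name you care about has an entry. It assumes the standard layout of an address followed by one or more names, with `#` starting a comment. The function names and the sample names are placeholders.

```python
def parse_hosts(text):
    """Map each lower-cased name in hosts-file text to its address (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def unresolved(text, names):
    """Return the names that have no entry in the hosts-file text."""
    mapping = parse_hosts(text)
    return [name for name in names if name.lower() not in mapping]
```

Calling `unresolved(open("/etc/hosts").read(), ["hosta.example.local", "hosta"])` on each ESX host would show which FQDNs or short names still need a line in the file.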

Rob

dylanebner
Contributor

UGHHHH! I should have let sleeping dogs lie. I was using IP addresses. I changed the host that was failing to FQDN and then it started to work. I then changed the host that always worked to FQDN and now it doesn't work! I then tried to unconfigure HA, but now one of the hosts is stuck at 5% unconfiguring HA!

dylanebner
Contributor

Holy CRAP! I went back to check on the stuck config, and now HA is working! Finally!

Thanks everyone. Hopefully it will be stable!
