VMware Cloud Community
lombardia2k
Contributor

HA Agent has an error (timeout) - not a DNS issue

Hello.

I experienced a problem while adding my first two newly-installed ESXi hosts to a VC Cluster (HA+DRS).

The first one adds with no problems: HA configures cleanly and with no errors.

When I try to add the second host, the HA Agent configuration hangs at 86% and after three minutes it times out, returning a message which says that "HA Agent has an error". The error detail also says that configuration failed because the command ft_startup was not completed within 3 minutes.

I already found other threads and posts regarding this issue, but on ESX, not ESXi.

They report DNS problems, but I don't have any DNS issue: I am able to ping all hosts from the ESXi hidden console with the "short" hostname and also with the complete FQDN (hostname.domain). Reverse DNS is also OK (FQDN only, obviously). Hostnames are all OK in /etc/hosts and are all lowercase.
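
The name checks described above (short name, FQDN, and reverse lookup) can also be scripted. What follows is a minimal Python sketch, not something from the original post: the function name `check_name_resolution` and its injectable `resolve`/`reverse` parameters are assumptions made for illustration, and it is meant to run on a machine that uses the same DNS server as the hosts.

```python
import socket


def check_name_resolution(names, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return a list of (name, problem) tuples; an empty list means all checks passed.

    `resolve` and `reverse` default to the standard-library resolver calls and
    are parameters only so the function can be exercised without a live DNS.
    """
    problems = []
    for name in names:
        # Forward lookup: name -> IPv4 address.
        try:
            ip = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems.append((name, f"forward lookup failed: {exc}"))
            continue
        # Reverse lookup: address -> canonical name (first tuple element).
        try:
            canonical = reverse(ip)[0]
        except OSError as exc:  # socket.herror is a subclass of OSError
            problems.append((name, f"reverse lookup of {ip} failed: {exc}"))
            continue
        # Compare only the first label, case-insensitively, so that both
        # "saturn" and "saturn.itt.ferrovienord" match the same reverse record.
        if canonical.lower().split(".")[0] != name.lower().split(".")[0]:
            problems.append((name, f"{ip} reverses to {canonical}"))
    return problems
```

For example, `check_name_resolution(["saturn", "saturn.itt.ferrovienord", "venus", "venus.itt.ferrovienord"])` would return an empty list if every forward and reverse lookup agrees.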

Some information:

Domain name: itt.ferrovienord

DNS: 10.2.10.111 (dhcp-dns.itt.ferrovienord)

VirtualCenter: 10.2.10.100 (virtualcenter.itt.ferrovienord)

ESXi #0: 10.2.30.200 (saturn.itt.ferrovienord)

ESXi #1: 10.2.30.201 (venus.itt.ferrovienord)

Thank you so much for your help!!

Alessandro

11 Replies
weinstein5
Immortal

Welcome to the forums - in VC, is the host name entered correctly for the ESX host? I have seen similar problems when the host name is misspelled on the ESX host.

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
lombardia2k
Contributor

I added the hosts into VC with their FQDN (hostname.domain). Is that right?

lombardia2k
Contributor

You can see the error message attached to this post.

I have now also tried adding the hosts with their short hostname (saturn, venus) instead of the FQDN (hostname.domain), but the result is the same.

lombardia2k
Contributor

Any idea?

I can't go ahead..

Thank you.

weinstein5
Immortal

Have you tried disabling HA on the cluster and re-enabling it - maybe starting with the host that has the error first? It will not affect any of the running VMs. Doing it in a different order will show whether the problem is with this host or something systemic.

lombardia2k
Contributor

Tried a LOT of times but no success. Deleted the cluster, re-created, renamed, renamed machines... (we are not in a production environment, we are only trying VI).. the only thing I haven't tried yet is changing domain name.. because it's impossible for us!

I tried to add "saturn" before "venus" and vice versa... The FIRST one I add goes up without errors, the second one hangs on HA...!

weinstein5
Immortal

Because you get the same error when you reverse the order of creating the HA cluster, I am leaning toward this being a DNS/networking issue. Can you post the DNS settings from the Configuration tab of each host, as well as the networking configuration for your SC ports?

lombardia2k
Contributor

Tomorrow I will post what you requested!

But.. I don't understand what you mean by "configuration for your SC Ports"...!

weinstein5
Immortal

Never mind that - I see you had the IP information in a previous post.

lombardia2k
Contributor

1) First host, saturn.itt.ferrovienord

IP Configuration tab:

IP Address: 10.2.30.200

Subnet mask: 255.255.0.0

Default gateway: 10.2.0.1

DNS configuration tab:

Primary Server: 10.2.10.111

Secondary Server: 0.0.0.0

Hostname: saturn.itt.ferrovienord

Custom DNS Suffixes: itt.ferrovienord

"Test Management Network" returns ALL "OK"

From "unsupported" console access I get:

/etc/resolv.conf

nameserver 10.2.10.111

search itt.ferrovienord

/etc/hosts

127.0.0.1 localhost.localdomain localhost

10.2.10.200 saturn.itt.ferrovienord saturn

nslookup virtualcenter OK

nslookup virtualcenter.itt.ferrovienord OK

nslookup saturn OK

nslookup saturn.itt.ferrovienord OK

nslookup venus OK

nslookup venus.itt.ferrovienord OK

ping virtualcenter OK

ping virtualcenter.itt.ferrovienord OK

ping saturn OK

ping saturn.itt.ferrovienord OK

ping venus OK

ping venus.itt.ferrovienord OK

2) Second host, venus.itt.ferrovienord

IP Configuration tab:

IP Address: 10.2.30.201

Subnet mask: 255.255.0.0

Default gateway: 10.2.0.1

DNS configuration tab:

Primary Server: 10.2.10.111

Secondary Server: 0.0.0.0

Hostname: venus.itt.ferrovienord

Custom DNS Suffixes: itt.ferrovienord

"Test Management Network" returns ALL "OK"

From "unsupported" console access I get:

/etc/resolv.conf

nameserver 10.2.10.111

search itt.ferrovienord

/etc/hosts

127.0.0.1 localhost.localdomain localhost

10.2.10.201 venus.itt.ferrovienord venus

nslookup virtualcenter OK

nslookup virtualcenter.itt.ferrovienord OK

nslookup saturn OK

nslookup saturn.itt.ferrovienord OK

nslookup venus OK

nslookup venus.itt.ferrovienord OK

ping virtualcenter OK

ping virtualcenter.itt.ferrovienord OK

ping saturn OK

ping saturn.itt.ferrovienord OK

ping venus OK

ping venus.itt.ferrovienord OK
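
One check the list above does not cover is whether each /etc/hosts entry agrees with the address shown on the IP Configuration tab. The following is a minimal Python sketch of that comparison, not part of the original post: the helper names `parse_hosts` and `find_hosts_mismatches` are hypothetical, and the sample values are copied as typed in this post for saturn.

```python
def parse_hosts(text):
    """Map each (lowercased) name in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), ip)  # first entry wins
    return mapping


def find_hosts_mismatches(hosts_text, expected):
    """Return (name, hosts_ip, expected_ip) for names whose hosts entry differs."""
    mapping = parse_hosts(hosts_text)
    return [
        (name, mapping[name.lower()], ip)
        for name, ip in expected.items()
        if name.lower() in mapping and mapping[name.lower()] != ip
    ]


# Values as pasted in this post for the first host (saturn).
saturn_hosts = """127.0.0.1 localhost.localdomain localhost
10.2.10.200 saturn.itt.ferrovienord saturn
"""
saturn_expected = {
    "saturn.itt.ferrovienord": "10.2.30.200",  # from the IP Configuration tab
    "saturn": "10.2.30.200",
}
print(find_hosts_mismatches(saturn_hosts, saturn_expected))
```

With the values exactly as pasted, this reports that saturn's /etc/hosts address (10.2.10.200) differs from its configured address (10.2.30.200); the venus listing shows the same pattern (10.2.10.201 versus 10.2.30.201). Whether that reflects the real files or a typo made while writing the post cannot be determined from the thread.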

Thank you for your help.

cnhianda
Contributor

Any resolution on this?

Having the same issue with U3 on ESXi.. HA will not configure
