VMware Cloud Community
matteo78
Contributor

HA Failure

HA doesn’t work correctly.

I have a two-node ESX cluster. When one ESX server goes down, the other node tries to reconfigure HA, but it fails with this message: “Unable to communicate with the remote host, since it is disconnected”

In vpxa.log I see these error messages:

“Authd error: 514 Error connecting to hostd-vmdb instance”

“Failed to connect to host :902. Check that authd is running correctly (lib/connect error 11)” .

Can anyone help me?

Thanks

8 Replies
masaki
Virtuoso

Are the VirtualCenter ports (902, 903, 27000, 27010, ...) open?

Refer to the manual for a complete list.

matteo78
Contributor

The VC isn't behind a firewall, and the ports are open.

matteo78
Contributor

The ESX server cannot ping the service console gateway, because the firewall (which is the gateway) has a rule that drops ICMP.

After I created a rule that allows ICMP from the ESX servers to the firewall, HA works correctly.

masaki
Virtuoso

So my answer was correct: it was a firewall ports problem.

Please assign points.

matteo78
Contributor

The ports on the ESX servers and the VC were all open...

The problem wasn't the ports, but the firewall dropping ICMP.

masaki
Virtuoso

ICMP is a connectionless protocol, and your error was a connection problem:

“Authd error: 514 Error connecting to hostd-vmdb instance”

“Failed to connect to host :902. Check that authd is running correctly (lib/connect error 11)” .

So I think there was some other port that needed to be opened, but do as you like.

I have already wasted too much time.

bflynn0
Expert

Matteo, just so you understand the reasoning behind needing ICMP responses from the service console gateway: when an HA node cannot communicate with the other nodes in the cluster, it pings the default gateway. If the gateway does not respond, the node assumes it is no longer on the network and isolates itself (by default powering down all of its VMs). So in a two-node HA cluster where the gateway does not respond to ICMP, if one node fails, the other thinks it is off the network as well, and the whole environment goes down.

To address this, you can either enable ICMP responses to the service consoles (as you have done), or you can specify another address for the HA nodes to try when verifying their network connectivity (aka the isolation address).

To set the isolation address, go into the "Advanced Options" within the VMware HA settings, enter das.isolationaddress for the option, and enter the desired IP address for the value. Click OK to save the settings, then reconfigure HA on all of the nodes to update their configuration.

Now, an HA node will ping the das.isolationaddress when it can no longer communicate with the other HA cluster nodes. If it gets a response it knows it's on the network, otherwise it goes into isolation mode.
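
If it helps to see the decision as code, here is a rough Python sketch of the isolation check as described in this post. It is only an illustration of the reasoning, not VMware's actual implementation: the function name, the callbacks, the host name and the IP addresses are all made up for the example, and it follows only what is stated here (ping the isolation address if one is set, otherwise the default gateway).

```python
from typing import Callable, Iterable, Optional


def is_isolated(
    peers: Iterable[str],
    default_gateway: str,
    can_reach_peer: Callable[[str], bool],
    ping: Callable[[str], bool],
    isolation_address: Optional[str] = None,
) -> bool:
    """Return True if this HA node should consider itself isolated.

    Hypothetical model of the behaviour described in the post above.
    """
    # If any other cluster node is reachable, the node is on the network.
    if any(can_reach_peer(peer) for peer in peers):
        return False
    # No peer reachable: ping the configured isolation address if there
    # is one, otherwise fall back to the default gateway.
    target = isolation_address or default_gateway
    return not ping(target)


# Scenario from this thread (addresses are invented for the example):
# the only peer is down and the gateway silently drops ICMP.
answers_ping = {"10.0.0.1": False, "10.0.0.50": True}


def peer_down(host: str) -> bool:
    return False


def ping(addr: str) -> bool:
    return answers_ping.get(addr, False)


print(is_isolated(["esx2"], "10.0.0.1", peer_down, ping))  # True
print(is_isolated(["esx2"], "10.0.0.1", peer_down, ping,
                  isolation_address="10.0.0.50"))          # False
```

The first call is the original problem: the surviving node gets no ping reply from the gateway and declares itself isolated. The second shows why pointing the isolation address at something that does answer ICMP avoids that.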

masaki
Virtuoso

Well, bflynn, your response is sharp and deep.

I thank you for it because you helped me to reconsider my answers.

I must apologize.

I was wrong. The connection problem was a side effect, not the cause.
