I'm trying to enable HA on my vROps cluster and it keeps failing with this error:
Communication Error: Timeout reached while trying to communicate with the server.
Try refreshing the UI manually
I don't know which log to look in to find out why this is happening, and I'd obviously like to resolve it. The cluster has one remote collector and three data nodes.
Are all of the data/master nodes in the same physical datacenter and the same L2 network? Thanks
The three data nodes are in the same DC on the same subnet; the RC is in another data center in another state.
I've seen that before in scenarios where there is a firewall in place between nodes, as HA functionality requires an additional firewall rule (TCP 5433). I've also seen it in scenarios with high network latency, and in scenarios where DNS lookups are failing and the nodes can't communicate. I'd also make sure all the nodes have NTP configured and are in sync.
If all the pre-reqs are met (above), I'd open a ticket with GSS so they can have a look.
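For anyone landing here with the same timeout, here is a rough sketch of how the DNS and firewall checks above could be scripted. It is only an illustration, not a VMware tool: the function names are made up, the port is the TCP 5433 mentioned above (confirm it against the ports documentation for your version), and the connect time is just a crude stand-in for network latency. It does not check NTP.

```python
import socket
import time

# Port taken from the post above; treat it as an assumption and verify it
# against the documentation for your vROps version.
HA_PORT = 5433


def resolve(hostname):
    """Return the sorted IP addresses for hostname, or [] if DNS lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_check(host, port=HA_PORT, timeout=3.0):
    """Attempt a TCP connection; return (connected, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            connected = True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        connected = False
    return connected, time.monotonic() - start


def check_nodes(nodes, port=HA_PORT):
    """Run the DNS and TCP checks for each node name and collect the results."""
    results = {}
    for node in nodes:
        addresses = resolve(node)
        if addresses:
            connected, elapsed = tcp_check(node, port)
        else:
            # No point attempting a connection if the name does not resolve.
            connected, elapsed = False, 0.0
        results[node] = {
            "dns": addresses,
            "tcp_ok": connected,
            "connect_s": round(elapsed, 3),
        }
    return results
```

Calling check_nodes() with the FQDNs of your own nodes returns one entry per node. An empty "dns" list points at name resolution, and "tcp_ok" being False with a connect time close to the timeout usually suggests a firewall silently dropping packets rather than a closed port. The result only reflects the path from the machine where the script runs, so it is most useful when run from each node in turn.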
I did open a ticket on it, and posted here to see if anyone had run into similar issues. What finally worked was taking the cluster offline and then enabling HA. VMware can't really say why that worked, as it doesn't make sense, but it did, so this is resolved now.
They did say it was timing out on the token from the master node, go figure.
Ya - I'm not a big fan of HA. You need a really compelling use case to enable it IMO given the added headache+complexity.
Sadly I'm just the worker bee. They don't always listen to my suggestions, so it was out of my hands.