vladimir1974
Enthusiast

vCenter HA with network addresses in separate subnets - two DNS A records cause delay/timeout for clients

Hi,

We've set up vCenter 6.5 HA with network addresses in separate subnets. HA itself works (everything is green and failover works).

But every single client (web browsers, monitoring, vROps, SSH clients...) has trouble connecting to vCenter.

Following Deploying vCenter High Availability with network addresses in separate subnets (2148442) | VMware KB, we created two DNS A records for the vCenter FQDN: one points to the external IP of the active node and the other to the external IP of the passive node. Because the passive node's external IP is not up, clients hit a delay or even a timeout, depending on which IP they try first.

Is this already known to VMware? I doubt that we are the first customer with vCenter HA with network addresses in separate subnets.

Kind regards,

Vladimir

9 Replies
shwethalakshman
Contributor

Hi,

Can you please check the latency between the 2 hosts?

Try this:

Log in to Host 1 and ping the IP of Host 2, or vice versa.

Per the recommendation, VCHA is supported on the same or different hosts, in the same or different subnets, as long as latency is less than 10 ms.
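If ICMP ping is not convenient, the check above can be roughly approximated from a script. The sketch below is only an illustration: it times a TCP connect as a stand-in for round-trip latency, and the helper names `tcp_rtt_ms` and `within_latency_limit`, the port 443, and the use of the median are all assumptions, not anything from the VMware paper.

```python
import socket
import statistics
import time


def tcp_rtt_ms(host, port=443, timeout=2.0):
    """Time one TCP connect to host:port, in milliseconds.

    A TCP handshake takes about one round trip, so this is only a
    rough proxy for what ping would report. Port 443 is an assumption.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def within_latency_limit(samples_ms, limit_ms=10.0):
    """Return True if the median of the samples is below limit_ms."""
    if not samples_ms:
        raise ValueError("no latency samples given")
    return statistics.median(samples_ms) < limit_ms
```

For example, collect a handful of samples with `[tcp_rtt_ms("host2-ip") for _ in range(5)]` (where `host2-ip` is a placeholder) and pass the list to `within_latency_limit`.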

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vcha65-perf.pdf

Regards,

Shwetha

vasan22in
Enthusiast

Hello,

Refer to the section below. If the nodes are in different subnets, you need to route the HA network properly.

Additional Routing Configurations

If the Active, Passive, and Witness virtual machines are on different subnets/networks, additional configuration is required.

In order to route the HA network traffic properly, you must set the static routes manually in the [Route] section of /etc/systemd/network/10-eth1.network.

After cloning the Passive and Witness virtual machines, set the static [Route] on the Active, Passive, and Witness nodes:

  1. Active node: Set the [Route] to establish communication to Passive and Witness nodes.
  2. Passive node: Set the [Route] to establish communication to Active and Witness nodes.
  3. Witness node: Set the [Route] to establish communication to Active and Passive nodes.

    For example:


  4. Restart the eth1/HA network using this command: systemctl restart systemd-networkd
  5. Ensure that pinging between the Active, Passive, and Witness nodes succeeds for the HA network (eth1 IP address).
  6. Return to the vCenter HA wizard and click Finish.

Deploying vCenter High Availability with network addresses in separate subnets (2148442) | VMware KB

Please consider marking this answer "correct" or "helpful" if you think your query has been answered correctly. Thanks, Srini
vladimir1974
Enthusiast

Latency is around 6ms.

Again, vCenter HA is up and running. Everything is green and failover is working.

The problem is with two DNS A records:

$ dig vcenter.example.com +short

1.1.1.1 (current active node)

2.2.2.2 (current passive node)

Clients connect to one of these two IPs at random, so every time they try the passive node's IP, a delay or timeout occurs.
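To make the behaviour concrete, here is a minimal Python sketch of what a well-behaved client would have to do with the two A records: resolve every address, then try each one with a short timeout until one answers. The helper names (`resolve_ipv4`, `tcp_connect`, `first_reachable`), the port 443, and the 2-second timeout are assumptions for illustration; most real clients just take the first address they get and wait for the OS default timeout.

```python
import socket


def resolve_ipv4(hostname, port=443):
    """Return every IPv4 address the resolver gives for hostname, in order."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for info in infos:
        address = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if address not in addresses:
            addresses.append(address)
    return addresses


def tcp_connect(address, port, timeout):
    """Return True if a TCP connection to address:port opens within timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def first_reachable(addresses, port=443, timeout=2.0, connect=tcp_connect):
    """Try each address in order and return the first one that accepts."""
    for address in addresses:
        if connect(address, port, timeout):
            return address
    return None
```

Calling `first_reachable(resolve_ipv4("vcenter.example.com"))` would return the active node's IP, but only after waiting out the timeout whenever the passive IP happens to come first in the DNS response.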

Regards,

Vladimir

motowomen143
Contributor

Hi,

1) While deploying the VCSA appliance, did you use an IP address or an FQDN?

2) You and your clients should not connect to the passive node while the active node is up and running.

vladimir1974
Enthusiast

1) We followed the instructions from the VMware KB, so we used IPs where required and the FQDN where required.

2) Clients don't know which IP belongs to the active node and which to the passive node. They get two IPs from the DNS server and pick one of them (randomly).

    If you don't create both DNS A records (as required by the VMware KB), you can't configure/provision vCenter HA with advanced routing in the first place.

Madhuin
VMware Employee

I think the simple solution is to make clients use the hostname instead of the IP, since the hostname is unique and the VCSA is also configured with an FQDN.

vladimir1974
Enthusiast

Hostname vcenter.example.com has two DNS A records:

1.1.1.1

2.2.2.2

Clients use the hostname (vcenter.example.com), query the DNS servers for its IPs, and get both IPs in the response: 1.1.1.1 and 2.2.2.2.

One is up (active) and the other one is down (passive).

I think the VMware documentation says that two DNS A records are required.

Anyway, if we have just one A record, we'll have to change it manually (in DNS) on every failover, which makes failover manual rather than automated.

Madhuin
VMware Employee

Hi vladimir1974,

Can you please give me the output of cat /etc/vmware/systemname_info.json on the active node and the passive node?

vladimir1974
Enthusiast

Sure, it's:

{"status": "success", "PNID": "vcenter.example.com", "PNIDChangeAllowed": true}

on both the active and the passive node.
