HA configuration times out when adding a new host

I would like to share my recent experience troubleshooting an HA issue. With the default settings, adding a new host to the existing cluster, or reconfiguring HA on one of the existing hosts in the cluster, caused the HA operation to time out. To work around the issue, we had to either disable and re-enable HA at the cluster level or reconfigure HA on the master node.

vpxd logs

vpxd-19.log:2017-09-26T01:17:53.673Z info vpxd[7FA6DC311700] [Originator@6876 sub=vpxLro opID=lro-53-79f246b8-02] [VpxLRO] -- BEGIN task-147951 -- svrau-esx03.jbhi-fi.local -- DasConfig.ConfigureHost --

vpxd-19.log:2017-09-26T01:17:53.673Z info vpxd[7FA6DC311700] [Originator@6876 sub=MoHost opID=lro-53-79f246b8-02] [HostMo::UpdateDasState] VC state for host host-220 (initialized -> uninitialized), FDM state (Live -> Live), src of state (null -> null)

vpxd-19.log:2017-09-26T01:17:53.965Z info vpxd[7FA6DC311700] [Originator@6876 sub=DAS opID=lro-53-79f246b8-02] [VpxdDasConfigLRO::ConfigureResources] Skipping aam RP config for ESX 6+ host

vpxd-19.log:2017-09-26T01:17:54.389Z info vpxd[7FA6DC311700] [Originator@6876 sub=HostUpgrader opID=lro-53-79f246b8-02] [VpxdHostUpgrader] Fdm on host-220 has build 5973321. Expected build is 6671409 - will upgrade

vpxd-19.log:2017-09-26T01:17:54.578Z info vpxd[7FA6DC311700] [Originator@6876 sub=HostAccess opID=lro-53-79f246b8-02] Using vpxapi.version.version11 to communicate with vpxa at host svrau-esx03.jbhi-fi.local

vpxd-19.log:2017-09-26T01:21:41.970Z warning vpxd[7FA6DC698700] [Originator@6876 sub=VpxProfiler opID=lro-53-79f246b8-02-TaskLoop-dc7c952] TaskLoop [TotalTime] took 222225 ms

vpxd-19.log:2017-09-26T01:21:42.104Z info vpxd[7FA6DC311700] [Originator@6876 sub=DAS opID=lro-53-79f246b8-02] [VpxdDasConfig::PushConfigToFDM] pushed config version 127  to host [vim.HostSystem:host-220,svrau-esx03.jbhi-fi.local] (cluster [vim.ClusterComputeResource:domain-c31,AU001_CLUSTER_GENERAL])

vpxd-19.log:2017-09-26T01:23:42.120Z error vpxd[7FA6DC311700] [Originator@6876 sub=DAS opID=lro-53-79f246b8-02] Timed out waiting for election to complete or for host to join existing master

vpxd-19.log:2017-09-26T01:23:42.120Z error vpxd[7FA6DC311700] [Originator@6876 sub=DAS opID=lro-53-79f246b8-02] EnableDAS failed on host [vim.HostSystem:host-220,svrau-esx03.jbhi-fi.local]: N3Vim5Fault8Timedout9ExceptionE(vim.fault.Timedout)

vpxd-19.log:2017-09-26T01:23:42.121Z info vpxd[7FA6DC311700] [Originator@6876 sub=MoHost opID=lro-53-79f246b8-02] [HostMo::UpdateDasState] VC state for host host-220 (initialized -> init error), FDM state (UNKNOWN_FDM_HSTATE -> UNKNOWN_FDM_HSTATE), src of state (null -> null)

vpxd-19.log:2017-09-26T01:23:42.184Z info vpxd[7FA6DC311700] [Originator@6876 sub=DAS opID=lro-53-79f246b8-02] [VpxdDasConfigLRO::Cleanup] Number of unprotected vms: 24

vpxd-19.log:2017-09-26T01:23:42.184Z warning vpxd[7FA6DC311700] [Originator@6876 sub=VpxProfiler opID=lro-53-79f246b8-02] VpxLro::LroMain [TotalTime] took 348510 ms

vpxd-19.log:2017-09-26T01:23:42.184Z info vpxd[7FA6DC311700] [Originator@6876 sub=vpxLro opID=lro-53-79f246b8-02] [VpxLRO] -- FINISH task-147951

vpxd-19.log:2017-09-26T01:23:42.184Z info vpxd[7FA6DC311700] [Originator@6876 sub=Default opID=lro-53-79f246b8-02] [VpxLRO] -- ERROR task-147951 -- svrau-esx03.jbhi-fi.local -- DasConfig.ConfigureHost: vim.fault.Timedout:

vpxd-19.log:2017-09-26T01:24:15.373Z warning vpxd[7FA6DE17B700] [Originator@6876 sub=VpxProfiler opID=lro-53-79f246b8-02-EventManagerProcessJobs-4606457a] EventManagerProcessJobs [TotalTime] took 33248 ms

From the above snippets, it is evident that the HA configuration is timing out because of underlying network latency between the ESXi hosts. This issue can be fixed by increasing the default timeout value for FDM (Fault Domain Manager) in the vCenter Server advanced settings.
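To make the timing in the log easier to see, here is a minimal Python sketch that computes the gaps between the key events. The timestamps are copied from the vpxd lines above; the dictionary keys and helper names are my own labels for illustration, not anything vpxd emits.

```python
from datetime import datetime

# Timestamps copied from the vpxd log lines above (opID lro-53-79f246b8-02).
# The key names are illustrative labels, not vpxd terminology.
EVENTS = {
    "task_begin": "2017-09-26T01:17:53.673Z",     # [VpxLRO] -- BEGIN task-147951
    "config_pushed": "2017-09-26T01:21:42.104Z",  # PushConfigToFDM pushed config version 127
    "timed_out": "2017-09-26T01:23:42.120Z",      # Timed out waiting for election to complete
    "task_finish": "2017-09-26T01:23:42.184Z",    # [VpxLRO] -- FINISH task-147951
}


def parse_ts(text):
    """Parse a vpxd-style UTC timestamp such as 2017-09-26T01:17:53.673Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")


def seconds_between(start, end):
    """Return the elapsed seconds between two named events."""
    return (parse_ts(EVENTS[end]) - parse_ts(EVENTS[start])).total_seconds()


# Time from task start until the config reached the host's FDM agent.
push_delay = seconds_between("task_begin", "config_pushed")
# Time vpxd then waited for the election / join before reporting the timeout.
election_wait = seconds_between("config_pushed", "timed_out")
# Duration of the whole DasConfig.ConfigureHost task.
task_total = seconds_between("task_begin", "task_finish")

print(f"begin -> config pushed : {push_delay:.3f} s")
print(f"config pushed -> timeout: {election_wait:.3f} s")
print(f"begin -> finish         : {task_total:.3f} s")
```

In this log, almost four minutes pass before the config is pushed, and the timeout error is logged about two minutes after that. The total is within a millisecond of the 348510 ms that vpxd itself reports for VpxLro::LroMain.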

Applying a VMware HA customization

Using the vSphere Web Client

Log in to VMware vSphere Web Client.

Go to Home > vCenter > Clusters.

Under Object, click on the cluster you want to modify.

Click Manage.

Click vSphere HA.

Click Edit.

Click Advanced Options.

Click Add. In the Option field, enter config.vpxd.das.fdmWaitForUpdatesTimeoutSec, and in the Value field, enter 60.

Deselect Turn ON vSphere HA.

Click OK.

Wait for HA to unconfigure, click Edit and check Turn ON vSphere HA.

Click OK and wait for the cluster to reconfigure.

Last update: 03-15-2018 12:41 PM