VMware Cloud Community
lvaibhavt
Hot Shot

Unable to trigger APD HA response (VM restart on other hosts) in lab

Hi Team,

I am trying to trigger the APD HA response, i.e. VMs should restart on other hosts in the cluster that still have access to the storage, but it is not working. Has anyone managed to trigger the APD HA response in a lab?

I have disabled DRS and am running only HA.

The events show that an APD was seen, but the VMs do not fail over to other hosts.

Any suggestions? Thanks.

6 Replies
depping
Leadership

How are you triggering the APD? Does the APD get declared according to the log files? And how is HA configured?

lvaibhavt
Hot Shot

Hi Duncan,

Thank you for replying.

I have attached a snippet of the cluster config. APD and PDL are both enabled. I see APD/PDL messages on the ESXi host's Events tab.

My lab runs in VMware Workstation with Openfiler as the storage, and the hosts use the software iSCSI adapter.

To generate an APD/PDL, I have tried the steps below:

1) Ran the command below, removed the iSCSI NICs from the iSCSI port binding, and disabled the management NIC used to reach the storage (Openfiler). Did not help.
esxcli iscsi session remove -A vmhba65

2) As I am running in VMware Workstation, no VLAN is configured, so I intentionally set VLAN ID 22 to stop storage communication. Did not help. (This step used to work for me in previous vSphere versions.)

3) Without iSCSI port binding, so that Openfiler is reached over the management network, I removed the storage IP from the adapter's dynamic and static discovery entries. Did not help.

Any suggestions on what else I can try here?

lvaibhavt
Hot Shot

Attaching another picture.

lvaibhavt
Hot Shot

I also tried disabling communication from Openfiler to a specific ESXi host. That did not help either.

lvaibhavt
Hot Shot

I am running vSphere 7 GA in my lab

depping
Leadership

Just to be clear: when the connection fails, it takes 140 seconds before an APD is declared. Once the APD is declared, it normally takes another 3 minutes before HA takes action. People tend to forget this: in total, a failover takes place only after a little over 5 (!) minutes (140 s + 180 s = 320 s).
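
For anyone timing this in a lab, the arithmetic above can be sketched in a few lines of Python. This is only an illustration using the two figures quoted in this reply; the constant and function names are made up for the example, and your cluster may be configured with different values.

```python
# Rough timeline for the APD response, using the figures quoted in this thread.
# The names below are illustrative only; they are not vSphere setting names.
APD_DECLARE_AFTER_S = 140   # seconds from loss of connectivity until APD is declared
HA_ACTION_DELAY_S = 180     # the further 3 minutes before HA normally takes action


def seconds_until_ha_acts(apd_declare_after_s: int = APD_DECLARE_AFTER_S,
                          ha_action_delay_s: int = HA_ACTION_DELAY_S) -> int:
    """Return how long after the storage connection fails HA is expected to act."""
    return apd_declare_after_s + ha_action_delay_s


def format_min_sec(total_seconds: int) -> str:
    """Format a number of seconds as minutes and seconds."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} min {seconds} s"


if __name__ == "__main__":
    total = seconds_until_ha_acts()
    print(f"APD declared after {APD_DECLARE_AFTER_S} s")
    print(f"HA expected to act after {total} s ({format_min_sec(total)})")
```

So when testing, keep the storage unreachable for longer than this total before concluding that the failover did not happen.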

You should check FDM.log to see why the failover did not occur. The master node usually has the information you are looking for.
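
If the log is long, a small filter can help narrow it down. The sketch below scans a copy of the log for a few search terms. The terms are guesses chosen for illustration, not the actual message formats that FDM writes, so adjust them to what you see in your own file. The log path is passed in by you (on ESXi hosts it is typically /var/log/fdm.log, but verify that on your build).

```python
import re
import sys
from typing import Iterable, Iterator, List

# Assumed search terms, not real FDM message formats; tune them for your log.
SEARCH_TERMS = re.compile(r"\bapd\b|all paths down|failover", re.IGNORECASE)


def matching_lines(lines: Iterable[str],
                   pattern: "re.Pattern[str]" = SEARCH_TERMS) -> Iterator[str]:
    """Yield every line that contains one of the search terms."""
    for line in lines:
        if pattern.search(line):
            yield line.rstrip("\n")


def scan_log(path: str) -> List[str]:
    """Read a log file and return the lines that match the search terms."""
    # errors="replace" keeps the scan going if the file has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        return list(matching_lines(handle))


if __name__ == "__main__":
    # Usage: python scan_fdm.py <path-to-a-copy-of-the-log>
    if len(sys.argv) > 1:
        for hit in scan_log(sys.argv[1]):
            print(hit)
```

Running it against the log copied from the master node first is probably the quickest way to find the relevant entries.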
