Can you confirm whether your ESG appliances are deployed with HA?
Is your question how to decrease the dead time?
Please also share the NSX version in use.
It's 6.4.5. The appliances are in HA mode, along with the DLR and 3 other ESG appliances that aren't having an issue.
The OSPF settings are configured as recommended on the Perimeter-Gateway and DLR:
Hello interval: 30
Dead interval: 120
Apart from the OSPF configuration, the OneArm-LoadBalancers are exactly the same, with nothing changed from the defaults, and they do not experience the issue.
The Perimeter-Gateway reports that it didn't receive any heartbeats, both appliances show nothing for active/standby, and the file system changes to read-only, so I have to reboot them.
The DLR and OneArm-ESGs keep chugging along, though, and never have an issue.
I also redeployed the appliances, thinking they might be messed up, but a controller failover on my SAN always seems to cause it. The failover takes less than a minute, and the recommended timeouts are configured on all the ESXi hosts.
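As a rough sanity check on the numbers quoted above, here is a minimal Python sketch that compares the failover duration with the OSPF dead interval. The 60-second value is an assumed upper bound for "less than a minute", and the function name is hypothetical. It only compares the timers and says nothing about the Edge HA heartbeats or the read-only file system.

```python
# Timer values quoted in this thread.
HELLO_INTERVAL = 30   # seconds
DEAD_INTERVAL = 120   # seconds

# Assumption: "less than a minute" is taken as at most 60 seconds.
FAILOVER_SECONDS = 60


def outage_within_dead_interval(outage_seconds, dead_interval):
    """Return True if the outage is shorter than the OSPF dead interval."""
    return outage_seconds < dead_interval


# How many hello intervals fit into one dead interval.
hellos_per_dead_interval = DEAD_INTERVAL // HELLO_INTERVAL

print("Hello intervals per dead interval:", hellos_per_dead_interval)
print(
    "Failover shorter than dead interval:",
    outage_within_dead_interval(FAILOVER_SECONDS, DEAD_INTERVAL),
)
```

If the comparison holds, the OSPF timers on their own would probably not explain the failure, which would be consistent with the DLR using the same timers and staying up.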
What do you mean by rebooting the SAN?
Are you rebooting the underlying storage that the Edges are using? In that case, you must either enable the vSphere HA APD response to restart the VMs or manually reboot the Edges after storage is restored.
The best practice is to power off all the VMs, put the ESXi hosts in maintenance mode, and then reboot the SAN. What you're doing is quite dangerous: you might lose data or corrupt VMs.