What could cause a 2-3 second disconnect when vMotioning a UAG between hosts on the same network and datastore? This started happening recently, after we moved from NSX-V to NSX-T and rebuilt the load balancer, so I don't really know where to begin troubleshooting. In my experience, vMotion has never caused a disconnection on any VM, so I'm surprised to see this happening at all (regardless of NSX settings).
We are experiencing the same issue and opened a support case with VMware. The exchange is below, and I'm not too pleased with the responses. We still don't have a resolution, so as of right now, no vMotion for UAG, I guess?
"It is not recommended by VMware to vMotion an UAG and during a vMotion of UAG it is an expected behavior for the active connections to drop."
"Is there documentation you can provide or point me to, explaining that vMotion is not recommended for UAG without dropping connections?"
"VMware QE team doesn’t test live vMotion of UAG so it is not officially supported (which is why we don’t see that in the official VMware UAG docs).
VMware product documentation states what we DO support. If we had stated somewhere that vMotion can be used with UAG and it then failed to work, that would be a legitimate support case. UAG is a virtual appliance so automatic redeploy on a different host is the right way to have a UAG appliance in a different location."
I have to admit, I'm sort of in disbelief that this was an official response from VMware. For them not to support one of the core technologies of their core product (ESXi) on an appliance they ship is hard to accept, not least because the appliance isn't even particularly large, while things like vCenter, etc. will all vMotion quite happily.
I really hope the team behind the UAG is actively working on addressing this in a future release, as suggesting deployment of another UAG onto another host as your 'live migration' solution is bordering on comical.
That seems like the position of one person at GSS; you should try to get to an actual engineering team. Escalate the case and get your VMware rep and TAM involved to open a PR. If UAGs drop connections due to an out-of-order packet stream, that would be new behavior for the product, and the documentation would need to be updated. Don't expect this to be quick; think months of insisting and pushing. It is up to you to decide whether the process is worth your time.