Exchange 2010 DAG failures with ESXi6 DRS vMotion

jt_cp · ‎03-01-2016

After upgrading our environment to ESXi6, we've been experiencing DAG failovers at least once a week during DRS-initiated vMotion migrations. However, if we perform the migration manually, there are no issues. The SameSubnetDelay has already been increased to 20 seconds from the default of 5 seconds.

Does anyone know why? What changed in ESXi6 that causes this long timeout for DRS vMotion on an Exchange 2010 DAG?

Hardware Platform:

10Gb DSwitch.

HP Proliant G8

HP 3Par with SSD/SAS
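
For reference, here is a rough sketch of how the 5-second and 20-second figures in the post above probably come about. It assumes they are the product of the cluster's SameSubnetDelay (heartbeat interval in milliseconds) and SameSubnetThreshold (missed heartbeats tolerated). The defaults and maximums below are what I believe apply to Windows Server 2008 R2 failover clustering; they are not stated in this thread, and the function names are made up for illustration.

```python
# Assumed Windows Server 2008 R2 failover-cluster values (not from the thread).
DEFAULT_DELAY_MS = 1000   # assumed default SameSubnetDelay
DEFAULT_THRESHOLD = 5     # assumed default SameSubnetThreshold
MAX_DELAY_MS = 2000       # assumed maximum SameSubnetDelay
MAX_THRESHOLD = 10        # assumed maximum SameSubnetThreshold


def heartbeat_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Seconds of missed heartbeats before a same-subnet node is declared down."""
    if delay_ms <= 0 or threshold <= 0:
        raise ValueError("delay_ms and threshold must be positive")
    return delay_ms * threshold / 1000


def survives_stun(stun_seconds: float, delay_ms: int, threshold: int) -> bool:
    """True if a network pause of stun_seconds is shorter than the tolerance."""
    return stun_seconds < heartbeat_tolerance_seconds(delay_ms, threshold)


if __name__ == "__main__":
    print(heartbeat_tolerance_seconds(DEFAULT_DELAY_MS, DEFAULT_THRESHOLD))  # 5.0
    print(heartbeat_tolerance_seconds(MAX_DELAY_MS, MAX_THRESHOLD))          # 20.0
```

If that reading is right, a DAG node that still drops out with the relaxed settings was unreachable for 20 seconds or more, which is far longer than a healthy vMotion switchover should take.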

douglasarcidino · ‎03-01-2016

For the time being, you should disable DRS on your DAG nodes to avoid the failures. It could very well be that an underlying performance issue is both triggering the DRS migration and causing your bandwidth problems. In other words, whatever stress causes DRS to engage is probably also causing the failure, which is why you don't see it during a manual vMotion.

Doug


KevinTunge · ‎06-05-2017

Check the cluster sensitivity: http://www.vmware.com/files/pdf/using-vmware-HA-DRS-and-vmotion-with-exchange-2010-dags.pdf

The relevant section is 5.2, I believe. This seemed to work for us; at least, the databases don't flop anymore. Our Exchange guy said his indexes failed, though they came back about 30 seconds later.

See what you think.

KevinTunge · ‎06-05-2017

Whoops, didn't read your whole post. That is odd indeed if you already have the sensitivity relaxed.
