Hi there,
We have four ESX hosts in a cluster. Recently we needed to remove one of the hosts from the cluster, so its virtual machines were vMotioned off it. The problem is that they were vMotioned four at a time. The cluster could not assign new RAM to the guests quickly enough as they arrived, and they resorted to using swap; I/O became intolerably slow and the guests needed a hard reset to clear the swap.
With 3 hosts, the 'Memory Granted' metric is around 45GB out of 48GB on each host, so close to capacity. However, 'Memory Active' is only 12GB on each host, so I would have expected the balloon driver to inflate easily in the other guests and free up RAM for the incoming guests. I know that many people oversubscribe their environments, so what gives here?
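As a rough sanity check on those figures, here is a small Python sketch of the arithmetic. It uses only the per-host numbers quoted above; the function and key names are made up for illustration, and "idle granted" (granted minus active) is only a crude indication of what might be reclaimable. It says nothing about how quickly ballooning can actually reclaim it.

```python
# Per-host figures quoted in the post, with 3 hosts remaining (all values in GB).
HOSTS_REMAINING = 3
CAPACITY_PER_HOST = 48
GRANTED_PER_HOST = 45
ACTIVE_PER_HOST = 12


def memory_summary(hosts, capacity, granted, active):
    """Derive cluster-wide totals and per-host headroom from per-host figures.

    Hypothetical helper for illustration only; it does not query ESX.
    """
    return {
        "total_capacity": hosts * capacity,
        "total_granted": hosts * granted,
        "total_active": hosts * active,
        # Physical RAM on each host that has not been granted to any guest.
        "ungranted_per_host": capacity - granted,
        # Granted but not actively used: a crude guess at reclaimable memory.
        "idle_granted_per_host": granted - active,
        "granted_ratio": granted / capacity,
    }


summary = memory_summary(
    HOSTS_REMAINING, CAPACITY_PER_HOST, GRANTED_PER_HOST, ACTIVE_PER_HOST
)

for name, value in summary.items():
    print(f"{name}: {value}")
```

So each host has only about 3GB that is not already granted, while roughly 33GB per host is granted but idle, which is why I expected ballooning to cope.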
All ESX hosts are at the latest patch level, and VMware Tools is up to date on each VM.